I might be misinterpreting your question, but here is one approach (which Leonid Kovalev discussed a bit above). Say we have samples of a function $f: \mathbb{R}^n \to \mathbb{R}$ at a collection of points $X = \{x_1,\dots,x_N\}$, and our goal is to approximate $f$ from this limited information using some collection of functions $\varphi_k(x)$. One approach is to construct a linear combination $P_f(x) = \sum_{k=1}^N c_k \varphi_k(x)$ and impose the interpolation conditions $f(x_i) = P_f(x_i) = \sum_{k=1}^N c_k \varphi_k(x_i)$. This leads to a matrix system $Ac = y$, where $c$ is the vector of coefficients, $y$ is the vector of samples $f(x_i)$ of the function we want to approximate, and the matrix $A$ has entries $A_{ij} = \varphi_j(x_i)$, which we can attempt to solve. If the function we want to interpolate depends on a single variable, we know we can always interpolate at $N$ given (distinct) points using a polynomial of degree $N-1$; it is not so simple when working in $\mathbb{R}^n$.

To simplify things further, we usually choose a univariate function $\phi : [0,\infty) \to \mathbb{R}$ and form our functions $\varphi_k : \mathbb{R}^n \to \mathbb{R}$ by setting $\varphi_k(x) = \phi(\|x-x_k\|)$, which is much easier to work with. We then construct our interpolant as $P_f(x) = \sum_{k=1}^N c_k \varphi_k(x) = \sum_{k=1}^N c_k \phi(\|x-x_k\|)$. We say a function is radial if we can write it as $\varphi(x) = \phi(\|x\|)$. A common example in wide use is the Gaussian $\varphi_k(x) = e^{-(\epsilon\|x-x_k\|)^2}$. When we use functions of the form $\phi(\|x-x_k\|)$, the system matrix has entries $A_{ij} = \phi(\|x_i-x_j\|)$, so $A$ is symmetric.
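To make the linear algebra concrete, here is a minimal sketch in Python/NumPy (my own experiment further down was done in MATLAB; the function name and the default shape parameter `eps` here are just illustrative choices):

```python
import numpy as np

def gaussian_rbf_interpolant(X, y, eps=3.0):
    """Build P_f(x) = sum_k c_k exp(-(eps*||x - x_k||)^2) from samples y = f(X).

    X : (N, n) array of data sites x_1, ..., x_N
    y : (N,)   array of samples  f(x_1), ..., f(x_N)
    """
    # Symmetric system matrix A_ij = phi(||x_i - x_j||) for the Gaussian phi
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    A = np.exp(-(eps * r) ** 2)
    c = np.linalg.solve(A, y)          # solve A c = y for the coefficients

    def P_f(x_eval):
        # Evaluate the interpolant at new points x_eval of shape (M, n)
        r_eval = np.linalg.norm(x_eval[:, None, :] - X[None, :, :], axis=-1)
        return np.exp(-(eps * r_eval) ** 2) @ c

    return P_f
```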
However, this approach has a potential flaw: there is no reason to assume the matrix $A$ will be invertible! So the question becomes: what types of functions $\{\varphi_k(x)\}$ allow us to construct an invertible matrix? If we impose the condition that for any collection of distinct data points $\{x_1,\dots,x_N\}$ and any nonzero vector $c \in \mathbb{C}^N$ we have $\sum_{j=1}^N \sum_{k=1}^N c_j \overline{c_k}\, \phi(\|x_j-x_k\|) > 0$, we are forcing $A$ to be a positive definite matrix. For this reason we call such functions positive definite (although some in the literature call them strictly positive definite, so beware). A positive definite matrix is invertible, so our problem is now well posed and solvable. The question then becomes one of error bounds and computation time. How well do these work?
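As a quick numerical illustration of this claim for Gaussians (the point count and shape parameter below are arbitrary choices of mine):

```python
import numpy as np

# The Gaussian kernel matrix built on distinct points is positive definite, so
# its smallest eigenvalue is positive. (For small eps it can become tiny, which
# is an ill-conditioning issue in floating point, not a failure of the theory.)
rng = np.random.default_rng(0)
X = rng.random((25, 2))                                    # 25 distinct points in the unit square
r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
A = np.exp(-(5.0 * r) ** 2)
print("smallest eigenvalue of A:", np.linalg.eigvalsh(A).min())
```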
They work quite well actually, depending on the functions you choose. Gaussians are very popular and have a special property known as spectral convergence: roughly, if you choose your data centers $\{x_1,\dots,x_N\}$ well enough and with enough density, then $\|P_f-f\|_{L^{\infty}(\Omega)} \leq e^{-\alpha}K_{f}$, where $\alpha$ depends on the fill distance (it grows as the fill distance shrinks) and $K_f$ is a constant which depends on $f$. So, if you choose enough points, you can get very good approximation very quickly. Recall that for the Gaussians above I included an $\epsilon$ in the exponent. This shape parameter is something you have to choose, and a poor choice can lead to trouble (typically a trade-off between accuracy and the conditioning of $A$). I'll attach an example I ran in MATLAB: I took $f(x,y) = (x+y)/2$, chose 1089 scattered points from a Halton sequence, and approximated $f$. The figure shows the interpolant $P_f(x) = \sum_{j=1}^{1089} c_j e^{-(\epsilon \|x-x_j\|)^2}$, and the colormap shows the error between $P_f$ and $f$.
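If you want to play with the shape parameter yourself, here is a rough Python/NumPy version of a similar experiment (a smaller point set and evaluation grid than my MATLAB run; the Halton sampler assumes SciPy >= 1.7, and the `eps` values are just ones I picked to show the effect):

```python
import numpy as np
from scipy.stats import qmc        # Halton sequences; available in SciPy >= 1.7

f = lambda X: (X[:, 0] + X[:, 1]) / 2.0

X = qmc.Halton(d=2, seed=0).random(289)        # 289 Halton points in the unit square
y = f(X)

# Dense grid on which to measure the maximum interpolation error
gx, gy = np.meshgrid(np.linspace(0, 1, 40), np.linspace(0, 1, 40))
X_eval = np.column_stack([gx.ravel(), gy.ravel()])

r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
r_eval = np.linalg.norm(X_eval[:, None, :] - X[None, :, :], axis=-1)

for eps in [0.5, 2.0, 8.0]:        # very small eps gives a badly conditioned system
    c = np.linalg.solve(np.exp(-(eps * r) ** 2), y)
    Pf = np.exp(-(eps * r_eval) ** 2) @ c
    print(f"eps = {eps:4.1f}   max error = {np.abs(Pf - f(X_eval)).max():.2e}")
```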
[Figure: the Gaussian RBF interpolant $P_f$ of $f(x,y) = (x+y)/2$ on 1089 Halton points, with the error between $P_f$ and $f$ shown as a colormap.]
Besides Gaussians, there are several other well-known positive definite functions, as well as another class called conditionally positive definite functions, which allow for polynomial reproduction. An example is the thin plate spline, the radial function built from the univariate function $\phi(r) = r^2 \log(r)$. These, along with other positive and conditionally positive definite functions, are actively being investigated and work quite well for some applications. Applications of these techniques include simple interpolation (which is mostly what I described), meshless methods for solving PDEs, and scattered data approximation, including 3D surface reconstruction. Another direction is approximating functions on Riemannian manifolds, which is another active area of research.
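As a hedged sketch (my own illustration, not taken from either book below) of how a conditionally positive definite function is used in practice, here is thin plate spline interpolation in the plane, where the kernel system is augmented with a constant-plus-linear polynomial block so that the combined system is uniquely solvable:

```python
import numpy as np

def tps_interpolant(X, y):
    """Thin plate spline interpolant on R^2 with an added linear polynomial.

    Solves the augmented system
        [ A   P ] [c]   [y]
        [ P^T 0 ] [d] = [0],
    where A_ij = phi(||x_i - x_j||), phi(r) = r^2 log(r), and P = [1, x, y].
    """
    def phi(r):
        out = np.zeros_like(r)                  # phi(0) = 0 (removable singularity)
        mask = r > 0
        out[mask] = r[mask] ** 2 * np.log(r[mask])
        return out

    N = X.shape[0]
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    A = phi(r)
    P = np.column_stack([np.ones(N), X])        # constant and linear monomials at the data sites
    M = np.block([[A, P], [P.T, np.zeros((3, 3))]])
    sol = np.linalg.solve(M, np.concatenate([y, np.zeros(3)]))
    c, d = sol[:N], sol[N:]

    def P_f(x_eval):
        # Kernel part plus polynomial part, evaluated at points of shape (M, 2)
        r_eval = np.linalg.norm(x_eval[:, None, :] - X[None, :, :], axis=-1)
        return phi(r_eval) @ c + np.column_stack([np.ones(len(x_eval)), x_eval]) @ d

    return P_f
```

Because the polynomial block reproduces linear functions, this interpolant would recover something like $f(x,y) = (x+y)/2$ exactly (up to rounding), provided the data sites are not all collinear.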
One last topic: it might not be clear why we shouldn't just use multivariate polynomials, since polynomials work so well on $\mathbb{R}$. The Mairhuber-Curtis theorem unfortunately tells us that in $\mathbb{R}^n$ for $n > 1$ we cannot fix a collection of functions (e.g., polynomials) in advance, independent of our data sites, and get a well-posed interpolation problem for every choice of sites. See one of the books below for a precise statement and discussion of this theorem.
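For a concrete feel of what goes wrong (a toy example of my own, not taken from the books): fix the data-independent polynomial basis $\{1, x, y\}$ in the plane and place three data sites on a line; the interpolation matrix is then singular, while a radial basis built from those same sites is not.

```python
import numpy as np

# Fixed basis {1, x, y} evaluated at three collinear sites: singular system
X = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])
V = np.column_stack([np.ones(3), X[:, 0], X[:, 1]])
print(np.linalg.matrix_rank(V))     # 2: the interpolation matrix is singular for this configuration

# Gaussian radial basis centered at the same (distinct) sites: nonsingular
r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
A = np.exp(-(2.0 * r) ** 2)
print(np.linalg.matrix_rank(A))     # 3
```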
If you're interested and want actual details, check out the book "Scattered Data Approximation" by Holger Wendland or "Meshfree Approximation Methods with MATLAB" by Gregory Fasshauer. The former contains details and proofs of the major theorems, while the latter surveys the topics and gives an excellent discussion of numerical methods using MATLAB (which helped me make the above plot). If you're interested in papers, look for work by H. Wendland, F. J. Narcowich, J. D. Ward, R. Schaback, or E. J. Kansa. The celebrated theorem of S. Bochner characterizes positive semi-definite functions as Fourier transforms of non-negative finite Borel measures; this was one of the first theoretical results in this direction and has proven very useful, although I do not believe Bochner himself had any interest in approximation. On the error bound side, the concept of a reproducing kernel Hilbert space, first described by N. Aronszajn, is key.
I hope this proves to be relevant to your question and helpful!