
I have a list of $K>n$ points in $\mathbb{R}^n$, each one known to be at distance less than $\sqrt n$ of an unknown hyperplane of dimension $n-1$.

To what extent is it possible to recover approximations of the unknown hyperplane? I.e. how to interpolate a hyperplane between the $K$ points? A reference to an algorithm would be much appreciated.

  • All the usual methods are available for approximation -- a least squares fit being the first that comes to mind. I would refer to the result as a *fitting* rather than as an *interpolation*, as the latter usually implies an exact fit. – 2017-02-23
  • Good comment, post this as an answer – 2017-02-23

1 Answer


The question here is about fitting a hyperplane to a sufficiently large number of points in $\mathbb{R}^n$. We take the distance of a point to the hyperplane in this context to mean the minimum (perpendicular) Euclidean distance.

This is in one sense a very simple problem, but if one takes into account the uncertainty that may cause the points not to lie on a common hyperplane, there are interesting variants of the problem.

It was given that each of the $K$ points in $\mathbb{R}^n$ is "known to be at distance less than $\sqrt{n}$ of an unknown hyperplane of dimension $n-1$." By itself this is not a strong constraint on the possible hyperplane solutions. For example, if all the points are within the hypercube $[-1,1]^n$, then their distances to the origin are each less than $\sqrt{n}$. It follows that any hyperplane passing through the origin (i.e. a subspace of dimension $n-1$) would automatically satisfy the $\sqrt{n}$ distance constraint.

If the points are widely separated, however, then fitting a hyperplane so that it passes through the Euclidean balls of radius $\sqrt{n}$ surrounding each point gets to be a more exacting challenge.

The minimization of the sum of squares of "errors" (perpendicular distances from points to the unknown hyperplane) is achieved by principal components analysis, a solution discovered by Karl Pearson in 1901.

A brief sketch of this method follows:

Find the average (arithmetic mean) $\vec z$ of all your $K$ points $\vec{z_i} \in \mathbb{R}^n$, and subtract this average from each point to get a revised collection of points "centered" on the origin, $\vec{x_i} = \vec{z_i} - \vec z$. Somewhat confusingly, we will use $x_1,x_2,\ldots,x_n$ (without the vector decoration) to label coordinates of $\mathbb{R}^n$.

Form a $K\times n$ matrix $X$ having these revised points as rows. Note that as a result of their construction, each column has mean zero.

Take the real singular value decomposition of $X$:

$$ X = U \Sigma V^T $$

Here $U$ and $V$ are orthogonal matrices of sizes $K\times K$ and $n\times n$ respectively, and $\Sigma$ is a rectangular diagonal $K\times n$ matrix with non-negative entries arranged in descending order down the diagonal.

Since $XV = U \Sigma$ by the orthogonality of $V$, multiplying $X$ on the right by $V$ puts the revised (mean zero) rows of $X$ into positions close to the hyperplane where $x_n = 0$. To see this, note that the diagonal entries of $\Sigma$ are non-negative, and the last of these, $\sigma_n$, is the smallest. Since $U$ is an orthogonal matrix, each of its $K$ columns has unit Euclidean length, so the last column of $U\Sigma$ is $\sigma_n$ times a unit vector. Thus the final coordinate of the rows of $XV$ is the one whose sum of squares, namely $\sigma_n^2$, is closest to zero.

Because the hyperplane $x_n = 0$ is the best fit to rows $XV$, we can work backwards to the affine plane which is the best fit to the original points: we right multiply the subspace $\{x_n = 0\}$ by $V^T$ and then add the offset of the average of the original points back again. These are "rigid" transformations (distance preserving), and so we get back to the (affine) hyperplane which best fits the original points in a least sum of squares sense.

Explicitly, if $\vec v_n$ is the final column of $V$, then our fitted hyperplane has the linear equation:

$$ ((x_1,x_2,\ldots,x_n) - \vec z)\cdot \vec v_n = 0 $$
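In practice the recipe above is only a few lines of linear algebra. Here is a minimal sketch in Python with NumPy (the function name `fit_hyperplane` is my own; it implements the centering-plus-SVD steps described above):

```python
import numpy as np

def fit_hyperplane(Z):
    """Least-sum-of-squares (total least squares) hyperplane fit via the SVD.

    Z : (K, n) array whose rows are the points in R^n.
    Returns (z_bar, v_n, sigma_n): the centroid, a unit normal vector, and
    the smallest singular value.  The fitted hyperplane is
    {x in R^n : (x - z_bar) . v_n = 0}.
    """
    z_bar = Z.mean(axis=0)            # average of the K points
    X = Z - z_bar                     # centered points: every column has mean zero
    # With full_matrices=False and K >= n, Vt has shape (n, n); singular
    # values come back in descending order, so the last row of Vt is the
    # right singular vector for the smallest sigma.
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    return z_bar, Vt[-1], s[-1]
```

For instance, feeding in points scattered exactly on the plane $x + y + z = 0$ in $\mathbb{R}^3$ returns a normal proportional to $(1,1,1)$ and a vanishing $\sigma_n$.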

Note that the final singular value $\sigma_n = 0$ if and only if the hyperplane fit is exact. The best least squares hyperplane fit is unique if and only if the final singular value $\sigma_n$ is strictly less than the others.

  • Thanks for all these details. I should have been more precise: the $\sqrt n$ constraint is because the $K$ points are in $\mathbb{Z}^n$, not $\mathbb{R}^n$, and as you stated the largest distance between two neighboring integer points in $\mathbb{R}^n$ is $\sqrt n$. (This does not harm your answer, of course) – 2017-02-28