Nothing more to explain. I just don't know how to find the best fitting plane given a set of $N$ points in 3D space. I then have to write the corresponding algorithm. Thank you ;)
Best Fitting Plane given a Set of Points
-
…Instead of just adding up the lengths of the line segments, you could add up the squares of the lengths – it may seem like a strange idea, but it's very often a good one in this kind of problem. So you have all those choices, just for drawing a line in 2 dimensions; there are even more choices for a plane in 3. That's why you really have to know what someone means when they ask you to fit a plane to some points. – 2012-01-16
5 Answers
To complement Claude Leibovici's answer:
With the numerical example proposed by Claude Leibovici, who computed the parameters of a fitted plane $\quad z=Ax+By\quad$, the fitting of the plane $\quad Z=\alpha X+\beta Y+\gamma\quad$ can be carried out using the principal-components method (as suggested by joriki).
The theory can be found in many books. A synopsis is given pages 24-25 in the paper: https://fr.scribd.com/doc/31477970/Regressions-et-trajectoires-3D
The symbols used below correspond to those in the formulas from the referenced paper.
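A minimal sketch of that principal-components recipe, assuming NumPy (the data and the plane coefficients below are made up for illustration): subtract the centroid $\mathbf c$, take the right singular vector of least singular value of the centred data as the normal $(n_x,n_y,n_z)$, and recover $\alpha=-n_x/n_z$, $\beta=-n_y/n_z$, $\gamma=\mathbf n\cdot\mathbf c/n_z$.

```python
import numpy as np

# hypothetical example data: noisy samples of Z = 2 X - 1 Y + 3
rng = np.random.default_rng(0)
X = rng.uniform(-5, 5, 50)
Y = rng.uniform(-5, 5, 50)
Z = 2 * X - 1 * Y + 3 + rng.normal(scale=0.01, size=50)

P = np.column_stack([X, Y, Z])
c = P.mean(axis=0)               # centroid
_, _, Vt = np.linalg.svd(P - c)  # principal components of the centred data
n = Vt[2]                        # normal = direction of least variance

# convert the normal form n . (p - c) = 0 into Z = alpha X + beta Y + gamma
alpha, beta = -n[0] / n[2], -n[1] / n[2]
gamma = n @ c / n[2]
print(alpha, beta, gamma)        # close to 2, -1, 3
```

The same vector `n` also gives the orthogonal-distance residuals directly, which is what this method minimizes.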
A simple least squares solution should do the trick.
The equation for a plane is: $ax + by + c = z$. So set up matrices like this with all your data:
$ \begin{bmatrix} x_0 & y_0 & 1 \\ x_1 & y_1 & 1 \\ \vdots & \vdots & \vdots \\ x_n & y_n & 1 \\ \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} z_0 \\ z_1 \\ \vdots \\ z_n \end{bmatrix} $ In other words:
$Ax = B$
Now solve for $x$, which holds your coefficients. But since (I assume) you have more than 3 points, the system is over-determined, so you need to use the left pseudoinverse: $A^+ = (A^T A)^{-1} A^T$. So the answer is: $ \begin{bmatrix} a \\ b \\ c \end{bmatrix} = (A^T A)^{-1} A^T B $
And here is some simple Python code with an example:
```python
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # needed for projection='3d' on older matplotlib
import numpy as np

N_POINTS = 10
TARGET_X_SLOPE = 2
TARGET_Y_SLOPE = 3
TARGET_OFFSET = 5
EXTENTS = 5
NOISE = 5

# create random data
xs = [np.random.uniform(-EXTENTS, EXTENTS) for i in range(N_POINTS)]
ys = [np.random.uniform(-EXTENTS, EXTENTS) for i in range(N_POINTS)]
zs = [xs[i] * TARGET_X_SLOPE + ys[i] * TARGET_Y_SLOPE + TARGET_OFFSET
      + np.random.normal(scale=NOISE) for i in range(N_POINTS)]

# plot raw data
plt.figure()
ax = plt.subplot(111, projection='3d')
ax.scatter(xs, ys, zs, color='b')

# do fit: solve the normal equations (A^T A) x = A^T b
A = np.column_stack([xs, ys, np.ones(N_POINTS)])
b = np.array(zs)
fit = np.linalg.solve(A.T @ A, A.T @ b)
errors = b - A @ fit
residual = np.linalg.norm(errors)
print("solution: %f x + %f y + %f = z" % tuple(fit))
print("errors:", errors)
print("residual:", residual)

# plot the fitted plane as a wireframe over the data extent
xlim = ax.get_xlim()
ylim = ax.get_ylim()
X, Y = np.meshgrid(np.arange(xlim[0], xlim[1]),
                   np.arange(ylim[0], ylim[1]))
Z = fit[0] * X + fit[1] * Y + fit[2]
ax.plot_wireframe(X, Y, Z, color='k')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()
```
-
Thank you very much for clarifying this, Ben (and for the answer). If you have the time, I have a very similar question [here](https://stackoverflow.com/questions/47208838/best-fit-plane-to-3d-data-different-results-for-two-different-methods) (using singular-value decomposition) that is giving me trouble. – 2017-11-09
Subtract out the centroid, form a $3\times N$ matrix $\mathbf X$ out of the resulting coordinates and calculate its singular value decomposition. The normal vector of the best-fitting plane is the left singular vector corresponding to the least singular value. See this answer for an explanation why this is numerically preferable to calculating the eigenvector of $\mathbf X\mathbf X^\top$ corresponding to the least eigenvalue.
Here’s a Python implementation, as requested:
```python
import numpy as np

# generate some random test points
m = 20        # number of points
delta = 0.01  # size of random displacement
origin = np.random.rand(3, 1)        # random origin for the plane
basis = np.random.rand(3, 2)         # random basis vectors for the plane
coefficients = np.random.rand(2, m)  # random coefficients for points on the plane
# random points on the plane, plus a small random displacement
points = basis @ coefficients + origin + delta * np.random.rand(3, m)

# now find the best-fitting plane for the test points
points = points - points.mean(axis=1, keepdims=True)  # subtract out the centroid
U, s, Vt = np.linalg.svd(points)  # singular value decomposition
print(s[2])             # least singular value
print(U[:, 2])          # corresponding left singular vector: the normal of the best-fitting plane
print(U[:, 2] @ basis)  # its dot products with the plane's basis vectors are approximately zero
```
Considering a plane of equation $Ax+By+Cz=0$ and a point of coordinates $(x_i,y_i,z_i)$, the distance is given by $$d_i=\pm\frac{Ax_i+By_i+Cz_i}{\sqrt{A^2+B^2+C^2}}$$ and I suppose that you want to minimize $$F=\sum_{i=1}^n d_i^2=\sum_{i=1}^n\frac{(Ax_i+By_i+Cz_i)^2}{A^2+B^2+C^2}$$

Setting $C=1$, we then need to minimize with respect to $A,B$ $$F=\sum_{i=1}^n\frac{(Ax_i+By_i+z_i)^2}{A^2+B^2+1}$$

Taking derivatives, $$F'_A=\sum _{i=1}^n \left(\frac{2 x_i (A x_i+B y_i+z_i)}{A^2+B^2+1}-\frac{2 A (A x_i+B y_i+z_i)^2}{\left(A^2+B^2+1\right)^2}\right)$$ $$F'_B=\sum _{i=1}^n \left(\frac{2 y_i (A x_i+B y_i+z_i)}{A^2+B^2+1}-\frac{2 B (A x_i+B y_i+z_i)^2}{\left(A^2+B^2+1\right)^2}\right)$$

Since we shall set these derivatives equal to $0$, the equations can be simplified to $$\sum _{i=1}^n \left( x_i (A x_i+B y_i+z_i)-\frac{ A (A x_i+B y_i+z_i)^2}{A^2+B^2+1}\right)=0$$ $$\sum _{i=1}^n \left( y_i (A x_i+B y_i+z_i)-\frac{ B (A x_i+B y_i+z_i)^2}{A^2+B^2+1}\right)=0$$ which are nonlinear with respect to the parameters $A,B$; good starting estimates are therefore required, since you will probably use Newton-Raphson to polish the solutions.
These can be obtained by first performing a multilinear regression (with no intercept, in your case) $z=\alpha x+\beta y$ and using $A=-\alpha$ and $B=-\beta$ to start the iterative process. The values are given by $$A=-\frac{S_{xy} \,S_{yz}-S_{xz}\, S_{yy}}{S_{xy}^2-S_{xx}\, S_{yy}}\qquad B=-\frac{S_{xy}\, S_{xz}-S_{xx}\, S_{yz}}{S_{xy}^2-S_{xx}\, S_{yy}}$$
For illustration purposes, let me consider the following data $\left( \begin{array}{ccc} x & y & z \\ 1 & 1 & 9 \\ 1 & 2 & 14 \\ 1 & 3 & 20 \\ 2 & 1 & 11 \\ 2 & 2 & 17 \\ 2 & 3 & 23 \\ 3 & 1 & 15 \\ 3 & 2 & 20 \\ 3 & 3 & 26 \end{array} \right)$
The preliminary step gives $z=2.97436 x+5.64103 y$, and solving the rigorous equations converges to $A=-2.97075$, $B=-5.64702$, which are quite close to the estimates (because the errors are small).
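As a numerical cross-check of both steps, here is a sketch assuming NumPy. The preliminary estimates come from the $S_{xy}$-type sums above. For the exact orthogonal fit, note that $F = \mathbf n^\top M \mathbf n / \mathbf n^\top \mathbf n$ with $\mathbf n=(A,B,1)$ and $M=P^\top P$, where $P$ stacks the points as rows; the minimizer is therefore the eigenvector of $M$ with smallest eigenvalue, an equivalent closed form that avoids the Newton-Raphson iteration.

```python
import numpy as np

# data from the worked example above
x = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3], dtype=float)
y = np.array([1, 2, 3, 1, 2, 3, 1, 2, 3], dtype=float)
z = np.array([9, 14, 20, 11, 17, 23, 15, 20, 26], dtype=float)

# preliminary multilinear regression z = alpha x + beta y (no intercept)
Sxx, Syy, Sxy = x @ x, y @ y, x @ y
Sxz, Syz = x @ z, y @ z
den = Sxy**2 - Sxx * Syy
A0 = -(Sxy * Syz - Sxz * Syy) / den
B0 = -(Sxy * Sxz - Sxx * Syz) / den
print(A0, B0)   # about -2.97436, -5.64103

# exact orthogonal fit via the smallest eigenvector of M = P^T P
P = np.column_stack([x, y, z])
w, V = np.linalg.eigh(P.T @ P)  # eigenvalues in ascending order
n = V[:, 0]                     # eigenvector for the smallest eigenvalue
A, B = n[0] / n[2], n[1] / n[2] # normalize so that C = 1
print(A, B)     # about -2.97075, -5.64702
```

This reproduces both sets of values quoted above on the same data.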
-
In your example, the fit of the plane $z=Ax+By$ is carried out with respect to the criterion of least-squares orthogonal distances between the points and the plane. I agree with your result. But there are only two adjustable parameters $A,B$ because the origin $(0,0,0)$ is assumed to lie on the plane. This is an additional condition. If we consider the more general equation of the plane $z=Ax+By+C$, the three-parameter problem should lead to a better fit with a lower mean-square deviation. – 2016-02-12