
We are using the OpenCV library estimateRigidTransform function to find a mapping between two 2D point sets. The mapping supports rotation, uniform scaling, and translation.

$$T=\left[ \begin{array}{cc|c} s\cos(\theta)&-s\sin(\theta)&T_x\\ s\sin(\theta)&s\cos(\theta)&T_y \end{array} \right] $$

The documentation claims the function solves the following problem.

$$[A^*|b^*] = \arg \min _{[A|b]} \sum _i \| \texttt{dst}[i] - A { \texttt{src}[i]}^T - b \| ^2$$ Here the left-hand side is our transform $T$, and $\texttt{dst}$ and $\texttt{src}$ are the two point sets. The function finds the transform that minimizes the squared error (distance) between the $\texttt{dst}$ point set and the $\texttt{src}$ point set after transformation.

The overdetermined problem of $N$ point correspondences is converted to a system of 4 equations in 4 unknowns ($s\cos(\theta)$, $s\sin(\theta)$, $T_x$, $T_y$) as shown below. $(a_x,a_y)$ is a point in the first point set $A$, and $(b_x,b_y)$ is a point in the second point set $B$. Both point sets contain $N$ points. The points have already been matched between the two sets, so the first point in set $A$ should map to the first point in set $B$, and so on.

All the summations occur over the set of $N$ point pairs.

$$\left[ \begin{array}{cccc} \sum_{} (a_x^2+a_y^2)&0&\sum_{} a_x&\sum_{} a_y\\ 0&\sum_{} (a_x^2+a_y^2)&-\sum_{} a_y&\sum_{} a_x\\ \sum_{} a_x&-\sum_{} a_y&N&0\\ \sum_{} a_y&\sum_{} a_x&0&N\\ \end{array} \right]\left[ \begin{array}{c} s\cos(\theta)\\ s\sin(\theta)\\ T_x\\ T_y\\ \end{array} \right]=\left[ \begin{array}{c} \sum_{} (a_xb_x+a_yb_y)\\ \sum_{} (a_xb_y-a_yb_x)\\ \sum_{} b_x\\ \sum_{} b_y\\ \end{array} \right] $$
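To make the system concrete, here is a minimal numpy sketch that builds this exact $4\times 4$ matrix and right-hand side from two matched point sets and solves for the four unknowns. The function name `estimate_similarity` is made up for illustration; it is not the OpenCV API.

```python
import numpy as np

def estimate_similarity(a, b):
    """Solve the 4x4 system above for (s*cos(theta), s*sin(theta), Tx, Ty).

    a, b: (N, 2) arrays of matched points from sets A and B.
    """
    ax, ay = a[:, 0], a[:, 1]
    bx, by = b[:, 0], b[:, 1]
    n = len(a)
    saa = np.sum(ax**2 + ay**2)
    # The 4x4 coefficient matrix, term for term as in the equation above.
    M = np.array([
        [saa,       0.0,       ax.sum(),  ay.sum()],
        [0.0,       saa,      -ay.sum(),  ax.sum()],
        [ax.sum(), -ay.sum(),  n,         0.0],
        [ay.sum(),  ax.sum(),  0.0,       n],
    ])
    rhs = np.array([
        np.sum(ax * bx + ay * by),
        np.sum(ax * by - ay * bx),
        bx.sum(),
        by.sum(),
    ])
    return np.linalg.solve(M, rhs)  # [s*cos, s*sin, Tx, Ty]
```

With noise-free points generated by a known similarity transform, the solve recovers the transform exactly (up to floating-point error).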

I got these equations from reverse engineering the source code. The documentation then says that eigenvalue decomposition is used to solve the system. I am trying to determine where this set of equations has come from. The first and second rows are a mystery to me. I recognize the third and fourth rows as summing both sides of the mapping equation for $x$ and $y$.

$$s\cos(\theta)\sum_{} a_x-s\sin(\theta)\sum_{} a_y+NT_x=\sum_{} b_x$$ $$\sum_{} (a_xs\cos(\theta)-a_ys\sin(\theta)+T_x)=\sum_{} b_x$$ and $$s\cos(\theta)\sum_{} a_y+s\sin(\theta)\sum_{} a_x+NT_y=\sum_{} b_y$$ $$\sum_{} (a_xs\sin(\theta)+a_ys\cos(\theta)+T_y)=\sum_{} b_y$$ Does anyone know where the first and second rows come from? I assume OpenCV is implementing some form of least squares fitting to find the transform that best maps the points.

EDIT:

After playing around further, I now recognize the first row as $$a_x\cdot\text{row}_3+a_y\cdot\text{row}_4$$ and the second row as $$-a_y\cdot\text{row}_3+a_x\cdot\text{row}_4.$$ The first two rows are created by combining the last two in different ways. Doesn't this create a set of equations that are no longer independent? Or do the summations in the equations and the residual error somehow maintain their independence?
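The per-point weighting is what keeps the rows independent: row 1 is not a fixed multiple of rows 3 and 4, but a sum of the *per-point* $x$ and $y$ equations, each weighted by that point's own $a_x$ and $a_y$. A small numpy check (with made-up points) confirms the combination:

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.normal(size=(5, 2))
ax, ay = a[:, 0], a[:, 1]
n = len(a)

# Per-point x and y equations (the per-point versions of rows 3 and 4):
# coefficients of (s*cos(theta), s*sin(theta), Tx, Ty).
rows_x = np.column_stack([ax, -ay, np.ones(n), np.zeros(n)])
rows_y = np.column_stack([ay,  ax, np.zeros(n), np.ones(n)])

# Row 1 of the 4x4 system: sum over i of ax_i*(x eq) + ay_i*(y eq).
row1 = (ax[:, None] * rows_x + ay[:, None] * rows_y).sum(axis=0)
assert np.allclose(row1, [np.sum(ax**2 + ay**2), 0.0, ax.sum(), ay.sum()])

# Row 2 of the 4x4 system: sum over i of -ay_i*(x eq) + ax_i*(y eq).
row2 = (-ay[:, None] * rows_x + ax[:, None] * rows_y).sum(axis=0)
assert np.allclose(row2, [0.0, np.sum(ax**2 + ay**2), -ay.sum(), ax.sum()])
```

Because the weights $a_x$ and $a_y$ differ for each point, the four summed rows are generically linearly independent even though each is built from the same per-point equations.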

1 Answer


I assume OpenCV is implementing some form of least squares fitting to find the transform that best maps the points.

You are right, and this idea is at the core of what's going on here. You essentially want

$$\begin{bmatrix} a_x & -a_y & 1 & 0 \\ a_y & a_x & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}\cdot\begin{bmatrix} s\cos\theta \\ s\sin\theta \\ T_x \\ T_y \end{bmatrix}\approx\begin{bmatrix} b_x \\ b_y \\ \vdots \end{bmatrix}$$

This is an overdetermined system of equations, with two equations for each pair of points, $2N$ equations in total. To solve it by linear least squares, you multiply both sides by the transpose of the left matrix.

$$ \begin{bmatrix} a_x & a_y & \cdots \\ -a_y & a_x & \cdots \\ 1 & 0 & \cdots \\ 0 & 1 & \cdots \end{bmatrix}\cdot \begin{bmatrix} a_x & -a_y & 1 & 0 \\ a_y & a_x & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}= \begin{bmatrix} \sum(a_x^2+a_y^2) & 0 & \sum a_x & \sum a_y \\ 0 & \sum(a_x^2+a_y^2) & -\sum a_y & \sum a_x \\ \sum a_x & -\sum a_y & N & 0 \\ \sum a_y & \sum a_x & 0 & N \end{bmatrix} $$

for the left hand side and

$$ \begin{bmatrix} a_x & a_y & \cdots \\ -a_y & a_x & \cdots \\ 1 & 0 & \cdots \\ 0 & 1 & \cdots \end{bmatrix}\cdot \begin{bmatrix} b_x \\ b_y \\ \vdots \end{bmatrix}= \begin{bmatrix} \sum(a_xb_x+a_yb_y) \\ \sum(a_xb_y-a_yb_x) \\ \sum b_x \\ \sum b_y \end{bmatrix} $$

for the right-hand side. If you were to solve this by inverting the left-hand side matrix, you would essentially be applying the Moore-Penrose pseudoinverse, but there are usually cheaper ways to solve a square system of equations, e.g. Gaussian elimination or the eigenvalue decomposition you mentioned. Solving this modified system gives you the solution to the least squares problem.
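The construction above can be checked numerically: build the tall $2N\times 4$ design matrix, form the normal equations, and compare against a direct least-squares solve. The point data below is made up purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=(8, 2))
ax, ay = a[:, 0], a[:, 1]

# Two rows per point pair: [ax, -ay, 1, 0] and [ay, ax, 0, 1].
M = np.zeros((2 * len(a), 4))
M[0::2] = np.column_stack([ax, -ay, np.ones_like(ax), np.zeros_like(ax)])
M[1::2] = np.column_stack([ay,  ax, np.zeros_like(ax), np.ones_like(ax)])

# M^T M reproduces the 4x4 summation matrix from the question.
MtM = M.T @ M
assert np.isclose(MtM[0, 0], np.sum(ax**2 + ay**2))
assert np.isclose(MtM[2, 2], len(a))

# Solving the normal equations M^T M x = M^T b agrees with lstsq.
b = rng.normal(size=2 * len(a))  # stacked [bx_1, by_1, bx_2, by_2, ...]
x_normal = np.linalg.solve(MtM, M.T @ b)
x_lstsq, *_ = np.linalg.lstsq(M, b, rcond=None)
assert np.allclose(x_normal, x_lstsq)
```

Note that `np.linalg.lstsq` avoids forming $M^TM$ explicitly (it uses an SVD internally), which is numerically preferable when $M$ is ill-conditioned; forming the normal equations squares the condition number.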

I hope the notation was reasonably clear. Keep in mind that the rows and columns come in pairs, one pair per point, and I only wrote out the first such pair.

  • +1 Nice. I'll just mention that this system is often called the [normal equations](https://en.wikipedia.org/wiki/Linear_least_squares_(mathematics)#Derivation_of_the_normal_equations) of the problem (written $A^TAx=A^Tb$). (2017-02-10)
  • Thanks for the clear explanation. (2017-02-10)