We are using OpenCV's estimateRigidTransform function to find a mapping between two 2D point sets. The mapping allows rotation, uniform scaling, and translation:
$$T=\left[ \begin{array}{cc|c} s\cos(\theta)&-s\sin(\theta)&T_x\\ s\sin(\theta)&s\cos(\theta)&T_y \end{array} \right] $$
The documentation claims the function solves the following problem.
$$[A^*|b^*] = \arg\min_{[A|b]} \sum_i \| \texttt{dst}[i] - A\,{\texttt{src}[i]}^T - b \|^2$$ Here the left-hand side $[A^*|b^*]$ is our transform $T$, and $\texttt{dst}$ and $\texttt{src}$ are the two point sets. It appears to find the transform that minimizes the sum of squared distances between the $\texttt{dst}$ points and the $\texttt{src}$ points after transformation.
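To make the objective concrete, here is a minimal sketch of the cost being minimized, written in plain Python. The helper names (`similarity_matrix`, `apply_transform`, `cost`) and the sample points are my own for illustration and are not OpenCV functions.

```python
import math

def similarity_matrix(s, theta, tx, ty):
    """Return the 2x3 transform [A|b] for scale s, angle theta, translation (tx, ty)."""
    c, n = s * math.cos(theta), s * math.sin(theta)
    return [[c, -n, tx],
            [n,  c, ty]]

def apply_transform(T, p):
    """Map a 2D point p through the 2x3 transform T."""
    x, y = p
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])

def cost(T, src, dst):
    """Sum over i of || dst[i] - A src[i] - b ||^2."""
    total = 0.0
    for p, q in zip(src, dst):
        x, y = apply_transform(T, p)
        total += (q[0] - x) ** 2 + (q[1] - y) ** 2
    return total

# Hypothetical sample data: dst is src mapped through a known transform.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
T_true = similarity_matrix(1.5, math.radians(30), 2.0, -1.0)
dst = [apply_transform(T_true, p) for p in src]

# A slightly wrong rotation angle should give a strictly larger cost.
T_off = similarity_matrix(1.5, math.radians(35), 2.0, -1.0)
print(cost(T_true, src, dst), cost(T_off, src, dst))
```

On noise-free data the generating transform has zero cost, and any other transform has a larger one, which is what the minimization searches for.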
The overdetermined problem (one vector equation per point pair, i.e. $2N$ scalar equations) is converted to a system of 4 equations in 4 unknowns ($s\cos(\theta)$, $s\sin(\theta)$, $T_x$, $T_y$) as shown below. $(a_x,a_y)$ is a point in the first point set $A$, and $(b_x,b_y)$ is a point in the second point set $B$. Both point sets contain $N$ points. The points have already been matched between the two sets, so the first point in set $A$ should map to the first point in set $B$, and so on.
All the summations occur over the set of $N$ point pairs.
$$\left[ \begin{array}{cccc} \sum_{} (a_x^2+a_y^2)&0&\sum_{} a_x&\sum_{} a_y\\ 0&\sum_{} (a_x^2+a_y^2)&-\sum_{} a_y&\sum_{} a_x\\ \sum_{} a_x&-\sum_{} a_y&N&0\\ \sum_{} a_y&\sum_{} a_x&0&N\\ \end{array} \right]\left[ \begin{array}{c} s\cos(\theta)\\ s\sin(\theta)\\ T_x\\ T_y\\ \end{array} \right]=\left[ \begin{array}{c} \sum_{} (a_xb_x+a_yb_y)\\ \sum_{} (a_xb_y-a_yb_x)\\ \sum_{} b_x\\ \sum_{} b_y\\ \end{array} \right] $$
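As a sanity check, the system above can be assembled and solved numerically. This is only a sketch in plain Python: `build_system` and `solve` are hypothetical helper names, the sample points are made up, and Gaussian elimination is used here as a stand-in solver rather than whatever OpenCV does internally.

```python
import math

def build_system(A, B):
    """Assemble the 4x4 matrix M and right-hand side r from matched point lists."""
    n = float(len(A))
    sq = sum(ax * ax + ay * ay for ax, ay in A)
    sx = sum(ax for ax, _ in A)
    sy = sum(ay for _, ay in A)
    M = [[sq, 0.0,  sx,  sy],
         [0.0, sq, -sy,  sx],
         [sx, -sy,   n, 0.0],
         [sy,  sx, 0.0,   n]]
    r = [sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(A, B)),
         sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(A, B)),
         sum(bx for bx, _ in B),
         sum(by for _, by in B)]
    return M, r

def solve(M, r):
    """Solve M x = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    aug = [row[:] + [rhs] for row, rhs in zip(M, r)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            f = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= f * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

# Hypothetical test data: B is A under s = 1.5, theta = 30 degrees, T = (2, -1).
s, theta, Tx, Ty = 1.5, math.radians(30), 2.0, -1.0
A = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
B = [(s * math.cos(theta) * ax - s * math.sin(theta) * ay + Tx,
      s * math.sin(theta) * ax + s * math.cos(theta) * ay + Ty) for ax, ay in A]

M, r = build_system(A, B)
c, n_, tx, ty = solve(M, r)          # unknowns: s cos(theta), s sin(theta), T_x, T_y
s_est, theta_est = math.hypot(c, n_), math.atan2(n_, c)
print(s_est, math.degrees(theta_est), tx, ty)
```

With exact (noise-free) correspondences the solve recovers the scale, angle, and translation used to generate `B`, up to floating-point rounding.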
I got these equations from reverse engineering the source code. The documentation then says that eigenvalue decomposition is used to solve the system. I am trying to determine where this set of equations has come from. The first and second rows are a mystery to me. I recognize the third and fourth rows as summing both sides of the mapping equation for $x$ and $y$.
$$s\cos(\theta)\sum_{} a_x-s\sin(\theta)\sum_{} a_y+NT_x=\sum_{} b_x$$
which is the same as
$$\sum_{} (a_xs\cos(\theta)-a_ys\sin(\theta)+T_x)=\sum_{} b_x$$
and
$$s\cos(\theta)\sum_{} a_y+s\sin(\theta)\sum_{} a_x+NT_y=\sum_{} b_y$$
which is the same as
$$\sum_{} (a_xs\sin(\theta)+a_ys\cos(\theta)+T_y)=\sum_{} b_y$$
Does anyone know where the first and second rows come from? I assume OpenCV is implementing some form of least squares fitting to find the transform that best maps the points.
EDIT:
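After playing around further, I now recognize the first row as the sum over all point pairs of $$a_x\cdot\text{row3}+a_y\cdot\text{row4}$$ and the second row as the sum over all point pairs of $$-a_y\cdot\text{row3}+a_x\cdot\text{row4}$$ where row3 and row4 here mean the per-point $x$ and $y$ mapping equations, weighted before the summation is taken. So the first two rows are created by combining the last two in different ways. Doesn't this create a set of equations that are no longer independent? Or do the summations in the equations and the residual error somehow maintain their independence?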
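The pattern described in the EDIT can be checked numerically. The sketch below, in plain Python with made-up points and hypothetical helper names, builds the four rows by weighting each point's $x$ and $y$ equations before summing, and compares the result against the closed-form sums from the 4x4 system. It only confirms the pattern; it does not settle the independence question.

```python
def combined_rows(A, B):
    """Build the augmented 4x5 system from per-point equations.

    Per point, with unknowns (s cos, s sin, T_x, T_y):
      x-equation: [a_x, -a_y, 1, 0 | b_x]
      y-equation: [a_y,  a_x, 0, 1 | b_y]
    """
    rows = [[0.0] * 5 for _ in range(4)]
    for (ax, ay), (bx, by) in zip(A, B):
        x_eq = [ax, -ay, 1.0, 0.0, bx]
        y_eq = [ay,  ax, 0.0, 1.0, by]
        for j in range(5):
            rows[0][j] += ax * x_eq[j] + ay * y_eq[j]    # a_x*row3 + a_y*row4
            rows[1][j] += -ay * x_eq[j] + ax * y_eq[j]   # -a_y*row3 + a_x*row4
            rows[2][j] += x_eq[j]                        # row3 summed as-is
            rows[3][j] += y_eq[j]                        # row4 summed as-is
    return rows

def closed_form_rows(A, B):
    """Augmented 4x5 system written directly from the summation formulas."""
    n = float(len(A))
    sq = sum(ax * ax + ay * ay for ax, ay in A)
    sx = sum(ax for ax, _ in A)
    sy = sum(ay for _, ay in A)
    dot = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(A, B))
    crs = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(A, B))
    return [[sq, 0.0,  sx,  sy, dot],
            [0.0, sq, -sy,  sx, crs],
            [sx, -sy,   n, 0.0, sum(bx for bx, _ in B)],
            [sy,  sx, 0.0,   n, sum(by for _, by in B)]]

def max_difference(P, Q):
    """Largest absolute entry-wise difference between two equally sized tables."""
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

# Hypothetical matched points; B need not be an exact transform of A.
A = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
B = [(1.0, 2.0), (3.0, 1.0), (0.0, 4.0), (5.0, 5.0)]
print(max_difference(combined_rows(A, B), closed_form_rows(A, B)))
```

Note that the weights $a_x$ and $a_y$ differ from point to point, so rows 1 and 2 are not fixed linear combinations of the already-summed rows 3 and 4.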