I'm having a hard time understanding the derivation of the least squares approximation. The following step doesn't make sense to me:

$LS = \sum\limits_{i}|a_ix - b_i|^2 = \|Ax - b\|^2 = x^\top(A^\top A)x - 2x^\top(A^\top b) + \|b\|^2$

Can somebody please explain this to me? How is this correct? What are the rules in place that make these equalities true?

1 Answer

The first equality, $LS = \sum_i |a_i x - b_i |^2$, is a definition. I assume that $x$ and $b$ are column vectors and that $a_i$ denotes the $i$-th row of $A$, so that $a_i x$ and $b_i$ are scalars.

Let $c$ be the vector such that $c_i = a_i x - b_i$. By the definition of matrix multiplication, $c = Ax - b$. Also, $\lVert c \rVert^2 = \sum_i c_i^2 = \sum_i \lvert c_i \rvert^2$, so the second equality holds.
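If it helps to see this with numbers, here is a minimal sketch in plain Python. The matrix `A` and the vectors `x` and `b` are made-up values chosen only for illustration, and `dot` is a small helper defined here, not a library function. It computes the sum row by row and then again via the vector $c = Ax - b$.

```python
# Made-up example data (an assumption for illustration):
# A is 3x2, x has 2 entries, b has 3 entries.
A = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
x = [1.0, -1.0]
b = [0.5, 0.0, -2.0]


def dot(u, v):
    """Dot product of two equal-length lists (helper for this sketch)."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Row-by-row form: sum_i |a_i x - b_i|^2, where a_i is the i-th row of A.
ls_rowwise = sum(abs(dot(a_i, x) - b_i) ** 2 for a_i, b_i in zip(A, b))

# Vector form: build c = Ax - b entry by entry, then take ||c||^2 = sum_i c_i^2.
c = [dot(a_i, x) - b_i for a_i, b_i in zip(A, b)]
ls_norm = sum(c_i ** 2 for c_i in c)

print(ls_rowwise, ls_norm)  # both values agree
```

The two numbers agree because entry $i$ of $Ax$ is exactly the product of row $a_i$ with $x$.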

It is also a fact that $\lVert c \rVert^2 = c^T c$. Replace $c$ with $Ax - b$ in that expression and expand the right-hand side, and you'll obtain the third equality.
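As a sketch of that expansion, assuming real entries and using only $(Ax)^T = x^T A^T$:

$$
\begin{aligned}
\lVert Ax - b \rVert^2 &= (Ax - b)^T (Ax - b) \\
&= x^T A^T A x - x^T A^T b - b^T A x + b^T b \\
&= x^T (A^T A) x - 2 x^T (A^T b) + \lVert b \rVert^2 .
\end{aligned}
$$

The last line combines the two middle terms: $b^T A x$ is a scalar, so it equals its own transpose, $b^T A x = (b^T A x)^T = x^T A^T b$. The final term is $b^T b = \lVert b \rVert^2$.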