I am following this article on face recognition. In the "calculating eigenfaces" section, the authors present a way to avoid computing the eigenvectors of a very large matrix:
Let $A_{N^2\times M}$ be a dataset of $M$ images, where each column is an $N\times N$ image flattened into a vector of length $N^2$. Instead of calculating the $M$ largest eigenvectors of the $N^2\times N^2$ covariance matrix (which, up to scaling, is $AA^T$), they calculate the $M$ eigenvectors of the matrix $L=A^TA$, which is only of size $M\times M$.
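To make the setup concrete, here is a minimal NumPy sketch of the trick as I understand it. It is not taken from the article: the function name `leading_eigvecs_via_small_matrix`, the random stand-in data, the mean subtraction, and the step that maps each eigenvector $v$ of $L$ to $Av$ are my own assumptions about how the method is meant to work.

```python
import numpy as np

def leading_eigvecs_via_small_matrix(A, k):
    """Hypothetical helper: k leading eigenpairs of A @ A.T, computed from A.T @ A."""
    L = A.T @ A                             # small M x M matrix
    eigvals, V = np.linalg.eigh(L)          # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest eigenvalues
    eigvals, V = eigvals[order], V[:, order]
    U = A @ V                               # assumed step: map v to A v, giving N^2 x k
    U /= np.linalg.norm(U, axis=0)          # rescale each column to unit length
    return eigvals, U

# Random stand-in for M flattened N x N images (not real face data).
rng = np.random.default_rng(0)
N, M = 8, 5
A = rng.standard_normal((N * N, M))
A -= A.mean(axis=1, keepdims=True)          # assumption: subtract the mean image

eigvals, U = leading_eigvecs_via_small_matrix(A, k=3)

# Numerical check against the large N^2 x N^2 matrix (unscaled covariance).
C = A @ A.T
print(np.allclose(C @ U, U * eigvals))
```

Here "largest" is taken to mean "belonging to the largest eigenvalue", which is exactly the reading I am unsure about. I use `k=3` rather than `k=M` because, after subtracting the mean image, $A$ has rank at most $M-1$, so the smallest eigenvalue of $L$ is numerically zero and its $Av$ cannot be normalized.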
- Why is using $L$ in place of the covariance matrix a valid (or at least good enough) solution?
- What is the criterion for a "largest" eigenvector? Largest in which sense?