Suppose an $n \times n$ matrix $A$ is diagonalizable with an orthonormal basis of eigenvectors $\beta = \{v_1, v_2, \dots, v_n\}$ and corresponding eigenvalues $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_n$. In addition, $A$ is just a matrix of data: each row is a data sample with $n$ features (columns = features of the sample).
If I wanted to approximate this matrix with a lower-rank matrix, would the following make sense?
First, we know that the columns of $A$ lie in the range of $L_A$ when we think of $A$ as a linear transformation (where $L_A$ denotes left multiplication by $A$).
Any $y \in \operatorname{range}(L_A)$ can be written as $L_A(x)$ for some $x \in F^n$, and we can express $x$ as a linear combination of the eigenvectors in $\beta$ since they form a basis. So suppose $x = a_1 v_1 + a_2 v_2 + \dots + a_n v_n$ for some scalars $a_i$. Since $L_A$ is linear and $L_A(v_i) = \lambda_i v_i$, we can write
$$L_A(x) = a_1 L_A(v_1) + a_2 L_A(v_2) + \dots + a_n L_A(v_n) = a_1 \lambda_1 v_1 + a_2 \lambda_2 v_2 + \dots + a_n \lambda_n v_n.$$
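Here is a quick numerical sanity check of that expansion (a sketch, not part of the question's setup: I take $A$ symmetric so that `np.linalg.eigh` hands back an orthonormal eigenbasis, matching the hypothesis):

```python
import numpy as np

# A small symmetric matrix, so eigh returns an orthonormal eigenbasis
# (illustrative choice; this is what the orthonormal-eigenvector hypothesis needs).
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2

eigvals, V = np.linalg.eigh(A)       # columns of V are orthonormal eigenvectors
order = np.argsort(eigvals)[::-1]    # reorder so lambda_1 >= ... >= lambda_n
eigvals, V = eigvals[order], V[:, order]

x = rng.standard_normal(4)
a = V.T @ x                          # a_i = <x, v_i> since the basis is orthonormal

# By linearity and A v_i = lambda_i v_i:  A x = sum_i a_i * lambda_i * v_i
lhs = A @ x
rhs = V @ (eigvals * a)
assert np.allclose(lhs, rhs)
```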
I claim that since $\beta$ is made up of orthonormal eigenvectors, it would be very easy to find an $x$ in the domain such that $L_A(x)$ equals some column $c$ of our data matrix: orthonormality gives $\langle c, v_i \rangle = a_i \lambda_i$, so $a_i = \langle c, v_i \rangle / \lambda_i$ (assuming the $\lambda_i$ are nonzero).
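To make the inner-product recovery concrete, here is a sketch (again using a random symmetric $A$ for the orthonormal eigenbasis, and assuming all eigenvalues are nonzero so we can divide by them):

```python
import numpy as np

# Given a column c of A, recover x with A x = c by expanding in the eigenbasis.
# If c = sum_i a_i * lambda_i * v_i, orthonormality gives <c, v_i> = a_i * lambda_i,
# so a_i = <c, v_i> / lambda_i  (this needs every lambda_i != 0).
rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                    # symmetric, so eigenvectors are orthonormal

eigvals, V = np.linalg.eigh(A)
c = A[:, 2]                          # pick a column of the data matrix

a = (V.T @ c) / eigvals              # a_i = <c, v_i> / lambda_i
x = V @ a                            # x = sum_i a_i v_i
assert np.allclose(A @ x, c)
```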
Then our vector $L_A(x)$ is likely to end up close to the sum of just the first few terms above, where the eigenvectors $v_1, v_2, \dots, v_d$ have eigenvalues of large magnitude: those terms get scaled by the largest amount, while terms with small eigenvalues have less effect on where the final vector $L_A(x)$ ends up (at least when the coefficients $a_i$ are of comparable size).
So it would seem that for each column of our original data matrix $A$, we could give a good approximation using just the first $d$ eigenvectors, where $d < n$. Hence we obtain a matrix whose image has lower dimension, i.e., the rank of our approximated data matrix is reduced.
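The truncation step can be sketched directly: drop the small-magnitude eigenvalue terms from $A = \sum_i \lambda_i v_i v_i^T$. (For symmetric $A$ this agrees with the SVD truncation; the symmetric matrix and the choice $d = 3$ below are just illustrative.)

```python
import numpy as np

# Rank-d approximation by truncating the eigendecomposition A = V diag(lambda) V^T:
# keep only the d eigenvectors with the largest |lambda_i|.
rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2

eigvals, V = np.linalg.eigh(A)
order = np.argsort(np.abs(eigvals))[::-1]   # largest-magnitude eigenvalues first
eigvals, V = eigvals[order], V[:, order]

d = 3
A_d = V[:, :d] @ np.diag(eigvals[:d]) @ V[:, :d].T   # rank-d approximation of A

assert np.linalg.matrix_rank(A_d) == d
# The error A - A_d = sum_{i>d} lambda_i v_i v_i^T is controlled by the
# discarded eigenvalues: its spectral norm is |lambda_{d+1}|.
err = np.linalg.norm(A - A_d, 2)
assert np.isclose(err, np.abs(eigvals[d]))
```

Every column of `A_d` then approximates the corresponding column of `A`, with error governed by the eigenvalues you threw away.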
I realize this is essentially why SVD or PCA is used. Would the method above work? Would it allow us to drop any columns? From what I understand, that's the goal when reducing data to more manageable sizes.
Thanks!