What is the relationship between the first principal component(s) and the correlation matrix, or the average correlation, of the data? For example, in an empirical application I observe that the average correlation is almost the same as the ratio of the variance of the first principal component (the first eigenvalue) to the total variance (the sum of all eigenvalues).

Is there a mathematical relationship?

1 Answer

Short Answer: When PCA is run on standardized data, the principal components (more precisely, the principal directions) are the eigenvectors of the correlation matrix. Therefore, multiplying the correlation matrix $C$ by a principal component $V$ gives back the same vector $V$ scaled by the corresponding eigenvalue $\lambda$:

$$ CV = \lambda V $$
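
As a rough illustration of why the ratio in the question can track the average correlation, assume (this is an idealization, not something stated in the question) that all $n$ variables share one common pairwise correlation $\rho > 0$. Writing $\mathbf{1}$ for the all-ones vector and $I$ for the identity (both symbols introduced here for the sketch), the correlation matrix and its leading eigenvector are:

$$ C = (1-\rho)\, I + \rho\, \mathbf{1}\mathbf{1}^T, \qquad C\,\mathbf{1} = \big(1 + (n-1)\rho\big)\,\mathbf{1} $$

So $\lambda_1 = 1 + (n-1)\rho$, and because the diagonal of $C$ is all ones, the eigenvalues sum to $n$:

$$ \frac{\lambda_1}{\sum_k \lambda_k} = \frac{1 + (n-1)\rho}{n} = \rho + \frac{1-\rho}{n} $$

For large $n$ this ratio is close to $\rho$. For a general correlation matrix with average off-diagonal correlation $\bar\rho$, the same vector $\mathbf{1}/\sqrt{n}$ only gives a lower bound, $\lambda_1 \ge 1 + (n-1)\bar\rho$, so the ratio is at least $\bar\rho + (1-\bar\rho)/n$ and need not equal it.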

Details: Given $n$-dimensional data ($x_i \in \mathbb{R}^n$), suppose we have $m$ data points stored as the rows of an $m \times n$ matrix $X$. Let $Cor(i,j)$ denote the correlation between dimensions (columns) $i$ and $j$, indexed from $0$ to $n-1$. The correlation matrix is then defined as:

$$ C = \begin{bmatrix} Cor(0,0) & Cor(0,1) & \cdots & Cor(0, n-1) \\ Cor(1,0) & Cor(1,1) & \cdots & Cor(1, n-1) \\ \vdots & \vdots & \ddots & \vdots \\ Cor(n-1,0) & Cor(n-1,1) & \cdots & Cor(n-1, n-1) \end{bmatrix} $$

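Since the correlation matrix is a symmetric square matrix of size $n \times n$, it has $n$ eigenvectors that can be chosen mutually orthogonal, and these vectors are the principal components of the data. Each principal component $V$ is of size $n \times 1$, and its corresponding eigenvalue $\lambda$ is a scalar equal to the variance of the standardized data along $V$. Because every diagonal entry of $C$ is 1, the eigenvalues sum to $n$, so the share of total variance explained by the first component is $\lambda_1 / n$.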
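
The relationship can be checked numerically. The following is a minimal sketch, assuming NumPy and synthetic data built from one shared factor plus noise. The data, the `0.7` loading, and the helper name `correlation_pca` are illustrative assumptions, not part of the question. It verifies $CV = \lambda V$ for the first component and compares the explained-variance ratio with the average off-diagonal correlation.

```python
import numpy as np

def correlation_pca(X):
    """Eigen-decompose the correlation matrix of X (rows are observations).

    Hypothetical helper for illustration only.
    """
    C = np.corrcoef(X, rowvar=False)       # n x n correlation matrix of the columns
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh handles symmetric matrices; ascending order
    order = np.argsort(eigvals)[::-1]      # reorder so the largest eigenvalue comes first
    return C, eigvals[order], eigvecs[:, order]

# Synthetic data (assumption): one common factor shared by all columns, plus noise.
rng = np.random.default_rng(0)
m, n = 500, 6
common = rng.normal(size=(m, 1))
X = 0.7 * common + rng.normal(size=(m, n))

C, eigvals, eigvecs = correlation_pca(X)
V, lam = eigvecs[:, 0], eigvals[0]         # first principal component and its eigenvalue

# The eigenvector equation C V = lambda V holds up to floating-point error.
assert np.allclose(C @ V, lam * V)

explained_ratio = lam / eigvals.sum()          # lambda_1 / sum of eigenvalues
avg_corr = (C.sum() - n) / (n * (n - 1))       # mean of the off-diagonal entries

print(f"explained ratio of PC1: {explained_ratio:.3f}")
print(f"average correlation:    {avg_corr:.3f}")
```

On data like this, the two printed numbers should be close but not identical; the gap shrinks as the number of columns grows and as the pairwise correlations become more uniform.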