2

I'm studying pattern classification and there are some terms that I'm not familiar with.

The first term I don’t understand is “covariance matrix”. Suppose I have a database of $k$ samples, where each one is a feature vector of length $d$.

How do I build the covariance matrix from my database, and what is the meaning of that matrix?

  • 3
    This is a rather large topic. Have you tried consulting standard sources such as statistics textbooks or even [Wikipedia](http://en.wikipedia.org/wiki/Covariance_matrix)? In short, the covariance matrix encodes many useful properties of the underlying distribution, the rough shape of the distribution being one of them. 2012-12-09
  • 1
    Wikipedia was too generic for me. 2012-12-09
  • 1
    The article details exactly how you construct a covariance matrix. As for what the matrix means, it is a measure of the spread of the data. Is there something specific that you would like to know? 2012-12-09

1 Answer

3

The covariance matrix consists of covariances $\sigma_{i,j} = \mathrm{Cov}(X_i, X_j)$ off the diagonal and variances $\sigma_{i,i} = \mathrm{Variance}(X_i)$ on the diagonal. (The standard deviation of $X_i$ is $\sqrt{\sigma_{i,i}}$.)

$$\begin{bmatrix} \color{Olive}{\sigma_{1,1}} & \sigma_{1,2} & \sigma_{1,3} \\ \sigma_{2,1} & \color{Olive}{\sigma_{2,2}} & \sigma_{2,3} \\ \sigma_{3,1} & \sigma_{3,2} & \color{Olive}{\sigma_{3,3}} \end{bmatrix}$$
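To build this from the database in the question, one common choice is the unbiased sample estimate. This is an assumption on my part (other normalizations, such as dividing by $k$, are also used), and the notation $x^{(n)}$ for the $n$-th sample and $\bar{x}$ for the sample mean is introduced here only for illustration:

$$\bar{x}_i = \frac{1}{k}\sum_{n=1}^{k} x^{(n)}_i, \qquad \hat{\sigma}_{i,j} = \frac{1}{k-1}\sum_{n=1}^{k} \left(x^{(n)}_i - \bar{x}_i\right)\left(x^{(n)}_j - \bar{x}_j\right), \qquad i, j = 1, \dots, d.$$

With $d$ features this gives a $d \times d$ matrix; the $3 \times 3$ matrix above corresponds to $d = 3$.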

Notice that $\sigma_{i,j} = \sigma_{j,i}$. (Think about the meaning of "correlation", which derives from $\sigma_{i,j}$. I can't be correlated with you if you're not correlated with me.)
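As a rough sketch of how the matrix could be computed from $k$ samples of length $d$, here is a plain-Python version. The function name `covariance_matrix` and the toy data are hypothetical, and the $k - 1$ normalization (the unbiased sample estimate) is an assumption, not something the question specifies:

```python
def covariance_matrix(samples):
    """Unbiased sample covariance of k samples, each a length-d sequence.

    Assumption: normalizes by k - 1; divide by k instead for the
    maximum-likelihood (biased) estimate.
    """
    k = len(samples)
    if k < 2:
        raise ValueError("need at least two samples")
    d = len(samples[0])

    # Mean of each feature over all samples.
    means = [sum(x[i] for x in samples) / k for i in range(d)]

    # Entry (i, j) sums the products of the deviations of features i and j.
    # Only the upper triangle is computed; symmetry fills in the rest.
    cov = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i, d):
            s = sum((x[i] - means[i]) * (x[j] - means[j]) for x in samples)
            cov[i][j] = cov[j][i] = s / (k - 1)
    return cov


# Toy data (made up for illustration): k = 3 samples, d = 2 features,
# where the second feature is exactly twice the first.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
cov = covariance_matrix(samples)
print(cov)  # [[1.0, 2.0], [2.0, 4.0]]
```

The diagonal entries (1.0 and 4.0) are the variances of the two features, and the off-diagonal entry appears twice because the matrix is symmetric. If NumPy is available, `numpy.cov(samples, rowvar=False)` should give the same matrix with its default normalization.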

You might want to check out http://stats.stackexchange.com as well.

  • 0
    Hey, don't take it the wrong way, but I came across your answer, and I'm not sure it is completely accurate - the entries in the diagonal are variances, not standard deviations. Wouldn't you agree? 2017-02-09