
I am still trying to figure out how to use the Mahalanobis distance, and I am having trouble producing my own covariance matrix. The relevant link seems to be http://en.wikipedia.org/wiki/Estimation_of_covariance_matrices, but I am still stuck. What I have is a set of points in space (my entire data set). I want to find the distance between any two of them using the Mahalanobis distance, but how do I form the covariance matrix? The webpage is not very clear to me.

1 Answer


Let the points be $\{\mathbf{x}_i\}_{i=1}^N$, each a column vector. Then you can compute the sample covariance as follows:

\begin{align} \Sigma &= \frac{1}{N-1}\sum_{i=1}^N (\mathbf{x}_i - \mu)(\mathbf{x}_i - \mu)^T \end{align}

where

\begin{align} \mu &= \frac{1}{N}\sum_{i=1}^N \mathbf{x}_i \end{align} is the sample mean.

Note that I have used $N-1$ in the denominator for $\Sigma$ instead of $N$ to make the estimator unbiased.
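For a concrete check, here is a minimal NumPy sketch of the two formulas above, followed by the Mahalanobis distance $\sqrt{(\mathbf{x}-\mathbf{y})^T \Sigma^{-1} (\mathbf{x}-\mathbf{y})}$ that the question is after. The function names `sample_covariance` and `mahalanobis` and the five sample points are made up for illustration; the rows of `X` are assumed to be the data points, and $\Sigma$ is assumed to be invertible.

```python
import numpy as np

def sample_covariance(X):
    """Unbiased sample covariance of the rows of X (shape N x d)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    mu = X.mean(axis=0)        # sample mean, shape (d,)
    centered = X - mu          # row i is x_i - mu
    # centered.T @ centered equals the sum of the outer products
    # (x_i - mu)(x_i - mu)^T over all i.
    return centered.T @ centered / (n - 1)

def mahalanobis(x, y, sigma):
    """Mahalanobis distance between x and y for covariance matrix sigma."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Solve sigma z = diff rather than forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(sigma, diff)))

# Hypothetical data: five points in the plane, one per row.
X = np.array([[2.0, 1.0], [3.0, 4.0], [5.0, 2.0], [6.0, 7.0], [9.0, 6.0]])
sigma = sample_covariance(X)
d01 = mahalanobis(X[0], X[1], sigma)   # distance between the first two points
```

The result of `sample_covariance(X)` should agree with `np.cov(X, rowvar=False)`, which also divides by $N-1$ by default.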

  • Those are pretty much the same equations as on the link I gave. But as far as I understand, a covariance matrix is a square matrix. What would the entry at row i, column j look like? (2012-08-27)
  • Each term in the expression for $\Sigma$ is a square matrix, since it is the outer product of a vector with itself. The $(k,l)$ entry would look like: $\frac{1}{N-1}\sum_{i=1}^N (x_{i,k} - \mu_k)(x_{i,l} - \mu_l)$, where $x_{i,k}$ is the $k$-th entry of $\mathbf{x}_i$. This is nothing but the sample covariance between components $k$ and $l$ of $\mathbf{x}$. (2012-08-27)
  • Oh, I see. My thinking was that I would get a 1x1 matrix, but of course I am not multiplying it like that. Thanks! (2012-08-27)
  • You would get a $1\times 1$ matrix if you were taking inner products $(\mathbf{x}_i - \mu)^T(\mathbf{x}_i - \mu)$ of the column vectors instead. Summed and scaled the same way, those would just give you the trace (sum of diagonal entries) of the covariance matrix. (2012-08-27)
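The outer-versus-inner product point from the comments can be checked numerically. This is only an illustrative sketch: the random data, the seed, and the index pair `k, l` are arbitrary choices, and the indices are 0-based in the code while the formulas use 1-based indices.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))    # hypothetical data: N = 6 points in 3 dimensions
n, d = X.shape
mu = X.mean(axis=0)

# Outer products: each term (x_i - mu)(x_i - mu)^T is a d x d matrix.
sigma = sum(np.outer(x - mu, x - mu) for x in X) / (n - 1)

# Entry (k, l) computed directly from the scalar formula.
k, l = 0, 2
entry_kl = sum((x[k] - mu[k]) * (x[l] - mu[l]) for x in X) / (n - 1)

# Inner products: each term (x_i - mu)^T (x_i - mu) is a single number.
inner_sum = sum(np.dot(x - mu, x - mu) for x in X) / (n - 1)

print(sigma.shape)                              # a d x d matrix
print(np.isclose(entry_kl, sigma[k, l]))        # scalar formula matches the matrix entry
print(np.isclose(inner_sum, np.trace(sigma)))   # inner products give only the trace
```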