
This is my first post here, and I'm not a mathematician, so please go easy on me :)

In statistics there is a geometric interpretation of correlation that uses basic vector geometry. This is fairly straightforward, even for me: each of the two data vectors is centred and rescaled so that the length of the new vector equals the standard deviation of the elements of the original vector. The i'th element of the new vector x is then $x[i]=(X[i]-\operatorname{mean}(X))/\sqrt{n-1}$, where n is the number of elements in the vector and X is the original data vector.

The correlation between the two original vectors of data is then simply the cosine of the angle between the two transformed vectors. Neat!
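To make this concrete, here is a minimal numerical sketch of the transformation and the cosine. The helper names (`to_sd_vector`, `cosine`), the seed and the made-up data are my own assumptions for illustration only.

```python
import numpy as np

def to_sd_vector(data):
    """Centre the data and divide by sqrt(n - 1), so the length of the
    result equals the sample standard deviation of the data."""
    data = np.asarray(data, dtype=float)
    return (data - data.mean()) / np.sqrt(data.size - 1)

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)               # arbitrary seed
X = rng.normal(10.0, 2.0, size=50)           # made-up data
Y = 0.5 * X + rng.normal(0.0, 2.0, size=50)  # made-up data, related to X

x = to_sd_vector(X)
y = to_sd_vector(Y)

length_x = float(np.linalg.norm(x))          # length of the transformed vector
sd_X = float(np.std(X, ddof=1))              # sample standard deviation of X
cos_angle = cosine(x, y)                     # cosine of the angle between x and y
pearson_r = float(np.corrcoef(X, Y)[0, 1])   # ordinary correlation of X and Y

print(length_x, sd_X)        # these two should agree
print(cos_angle, pearson_r)  # and so should these two
```

The two printed pairs should agree up to floating-point rounding, whatever data are used.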

Now, the really interesting thing (for me) is this: suppose we have 3 independent vectors of data generated from the same probability distribution, call them X, Y and Z, so that the correlation between any two of them is close to zero. Then the correlation between X/Z and Y/Z is usually somewhere around 0.5 to 0.6, where X/Z is the element-by-element division of X by Z. I say "usually" because there are certain provisos, for example that the data come from a probability distribution where near-zero values are impossible or highly unlikely (and some others which I won't go into here).
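A quick simulation along these lines shows the effect. The uniform distribution on [5, 15] is purely my assumption here, picked because it keeps values well away from zero; the seed and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, for repeatability
n = 100_000                     # arbitrary sample size

# Assumption: uniform on [5, 15], chosen only so that near-zero values cannot occur.
X, Y, Z = rng.uniform(5.0, 15.0, size=(3, n))

def corr(u, v):
    """Pearson correlation between two data vectors."""
    return float(np.corrcoef(u, v)[0, 1])

r_xy = corr(X, Y)             # independent vectors: expected to be close to zero
r_ratio = corr(X / Z, Y / Z)  # element-by-element ratios sharing the divisor Z

print("corr(X, Y)     =", r_xy)
print("corr(X/Z, Y/Z) =", r_ratio)
```

In this setup the first number should be close to zero and the second should land in the neighbourhood of 0.5, even though X and Y themselves are unrelated.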

So you are probably wondering why I am posting this here and not on stats.se. Well, that's because I'm trying to get a handle on the geometric interpretation of this, and in particular, the geometric interpretation of dividing one vector by another, element by element, and I guess this is more of a maths concept.

EDIT/UPDATE: I was informed (in the real world) that it is not correct to try to interpret this element-by-element division of one vector by another geometrically. Instead, we should first take the natural log of the original data vectors (this is justified since the data are physical measurements), transform them as above, and then interpret the vectors $\ln (x)-\ln (z)$ and $\ln (y)-\ln (z)$ geometrically. Apparently "it follows" that the angle between them will be $\pi/3$, giving the required correlation of 0.5. Unfortunately it didn't "follow" for me! Could anyone help explain?
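One possible reconstruction of the argument, as a sketch only and under assumptions that were not spelled out in the advice: write $a$, $b$, $c$ (my own labels) for the transformed versions of $\ln X$, $\ln Y$, $\ln Z$. Because the transformation is linear, the transformed version of $\ln X-\ln Z$ is $a-c$, and likewise $b-c$ for $\ln Y-\ln Z$. Assume independence makes the three vectors roughly orthogonal, $a\cdot b\approx a\cdot c\approx b\cdot c\approx 0$, and that a common distribution gives them roughly equal length, $|a|\approx|b|\approx|c|\approx s$. Then:

$$
\begin{aligned}
(a-c)\cdot(b-c) &= a\cdot b - a\cdot c - c\cdot b + c\cdot c \approx s^2,\\
|a-c|^2 &= |a|^2 - 2\,a\cdot c + |c|^2 \approx 2s^2, \qquad |b-c|^2 \approx 2s^2,\\
\cos\theta &= \frac{(a-c)\cdot(b-c)}{|a-c|\,|b-c|} \approx \frac{s^2}{\sqrt{2}\,s\cdot\sqrt{2}\,s} = \frac{1}{2}, \qquad \theta \approx \frac{\pi}{3}.
\end{aligned}
$$

Pictured geometrically, $a$, $b$ and $c$ would be three mutually perpendicular edges of equal length, so their tips form an equilateral triangle with side $\sqrt{2}\,s$. The vectors $a-c$ and $b-c$ are the two sides of that triangle meeting at the tip of $c$, which is where the $60^\circ$ angle would come from.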

Thanks for reading.

  • 0
    Thank you André. I have received some offline advice on this and updated the question above. 2012-03-04
