
I just want to confirm whether this is correct or not:

The covariance of $X$ and $Y$ is equal to the standard deviation of $X$ times the standard deviation of $Y$.

or, in mathematical notation,

$ \mathrm{Cov}(X,Y) = \sqrt{\mathrm{Var}(X)}\sqrt{\mathrm{Var}(Y)} $

  • Comment: @cardinal: Looks fine to me. :) (2011-04-27)

2 Answers

Answer 1 (score 10)

This is false in general. In fact, it is true if and only if $X = aY + b$ (almost surely) for some fixed constants $a \geq 0$ and $b \in \mathbb{R}$. That is, $X$ and $Y$ must be positively linearly related for this to hold. Wikipedia also has a decent page on this.

For a counterexample to your statement, consider any two independent random variables $X$ and $Y$, each with strictly positive variance. Then $\mathrm{Cov}(X,Y) = 0$, but $\sqrt{\mathrm{Var}(X)}\,\sqrt{\mathrm{Var}(Y)} > 0$.
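A quick numerical illustration of this counterexample (not part of the original answer; numpy, the chosen distributions, and the seed are just assumptions for demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent random variables, each with strictly positive variance.
x = rng.normal(loc=0.0, scale=1.0, size=100_000)
y = rng.normal(loc=5.0, scale=2.0, size=100_000)

cov_xy = np.cov(x, y)[0, 1]                 # sample covariance, close to 0
sd_prod = x.std(ddof=1) * y.std(ddof=1)     # close to 1 * 2 = 2

print(f"Cov(X,Y)      ~ {cov_xy:.4f}")
print(f"sd(X)*sd(Y)   ~ {sd_prod:.4f}")
```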


A quick proof (and a slick one, I think; it's not my own) of the if and only if assertion uses the Cauchy–Schwarz inequality and goes as follows. Let $U$ and $V$ be random variables such that $\newcommand{\e}{\mathbb{E}}\e U^2 < \infty$ and $\e V^2 < \infty$. Then, $|\e UV|^2 < \e U^2 \e V^2$ if and only if $\e (t U - V)^2 > 0$ for all $t$. But, if $\e(t_0 U - V)^2 = 0$ for some $t_0$, then $t_0 U = V$ almost surely. Now, set $U = X - \e X$ and $V = Y - \e Y$ to get the desired conclusion.
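To spell out the discriminant step behind that argument (a standard expansion, not part of the original answer): $\mathbb{E}(tU - V)^2 = t^2\,\mathbb{E}U^2 - 2t\,\mathbb{E}UV + \mathbb{E}V^2 \geq 0$ for all $t$, so this quadratic in $t$ has non-positive discriminant, $4(\mathbb{E}UV)^2 - 4\,\mathbb{E}U^2\,\mathbb{E}V^2 \leq 0$, i.e. $|\mathbb{E}UV|^2 \leq \mathbb{E}U^2\,\mathbb{E}V^2$. When $\mathbb{E}U^2 > 0$, equality holds exactly when the quadratic vanishes at some $t_0$, which is the case $\mathbb{E}(t_0 U - V)^2 = 0$ used above.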

  • Comment: @Henry: (+1) Thank you; it's not pedantic at all. I'm not sure how you came across such an old answer, but I appreciate your comment very much. I'll try to think of a way to clean up the statement. This should only be possible if one of the random variables is almost surely constant. (2012-05-15)
Answer 2 (score 6)

While cardinal has answered the question, I wanted to add a little something extra.

The Cauchy–Schwarz inequality shows that $\left|\operatorname{Cov}(X,Y)\right| \leq \sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}$, with equality if and only if $X$ and $Y$ are linearly dependent. In general, you can take the ratio $\rho = \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}$, which lies between $-1$ and $1$. This is the correlation between $X$ and $Y$. It is $1$ when the variables are perfectly correlated (one goes up exactly when the other does), $-1$ when they are perfectly anti-correlated (one goes up exactly when the other goes down), and $0$ when they are independent (knowing that one goes up tells you nothing about the behavior of the other), although zero correlation does not by itself imply independence. In between, it gives a measure of how close the relationship is to one of these extremes.
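As a small illustration of those three cases (not from the original answer; the helper function, numpy usage, and seed are assumptions for demonstration):

```python
import numpy as np

def correlation(x: np.ndarray, y: np.ndarray) -> float:
    """Sample version of rho = Cov(X, Y) / (sd(X) * sd(Y))."""
    cov_xy = np.cov(x, y, ddof=1)[0, 1]
    return cov_xy / (x.std(ddof=1) * y.std(ddof=1))

rng = np.random.default_rng(1)
x = rng.normal(size=10_000)

print(correlation(x, 2 * x + 3))                 # ~ +1: perfectly correlated
print(correlation(x, -x))                        # ~ -1: perfectly anti-correlated
print(correlation(x, rng.normal(size=10_000)))   # ~  0: independent samples
```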

One of the first places where this comes up is least squares approximation. While you can always do a least squares approximation to get a line of best fit for your data points, the correlation coefficient tells you whether that line of best fit is actually a good fit. It will be near $0$ for data that is essentially random, and small in magnitude for data that is non-linear.
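A minimal sketch of that point (again not from the original answer; numpy, the synthetic data, and the seed are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 200)

# Roughly linear data: the fitted line is a good summary, |r| is near 1.
y_linear = 3 * x + 1 + rng.normal(scale=1.0, size=x.size)

# Essentially random data: a line can still be fitted, but r is near 0.
y_random = rng.normal(scale=5.0, size=x.size)

for label, y in [("linear", y_linear), ("random", y_random)]:
    slope, intercept = np.polyfit(x, y, deg=1)   # least-squares line of best fit
    r = np.corrcoef(x, y)[0, 1]                  # correlation coefficient
    print(f"{label}: slope={slope:.2f}, intercept={intercept:.2f}, r={r:.2f}")
```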

  • Comment: @cardinal Yes. I guess that by "non-linear" I actually mean "not well approximated by anything linear" or "very non-linear". On the other hand, this could be used as a definition of what it means for noisy data to be linear/non-linear. (2011-04-27)