2

I just want to confirm whether this is correct or not:

The covariance of $X$ and $Y$ is equal to the standard deviation of $X$ times the standard deviation of $Y$.

or, in mathematical notation,

$$ \mathrm{Cov}(X,Y) = \sqrt{\mathrm{Var}(X)}\sqrt{\mathrm{Var}(Y)} $$

  • 0
    I see. Thanks for the help. 2011-04-27
  • 0
    @cardinal: ... and perhaps also posted the answer as an answer instead of as a comment? As it is now, the question is answered even though it appears as unanswered on the front page. 2011-04-27
  • 0
    @Hans, fair enough. I've copied my comment into an answer and deleted the comments. 2011-04-27
  • 0
    @Hans, also, if you don't mind, please review my edit of the question and approve it, if appropriate. I think that will also make my answer match up better with the question. Cheers. 2011-04-27
  • 0
    @cardinal: Looks fine to me. :) 2011-04-27

2 Answers

10

This is false in general. In fact, it is true if and only if $X=aY+b$ (almost surely) for some fixed constants $a \geq 0$ and $b \in \mathbb{R}$. That is, $X$ and $Y$ must be positively linearly related for this to hold. Wikipedia also has a decent page on this.
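
The "if" direction can be checked by a short computation. As a sketch, suppose $X = aY + b$ for constants $a$ and $b$ (no sign assumption on $a$ yet):

$$
\begin{aligned}
\mathrm{Cov}(X,Y) &= \mathrm{Cov}(aY+b,\,Y) = a\,\mathrm{Var}(Y), \\
\sqrt{\mathrm{Var}(X)}\sqrt{\mathrm{Var}(Y)} &= \sqrt{a^2\,\mathrm{Var}(Y)}\,\sqrt{\mathrm{Var}(Y)} = |a|\,\mathrm{Var}(Y).
\end{aligned}
$$

The two sides agree exactly when $a\,\mathrm{Var}(Y) = |a|\,\mathrm{Var}(Y)$, that is, when $a \geq 0$ or $\mathrm{Var}(Y) = 0$.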

For a counterexample to your statement, consider any two independent random variables $X$ and $Y$ each with strictly positive variance. Then $\mathrm{Cov}(X,Y) = 0$, but $\sqrt{\mathrm{Var}(X)} \sqrt{\mathrm{Var}(Y)} > 0$.
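
To make the counterexample concrete, here is a minimal sketch that computes both sides exactly for one particular choice: two independent fair coin flips taking the values 0 and 1. The choice of distribution and the helper name `expect` are assumptions made for illustration; any independent pair with positive variances would do.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Illustrative assumption: X and Y are independent fair coin flips in {0, 1}.
values = [0, 1]
p = Fraction(1, 2)

# For independent variables the joint pmf is the product of the marginals.
joint = {(x, y): p * p for x, y in product(values, values)}

def expect(f):
    """Expectation of f(X, Y) under the joint pmf."""
    return sum(prob * f(x, y) for (x, y), prob in joint.items())

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mean_x) ** 2)
var_y = expect(lambda x, y: (y - mean_y) ** 2)
cov_xy = expect(lambda x, y: (x - mean_x) * (y - mean_y))

# Product of the two standard deviations (converted to float by sqrt).
sd_product = sqrt(var_x) * sqrt(var_y)

print("Cov(X, Y)     =", cov_xy)
print("sd(X) * sd(Y) =", sd_product)
```

The covariance comes out as exactly zero while the product of standard deviations is strictly positive, so the proposed identity fails for this pair.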


A quick proof (and a slick one, I think; it's not my own) of the if and only if assertion uses the Cauchy–Schwarz inequality and goes as follows. Let $U$ and $V$ be random variables such that $\newcommand{\e}{\mathbb{E}}\e U^2 < \infty$ and $\e V^2 < \infty$. Then, $|\e UV|^2 < \e U^2 \e V^2$ if and only if $\e (t U - V)^2 > 0$ for all $t$. But, if $\e(t_0 U - V)^2 = 0$ for some $t_0$, then $t_0 U = V$ almost surely. Now, set $U = X - \e X$ and $V = Y - \e Y$ to get the desired conclusion.
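
The equivalence used in the middle of that argument comes from reading $\mathbb{E}(tU - V)^2$ as a quadratic in $t$. Spelled out as a sketch, under the extra assumption that $\mathbb{E}U^2 > 0$ (so that it really is a quadratic):

$$
\mathbb{E}(tU - V)^2 = t^2\,\mathbb{E}U^2 - 2t\,\mathbb{E}UV + \mathbb{E}V^2 \;\geq\; 0 \quad \text{for all } t .
$$

A quadratic that is never negative has discriminant at most zero, so $4(\mathbb{E}UV)^2 - 4\,\mathbb{E}U^2\,\mathbb{E}V^2 \leq 0$, which is the Cauchy–Schwarz inequality. The discriminant is strictly negative exactly when the quadratic has no real root, that is, when $\mathbb{E}(tU - V)^2 > 0$ for all $t$; it is zero exactly when there is a root $t_0$, which is the equality case.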

  • 1
    +1 though with a pedantic point: suppose $Z$ is a uniform random variable from $[0,1)$, $Y=1$ if $Z$ is rational and $Y=0$ if $Z$ is irrational, and $X=1-Y$. Then $\mathrm{Cov}(X,Y) = \sqrt{\mathrm{Var}(X)}\sqrt{\mathrm{Var}(Y)}$ even though $a \lt 0$. 2012-05-14
  • 0
    @Henry: (+1) Thank you; it's not pedantic at all. I'm not sure how you came across such an old answer, but I appreciate very much your comment. I'll try to think of a way to clean up the statement. This should only be possible if one of the random variables is almost surely constant. 2012-05-15

6

While cardinal has answered the question, I wanted to add a little something extra.

The Cauchy-Schwarz inequality shows that $\left|\operatorname{Cov}(X,Y)\right|\leq \sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}$, with equality if and only if $X$ and $Y$ are linearly dependent. In general, provided both variances are nonzero, you can take the ratio $\rho=\frac{\operatorname{Cov}(X,Y)}{ \sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}$, which will be between $-1$ and $1$. This gives the correlation between $X$ and $Y$. It is $1$ when the variables are perfectly correlated (one goes up exactly when the other does), $-1$ when they are perfectly anti-correlated (one goes up exactly when the other goes down), $0$ when they are independent (knowing that one goes up tells you nothing about the behavior of the other; the converse fails, since uncorrelated variables need not be independent), and in general gives a measure of the behavior when things are somewhere in between.
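
The extreme values of $\rho$ are easy to see on a small data set. The following is a minimal sketch, assuming each listed pair is equally likely; the function name `correlation` and the sample values are made up for illustration.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation of paired values, treating each pair as equally likely."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / sqrt(var_x * var_y)

# Made-up sample points, symmetric about zero.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]

rho_up = correlation(xs, [3 * x + 1 for x in xs])     # increasing linear relation
rho_down = correlation(xs, [-3 * x + 1 for x in xs])  # decreasing linear relation
rho_square = correlation(xs, [x ** 2 for x in xs])    # y is a function of x

print(rho_up, rho_down, rho_square)
```

The two linear relations give the extreme values $1$ and $-1$. The last case gives $\rho = 0$ even though $y$ is completely determined by $x$, a reminder that $\rho$ only measures linear association.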

One of the first places where this comes up is least squares approximation. While you can always do a least squares approximation to get a line of best fit for your data points, the correlation coefficient tells you whether your line of best fit is actually a good fit. It will be near $0$ for data that is essentially random, and small in magnitude for data which is non-linear.
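
As a rough illustration of how the correlation coefficient accompanies a line of best fit, here is a minimal sketch using the standard closed-form least squares formulas. The helper `fit_line` and all three data sets are invented for this example and are not taken from any real data.

```python
from math import sqrt, exp

def fit_line(xs, ys):
    """Least squares line y = slope * x + intercept, plus the correlation r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]

# Made-up data sets for illustration only.
datasets = {
    "nearly linear": [2 * x + 1 + 0.1 * (-1) ** i for i, x in enumerate(xs)],
    "y = x^2": [x ** 2 for x in xs],
    "y = exp(x)": [exp(x) for x in xs],
}

for name, ys in datasets.items():
    slope, intercept, r = fit_line(xs, ys)
    print(f"{name:14s} slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A line can be fitted in all three cases, but only the first has $r$ close to $1$. The symmetric parabola gives $r = 0$, while the exponential curve still gives a fairly large $r$ (around $0.8$ here), so how small $r$ becomes depends on the kind of nonlinearity.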

  • 0
    (+1) Your last statement depends strongly on the type of nonlinearity present. There are many cases where even in the presence of considerable nonlinearity, the correlation coefficient will not be small in magnitude. 2011-04-27
  • 0
    @cardinal Yes. I guess that by "non-linear" I actually mean "not well approximated by anything linear" or "very non-linear". On the other hand, this could be used as a definition of what it means for noisy data to be linear/non-linear. 2011-04-27