1

I have an engineer friend who wants to have intuition about covariance.

I can state its formula and draw some scatter plots, but I could not give him any real insight.

He is more comfortable with the variance of a series of numbers (though not entirely comfortable), but he wants to grasp the meaning of covariance (especially for random variables).

What do you advise?

  • 0
    What's wrong with the introductory paragraph on Wikipedia: https://en.wikipedia.org/wiki/Covariance ? (2017-01-01)
  • 0
    @parsiad He feels the concept of correlation as an engineer. but he can't deal with why we define it in such a way. It makes no sense to him. (2017-01-01)
  • 0
    This question might be better off at http://matheducators.stackexchange.com/ This might also be interesting to you: http://stats.stackexchange.com/questions/18082/how-would-you-explain-the-difference-between-correlation-and-covariance (2017-01-01)
  • 0
    A note on correlation vs. covariance, since I ended up slightly confused after reading all the answers: "(In fact, correlation coefficients can simply be understood as a normalized version of covariance.)" from https://en.wikipedia.org/wiki/Covariance (2017-02-13)
  • 0
    @ntg I didn't get my desired answer. Thanks for your answer... (2017-02-13)
  • 1
    I have made some slight improvements to my presentation. (2017-02-13)
  • 0
    @JeanMarie Thanks... (2017-02-13)

2 Answers

2

Here is a simple example that I have been trying out with students for the same purpose: understanding covariance and correlation, two closely related concepts.

In fact, I will stress the correlation aspect for two reasons:

  • it is a little simpler for the newcomer to probability.

  • correlation is interpretable in terms of simulation.

Consider four measurements of the same quantity, modeled by four i.i.d. (independent, identically distributed) random variables $X_1, X_2, X_3, X_4.$

Let us now define new random variables as the (positive or negative) gaps between consecutive measurements: $$\begin{cases}Y_1=X_2-X_1\\Y_2=X_3-X_2\\Y_3=X_4-X_3\end{cases}$$

What is the correlation matrix of $(Y_1,Y_2,Y_3)$?

A first remark is that $corr(Y_1,Y_3)=0$: since $X_3,X_4$ are independent of $X_1,X_2$, $Y_3$ is independent of $Y_1$.

By contrast, $Y_2$ should not be independent of $Y_1$, since both involve $X_2$. But what kind of correlation should we expect?

Let us assume that the $X_k$ have a uniform distribution on $[0,1]$.

Let us run the following simulation program, written in Matlab but understandable without any knowledge of this language:

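    n=1000000;      % number of simulated realizations
    X=rand(4,n);    % each column is one draw of (X_1,X_2,X_3,X_4), uniform on [0,1]
    Y=diff(X);      % row differences: Y_k = X_{k+1} - X_k
    corrcoef(Y')    % 3 x 3 correlation matrix (each row of Y' is one observation)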

where $X$ is a $4 \times n$ array of $n$ realizations of $(X_1,X_2,X_3,X_4)$, $Y$ is a $3 \times n$ array of $n$ realizations of $(Y_1,Y_2,Y_3)$ and corrcoef(Y') is the $ 3 \times 3$ "experimental" matrix of correlation coefficients $r_{ij}=corr(Y_i,Y_j)$. Here is such a matrix obtained by running this program:

$$\pmatrix{1.0000&-0.4998&0.0014\\-0.4998&1.0000&-0.5020\\0.0014&-0.5020&1.0000} \ \ \ \text{to be compared to :} \ \ \ \pmatrix{1&-1/2&0\\-1/2&1&-1/2\\0&-1/2&1}$$

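which is the theoretical matrix that we are going to justify. More precisely, since $corr(Y_1,Y_3)=0$ has already been explained and $corr(Y_2,Y_3)$ is obtained in the same way as $corr(Y_1,Y_2)$, it suffices to justify that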

$$\tag{1}corr(Y_1,Y_2)=-\tfrac12.$$

A rigorous justification of (1) will be given in the Appendix.
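For readers without Matlab, the simulation above can be sketched in plain Python. This is only an illustrative port using the standard library; the function names `pearson` and `gap_correlations`, the seed, and the smaller default sample size are assumptions made here, not part of the original program.

```python
import random
from math import sqrt


def pearson(u, v):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


def gap_correlations(n=100_000, seed=0):
    """Empirical 3 x 3 correlation matrix of (Y_1, Y_2, Y_3).

    Each trial draws X_1..X_4 i.i.d. uniform on [0, 1] and records the
    consecutive gaps Y_k = X_{k+1} - X_k.
    """
    rng = random.Random(seed)
    y = [[], [], []]
    for _ in range(n):
        x = [rng.random() for _ in range(4)]
        for k in range(3):
            y[k].append(x[k + 1] - x[k])
    return [[pearson(y[i], y[j]) for j in range(3)] for i in range(3)]


if __name__ == "__main__":
    for row in gap_correlations():
        print("  ".join(f"{r: .4f}" for r in row))
```

With a large enough `n`, the printed matrix should be close to the theoretical one: roughly $-1/2$ next to the diagonal and roughly $0$ in the corners.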

Nevertheless, in the spirit of your question, let us show that there is an intuitive explanation of (1), in particular of the negative sign of the correlation coefficient. Imagine that $Y_1=X_2-X_1$ is bigger than $1/2$, which forces $X_2>1/2$, meaning that $X_2$ is closer to $1$ than to $0$; in such a case, it is more likely that $X_3$ is smaller than $X_2$, i.e., that $Y_2=X_3-X_2<0$. In other words, knowing $Y_1$ lets us predict, to a certain extent, the sign of $Y_2$.
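This intuition can be checked numerically. The sketch below, in plain Python, estimates how often $Y_2<0$ among the trials where $Y_1>1/2$; the function name, the seed and the sample size are hypothetical choices made for illustration.

```python
import random


def prob_y2_negative_given_large_y1(n=400_000, threshold=0.5, seed=0):
    """Estimate P(Y_2 < 0 | Y_1 > threshold) for X_1, X_2, X_3 i.i.d. uniform on [0, 1]."""
    rng = random.Random(seed)
    kept = 0      # trials where Y_1 = X_2 - X_1 exceeds the threshold
    negative = 0  # among those, trials where Y_2 = X_3 - X_2 is negative
    for _ in range(n):
        x1, x2, x3 = rng.random(), rng.random(), rng.random()
        if x2 - x1 > threshold:
            kept += 1
            if x3 - x2 < 0:
                negative += 1
    return negative / kept


if __name__ == "__main__":
    print(prob_y2_negative_given_large_y1())
```

The estimate should come out well above $1/2$: once $Y_1$ is known to be large, a negative $Y_2$ is much more likely than a positive one.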

An overall picture of this negative trend is also provided by the following scatter plot, where each point has coordinates $(Y_1,Y_2)$ (think of the equivalent ellipse): the cases where $Y_1$ and $Y_2$ have different signs are dominant.

[Figure: scatter plot of the simulated points $(Y_1,Y_2)$]

Appendix: Rigorous proof of relationship (1)

Let us first recall that, if $X$ is a uniformly distributed random variable on $[0,1]$, then

$$\tag{2}V(X)=\frac{1}{12}.$$
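For completeness, here is a short way to recover (2) from the moments of the uniform distribution on $[0,1]$ (a standard computation, spelled out here as a reminder):

$$E(X)=\int_0^1 x\,dx=\frac12,\qquad E(X^2)=\int_0^1 x^2\,dx=\frac13,\qquad V(X)=E(X^2)-E(X)^2=\frac13-\frac14=\frac{1}{12}.$$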

As a direct consequence, due to the mutual independence of $X_1$ and $X_2$:

$$V(Y_1)=V(X_2-X_1)=V(X_2)+(-1)^2 V(X_1)=\frac{1}{6}$$

Using the bilinearity of covariance and the fact that the $X_k$ are mutually independent:

$$\begin{aligned}cov(Y_1,Y_2)&=cov(X_2-X_1,X_3-X_2)\\&=cov(X_2,X_3)-cov(X_2,X_2)-cov(X_1,X_3)+cov(X_1,X_2)\\&=0-V(X_2)-0+0=-\frac{1}{12}\end{aligned}$$

Thus, since $V(Y_2)=V(Y_1)=\frac{1}{6}$ by the same computation as above,

$$corr(Y_1,Y_2)=\frac{cov(Y_1,Y_2)}{\sqrt{V(Y_1)V(Y_2)}}=\frac{-1/12}{1/6}=-\frac12,$$

as expected.

Remark: A similar computation could be done for $X_1,X_2,X_3,X_4$ having a $N(0,1)$ distribution. We would obtain the same correlation matrix.
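The remark can be checked with the same steps as in the Appendix. A sketch, assuming only that the $X_k$ are i.i.d. with a finite, nonzero variance (the symbol $\sigma^2$ for this common variance is introduced here for illustration):

$$V(Y_1)=V(Y_2)=2\sigma^2,\qquad cov(Y_1,Y_2)=-V(X_2)=-\sigma^2,\qquad corr(Y_1,Y_2)=\frac{-\sigma^2}{\sqrt{2\sigma^2\cdot 2\sigma^2}}=-\frac12,$$

so the value $-\tfrac12$ does not depend on the common distribution of the $X_k$; the $N(0,1)$ case corresponds to $\sigma^2=1$.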

  • 0
    Thanks. What if we consider absolute differences? Will that lead to a positive correlation? (2017-01-02)
  • 1
    @Bellmondo I just checked by taking absolute values in the simulation above. The correlation coefficient becomes positive, but, rather surprisingly, with a much smaller value, around 0.1 (practically speaking: no correlation...) (2017-01-02)
  • 0
    I thought it could be positive because if $Y_1$ is big, then the values of $X_1,X_2$ must be extreme, and so $Y_2$ is bigger than average. (2017-01-02)
  • 1
    That's right... (2017-01-02)
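The experiment discussed in these comments can be sketched as follows in plain Python. It estimates the correlation between $|Y_1|$ and $|Y_2|$ for uniform $X_k$; the function name, the seed and the sample size are assumptions made for this illustration.

```python
import random
from math import sqrt


def abs_gap_correlation(n=200_000, seed=0):
    """Empirical correlation between |X_2 - X_1| and |X_3 - X_2|, X_k i.i.d. uniform on [0, 1]."""
    rng = random.Random(seed)
    a, b = [], []
    for _ in range(n):
        x1, x2, x3 = rng.random(), rng.random(), rng.random()
        a.append(abs(x2 - x1))
        b.append(abs(x3 - x2))
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n
    var_a = sum((u - mean_a) ** 2 for u in a) / n
    var_b = sum((v - mean_b) ** 2 for v in b) / n
    return cov / sqrt(var_a * var_b)


if __name__ == "__main__":
    print(abs_gap_correlation())
```

The result should be small and positive, in line with the value of about 0.1 reported above.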
1

A comment from the original poster says "He feels the concept of correlation as an engineer. but he can't deal with why we define it in such a way." Perhaps this means the question should be taken to be about correlation, although it is stated as being about covariance.

Note that when $x$ and $y$ are both bigger than average, the product of their deviations from the average is positive, and so it is also when they are both smaller than average. And being both bigger or both smaller is consistent with either of $x$ and $y$ increasing on average as the other increases. But if either is bigger and the other smaller, then the product of deviations is negative, and that's consistent with either decreasing on average as the other increases. And points a long way from the fulcrum exert more torque on the lever than do points close to it.
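The description above can be made concrete on a tiny data set. The sketch below, in plain Python, lists the product of deviations for each point and averages them (the population covariance); the five data points and the function names are made up for illustration.

```python
def deviation_products(xs, ys):
    """Product of the deviations from the mean, one value per data point."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]


def covariance(xs, ys):
    """Population covariance: the average of the deviation products."""
    products = deviation_products(xs, ys)
    return sum(products) / len(products)


# Hypothetical data: y tends to increase with x, with two points going the other way.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 3, 2, 5]

if __name__ == "__main__":
    for x, y, p in zip(xs, ys, deviation_products(xs, ys)):
        print(f"x={x}, y={y}, product of deviations = {p:+.1f}")
    print("covariance =", covariance(xs, ys))
```

Here the two extreme points (both below average, or both above average) each contribute $+4$, while the two points close to the mean that go against the trend contribute only $-1$ each. This is the lever effect mentioned above: points far from the averages weigh more in the sum.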

  • 0
    Yes. I told him that and he understands it! But he does not like it because it gives him no intuition... As an engineer, he likes to have a feel for what he works with. (2017-01-01)