
Consider $n$ independent trials, each of which results in one of the outcomes $i$, $\ i=1,2,3$, with respective probabilities $p_1,p_2,p_3,\ p_1+p_2+p_3=1$. Let $N_i$ denote the number of trials that result in outcome $i$. How do I show that $Cov(N_1,N_2)=-np_1p_2$? Why is it intuitive that the covariance is negative?

Currently, I have the following.

For $i=1,...,n$ let

$X_i = \begin{cases} 1 & \text{if trial } i \text{ results in outcome } 1 \\ 0 & \text{if trial } i \text{ does not result in outcome } 1\end{cases}$

Similarly, for $j=1,...,n$, let

$Y_j = \begin{cases} 1 & \text{if trial } j \text{ results in outcome } 2 \\ 0 & \text{if trial } j \text{ does not result in outcome } 2\end{cases}$

How do I argue the following?

$N_1 = \sum_{i=1}^{n}X_i,\ N_2=\sum_{j=1}^{n}Y_j$

How should I proceed using the properties of covariance from this point?
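
For a numeric sanity check of the identity in question, here is a minimal simulation sketch (assuming NumPy is available; the values of $n$ and the $p_i$ below are arbitrary illustration choices):

```python
import numpy as np

# Monte Carlo estimate of Cov(N1, N2) for n trials with probabilities p1, p2, p3.
rng = np.random.default_rng(0)
n, p = 10, [0.5, 0.3, 0.2]          # arbitrary example values
reps = 200_000

# Each row is one experiment: the counts (N1, N2, N3) from n trials.
counts = rng.multinomial(n, p, size=reps)
N1, N2 = counts[:, 0], counts[:, 1]

print(np.cov(N1, N2)[0, 1])         # sample covariance, approximately -1.5
print(-n * p[0] * p[1])             # the claimed value: -10 * 0.5 * 0.3 = -1.5
```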

  • The above was the intuition. (2012-09-29)

2 Answers


Why intuitive? Because if we know that $N_1$ is biggish, then there is an increased likelihood that $N_2$ is smallish, and vice-versa. They are negatively correlated.

To compute the covariance, note that $E(N_1)=E(X_1+\cdots+X_n)=E(X_1)+\cdots+E(X_n).$ But $E(X_i)=p_1$, so $E(N_1)=np_1$. Similarly, $E(N_2)=np_2$.

Finally, we must find $E(N_1N_2)$. So we want $E((X_1+X_2+\cdots+X_n)(Y_1+Y_2+\cdots +Y_n)).$ Imagine expanding $(X_1+X_2+\cdots+X_n)(Y_1+Y_2+\cdots +Y_n).$ Lots of terms. Note first that for any $i$, $X_iY_i=0$, since $X_i$ and $Y_i$ cannot be $1$ at the same time. So we can forget about these, and our expanded product is a sum of terms of the shape $X_iY_j$ where $i\ne j$. But $X_i$ and $Y_j$ are independent, so $E(X_iY_j)=E(X_i)E(Y_j)=p_1p_2$.

Finally, how many terms are there of the form $X_iY_j$ with $i\ne j$? For every choice of $i$, there are $n-1$ possibilities for $j$, giving a total of $n(n-1)$.

We conclude that $E(N_1N_2)=n(n-1)p_1p_2$. Subtracting $E(N_1)E(N_2)=n^2p_1p_2$ gives $Cov(N_1,N_2)=n(n-1)p_1p_2-n^2p_1p_2=-np_1p_2$.
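
To see the vanishing diagonal terms and the $n(n-1)$ cross terms concretely, here is a sketch that draws the trials explicitly and mirrors the expansion above (assuming NumPy; outcomes $1,2,3$ are coded as $0,1,2$, and the example values are arbitrary):

```python
import numpy as np

# Draw the n trial outcomes explicitly and form the indicators from the answer.
rng = np.random.default_rng(1)
n, p = 10, [0.5, 0.3, 0.2]                    # arbitrary example values
reps = 100_000

trials = rng.choice(3, size=(reps, n), p=p)   # outcomes coded 0, 1, 2
X = (trials == 0).astype(float)               # X_i = 1 iff trial i gave outcome 1
Y = (trials == 1).astype(float)               # Y_j = 1 iff trial j gave outcome 2

# Diagonal terms vanish: no trial gives outcomes 1 and 2 simultaneously.
print((X * Y).sum())                          # exactly 0.0 in every run

# N1*N2 expands into the n*(n-1) off-diagonal terms X_i*Y_j, each with mean p1*p2.
print(np.mean(X.sum(axis=1) * Y.sum(axis=1))) # approx n*(n-1)*p1*p2 = 13.5
print(n * (n - 1) * p[0] * p[1])              # 13.5
```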

Forgot: Why is $N_1=X_1+X_2+\cdots+X_n$? Call the three possible outcomes $A$, $B$, and $C$. We have $X_i=1$ if trial $i$ results in $A$, and $X_i=0$ otherwise. So adding up the $X_i$ just means writing down a $1$ whenever $A$ happens as $i$ runs from $1$ to $n$. The number of these $1$'s, which is the same as their sum, is the number of trials that resulted in $A$. This by definition is $N_1$.
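
The same counting-by-indicators point in a couple of lines of plain Python, with made-up trial results:

```python
# Counting by summing indicators, with made-up results of n = 5 trials.
outcomes = ["A", "C", "A", "B", "A"]
X = [1 if o == "A" else 0 for o in outcomes]  # the indicators X_1, ..., X_5
print(sum(X))  # 3 = number of trials resulting in A, i.e. N1
```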

---

Intuitively, $N_2$ tends to be larger when $N_1$ is small than when $N_1$ is large (fewer type 1 outcomes leave more trials free to be type 2 outcomes), and this leads to a negative covariance.

I would calculate using $Var(N_3) = Var(N_1+N_2) = Var(N_1)+2Cov(N_1,N_2)+Var(N_2).$ The first equality holds because $N_1+N_2+N_3=n$, so $N_3=n-(N_1+N_2)$, and negating and shifting by the constant $n$ does not change the variance. Each $N_i$ is binomial with parameters $n$ and $p_i$, so $Var(N_i)=np_i(1-p_i)$, giving

$np_3(1-p_3) = np_1(1-p_1)+2Cov(N_1,N_2)+np_2(1-p_2)$

$n(1-p_1-p_2)(p_1+p_2) = np_1(1-p_1)+2Cov(N_1,N_2)+np_2(1-p_2)$

Expanding both sides, the $np_1$, $np_2$, $-np_1^2$, and $-np_2^2$ terms cancel, leaving $2Cov(N_1,N_2)=-2np_1p_2$, i.e. $Cov(N_1,N_2)=-np_1p_2$.
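
The expansion can also be delegated to a computer algebra system; a sketch assuming SymPy:

```python
import sympy as sp

# Solve the variance identity above symbolically for the covariance.
n, p1, p2, cov = sp.symbols("n p1 p2 cov")
p3 = 1 - p1 - p2

identity = sp.Eq(n * p3 * (1 - p3),
                 n * p1 * (1 - p1) + 2 * cov + n * p2 * (1 - p2))
print(sp.solve(identity, cov))  # expected: [-n*p1*p2]
```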