6

Let $X_1, X_2,\dotsc$ be independent and identically distributed with mean $E[X]$ and variance $\operatorname{Var}[X]$. Let $N$ be a non-negative integer-valued random variable independent of the $X_i$'s. Show that $$ \operatorname{Var}\left[ \sum_{i=1}^N X_i \right] = E[N]\operatorname{Var}[X]+(E[X])^2\operatorname{Var}[N]. $$ I've tried expanding this in a number of different ways, but I can't quite seem to get it to work out. I don't really understand how to condition on a random variable like this. Any help would be greatly appreciated.

  • 2
Hint: use $E(E(Z \mid N)) = E(Z)$ with $Z = \sum X_i$. Hint 2: you could start with the special case $E(X)=0$ (a short sketch of that case follows these comments), then attack the full problem. (2011-10-03)
  • 0
Thanks. I knew that formula had to come in somewhere, but I was having a hard time setting it up. (2011-10-03)
  • 0
Also, you need $\mathbb{E}|X|<\infty$ in order for the variance to be well-defined on $\mathbb{R}^+\cup\{0,\infty\}$. (2015-11-12)
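Sketch of the special case $E(X)=0$ suggested in the hint: since then $E\left[\sum_{i=1}^N X_i\right]=E[N]\,E[X]=0$, the variance is just a second moment, and conditioning on $N$ gives
$$ \operatorname{Var}\left[\sum_{i=1}^N X_i\right] = E\left[\left(\sum_{i=1}^N X_i\right)^{2}\right] = E\left[\,E\left[\left(\sum_{i=1}^N X_i\right)^{2}\,\middle|\,N\right]\right] = E\left[N\operatorname{Var}[X]\right] = E[N]\operatorname{Var}[X], $$
which agrees with the claimed identity, because the term $(E[X])^2\operatorname{Var}[N]$ vanishes when $E(X)=0$.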

4 Answers

8

The law of total variance says $$ \operatorname{var}(Y) = \operatorname{E}(\operatorname{var}(Y\mid X)) + \operatorname{var}(\operatorname{E}(Y\mid X)). $$ So $$ \begin{align} \operatorname{var}\left(\sum_{i=1}^N X_i\right) & = \operatorname{E} \left(\operatorname{var}\left(\sum_{i=1}^N X_i \mid N\right)\right) + \operatorname{var}\left(\operatorname{E} \left(\sum_{i=1}^N X_i \mid N\right)\right) \\ & = \operatorname{E}(N\operatorname{var}(X)) + \operatorname{var}(N\operatorname{E}(X)) \\ & = \operatorname{var}(X)\operatorname{E}(N) + (\operatorname{E}(X))^2 \operatorname{var}(N). \end{align} $$
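For anyone who wants an empirical sanity check of this identity, here is a minimal Monte Carlo sketch; the Poisson $N$, exponential $X_i$, sample size, and seed are arbitrary illustrative choices, and `numpy` is assumed to be available.

```python
import numpy as np

# Monte Carlo check of var(sum_{i=1}^N X_i) = E[N] var(X) + E[X]^2 var(N),
# with N ~ Poisson(3) and X ~ Exponential(scale=2) chosen only for illustration.
rng = np.random.default_rng(0)
n_sim = 200_000

N = rng.poisson(3.0, size=n_sim)
S = np.array([rng.exponential(2.0, size=n).sum() for n in N])  # one random sum per trial

EX, VarX = 2.0, 4.0   # exact moments of Exponential(scale=2)
EN, VarN = 3.0, 3.0   # exact moments of Poisson(3)

print(S.var())                    # empirical variance of the random sum
print(EN * VarX + EX**2 * VarN)   # 3*4 + 4*3 = 24, the right-hand side
```

With these choices the right-hand side is exactly $24$, and the empirical variance typically lands within a percent or so of that.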

  • 0
Heck. I submitted my answer and then saw that just about everything I wrote is covered in your answer and your reference. (2011-10-03)
  • 0
At least I learned that the ["bayesian decomposition"](http://www.whim.org/nebula/math/bayesdecomp.html) that I proved for an article on sci.math is called the law of total variance. (2011-10-03)
  • 1
A catchy way of remembering the law of total variance is that the "total variance equals the mean of the conditional variance plus the variance of the conditional mean." (2011-10-03)
  • 0
@Michael: I feel a bit better about my answer since I just saw that you suggested to Sasha that someone should post a proof of the law of total variance. (2011-10-03)
  • 0
@Dilip: along with "the variance equals the mean of the squares minus the square of the mean." (2011-10-03)
5

Performing iterated integration yields $$ \begin{align} \operatorname{E}[X] &=\operatorname{E}_Y[\operatorname{E}_X[X\mid Y]]\tag{1} \end{align} $$ Applying $(1)$ to $X^2$ and using the fact that $\operatorname{Var}[X]=\operatorname{E}\left[X^2\right]-\operatorname{E}[X]^2$, we get $$ \begin{align} \operatorname{E}\left[X^2\right] &=\operatorname{E}_Y\left[\operatorname{E}_X\left[X^2\mid Y\right]\right]\\ &=\operatorname{E}_Y\left[\operatorname{Var}_X[X\mid Y]\right] +\operatorname{E}_Y\left[\operatorname{E}_X[X\mid Y]^2\right]\tag{2} \end{align} $$ Applying $(1)$ and $(2)$, we get $$ \begin{align} \operatorname{Var}[X] &=\operatorname{E}\left[X^2\right]-\operatorname{E}[X]^2\\ &=\operatorname{E}_Y\left[\operatorname{Var}_X[X\mid Y]\right] + \operatorname{E}_Y\left[\operatorname{E}_X[X\mid Y]^2\right]-\operatorname{E}_Y[\operatorname{E}_X[X\mid Y]]^2\\ &=\operatorname{E}_Y\left[\operatorname{Var}_X[X\mid Y] \right] +\operatorname{Var}_Y[\operatorname{E}_X[X\mid Y]]\tag{3} \end{align} $$ Now apply $(3)$ to the problem: $$ \begin{align} \operatorname{Var}\left[\sum_{i=1}^NX_i\right] &=\operatorname{E}_N\left[\operatorname{Var}_X\left[\left.\sum_{i=1}^NX_i\,\right|N\right]\right]+\operatorname{Var}_N\left[\operatorname{E}_X\left[\left.\sum_{i=1}^N X_i\,\right|N\right]\right]\\ &=\operatorname{E}_N[N\operatorname{Var}[X]]+\operatorname{Var}_N[N\operatorname{E}[X]]\\ &=\operatorname{E}[N]\operatorname{Var}[X]+\operatorname{Var}[N]\operatorname{E}[X]^2\tag{4} \end{align} $$
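As a quick sanity check on $(4)$, take $N\sim\operatorname{Poisson}(\lambda)$, so that $\operatorname{E}[N]=\operatorname{Var}[N]=\lambda$; the formula then reduces to the familiar compound Poisson variance
$$\operatorname{Var}\left[\sum_{i=1}^N X_i\right]=\lambda\operatorname{Var}[X]+\lambda\operatorname{E}[X]^2=\lambda\operatorname{E}\left[X^2\right].$$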

  • 0
As Michael Hardy points out, $(3)$ is called the [Law of Total Variance](http://en.wikipedia.org/wiki/Law_of_total_variance). (2011-10-04)
2

Another way to do this: let $Y_i = 1$ if $N \ge i$, and $Y_i = 0$ otherwise. Then your sum is $$S = \sum_{i=1}^N X_i = \sum_{i=1}^\infty Y_i X_i$$ (I won't worry about convergence of infinite sums: if you wish you can use a truncated version of $N$ and then take limits). So $$\text{var}(S) = \sum_{i=1}^\infty \text{var}(Y_i X_i) + 2 \sum_{i=1}^\infty \sum_{j=1}^{i-1} \text{cov}(Y_i X_i, Y_j X_j)$$ Now $\text{var}(Y_i X_i) = E[Y_i^2 X_i^2] - E[Y_i X_i]^2 = E[Y_i] \text{var}(X) + \text{var}(Y_i) E[X]^2$, while for $j < i$ (using $Y_i Y_j = Y_i$ and the independence of $N$ from the $X_i$'s), $\text{cov}(Y_i X_i, Y_j X_j) = E[Y_i Y_j X_i X_j] - E[Y_i X_i] E[Y_j X_j] = E[Y_i] (1 - E[Y_j]) E[X]^2$, so that $$ \text{var}(S) = \sum_{i=1}^\infty E[Y_i] \text{var}(X) + \sum_{i=1}^\infty \text{var}(Y_i) E[X]^2 + 2 \sum_{i=1}^\infty \sum_{j=1}^{i-1} E[Y_i](1 - E[Y_j]) E[X]^2$$ Doing the same calculation with each $X_i$ replaced by $1$ (since $N = \sum_{i=1}^\infty Y_i$), $$ \text{var}(N) = \sum_{i=1}^\infty \text{var}(Y_i) + 2 \sum_{i=1}^\infty \sum_{j=1}^{i-1} E[Y_i](1 - E[Y_j])$$ so that $$ \text{var}(S) = \sum_{i=1}^\infty E[Y_i] \text{var}(X) + \text{var}(N) E[X]^2 = E[N] \text{var}(X) + \text{var}(N) E[X]^2 $$
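Here is a small numerical check of the covariance identity used above, offered only as a sketch; the Poisson $N$, exponential $X$, the particular indices $i=7$, $j=3$, and the availability of `numpy` are all illustrative assumptions.

```python
import numpy as np

# Check cov(Y_i X_i, Y_j X_j) = E[Y_i] (1 - E[Y_j]) E[X]^2 for j < i,
# with N ~ Poisson(5) and X ~ Exponential(scale=2) chosen only for illustration.
rng = np.random.default_rng(1)
n_sim = 500_000
i, j = 7, 3                                   # any pair with j < i

N = rng.poisson(5.0, size=n_sim)
Xi = rng.exponential(2.0, size=n_sim)         # X_i, X_j i.i.d. and independent of N
Xj = rng.exponential(2.0, size=n_sim)
Yi = (N >= i).astype(float)                   # Y_i = 1{N >= i}
Yj = (N >= j).astype(float)

lhs = np.cov(Yi * Xi, Yj * Xj)[0, 1]          # empirical covariance
rhs = Yi.mean() * (1 - Yj.mean()) * 2.0**2    # E[X] = 2 for Exponential(scale=2)
print(lhs, rhs)                               # should agree up to Monte Carlo error
```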

1

Let $Y = \sum_{i=1}^N X_i$. Notice that the characteristic function of $Y$ can be expressed as the composition of the probability generating function of $N$, $g(s)$, with the characteristic function of $X$, $\phi(t)$: $$ \psi(t) = \mathbb{E}( \exp( i Y t) ) = \mathbb{E}\left( \mathbb{E}( \exp( i Y t) \mid N) \right) = \mathbb{E}\left( \phi(t)^N \right) = \sum_{k=0}^\infty \phi(t)^k \mathbb{P}(N=k) = g(\phi(t)). $$ The variance of $Y$ is related to its moments via $\mathrm{Var}(Y) = m_2(Y) - m_1(Y)^2$, where $m_r(Y) = (-i)^r \psi^{(r)}(0)$, so that $\mathrm{Var}(Y) = -\psi^{\prime\prime}(0)+\left( \psi^\prime(0)\right)^2$. Using $\psi = g \circ \phi$:

$$ \psi^{\prime}(0)= g^\prime(1) \times \phi^\prime(0) = i \mathbb{E}(N) \mathbb{E}(X) $$ and

$$ \psi^{\prime\prime}(0)= g^{\prime\prime}(1) \phi^\prime(0)^2 + g^\prime(1) \phi^{\prime\prime}(0) = -\left( \mathbb{E}(X)^2 \cdot \mathbb{E}(N(N-1)) + \mathbb{E}(N) \cdot \mathbb{E}(X^2) \right) $$ Combining these yields the result you seek to establish: $$ \begin{eqnarray} \mathrm{Var}(Y) &=& \mathbb{E}(X)^2 \cdot \left( \mathbb{E}(N^2) -\mathbb{E}(N) \right) + \mathbb{E}(N) \cdot \left( \mathrm{Var}(X) + \mathbb{E}(X)^2 \right) - \mathbb{E}(N)^2 \cdot \mathbb{E}(X)^2 \\ &=& \mathbb{E}(X)^2 \cdot \left( \mathrm{Var}(N) + \mathbb{E}(N)^2 -\mathbb{E}(N) \right) + \mathbb{E}(N) \cdot \left( \mathrm{Var}(X) + \mathbb{E}(X)^2 \right) - \mathbb{E}(N)^2 \cdot \mathbb{E}(X)^2 \\ &=& \mathbb{E}(X)^2 \cdot \mathrm{Var}(N) + \mathbb{E}(N) \cdot \mathrm{Var}(X) \end{eqnarray} $$

Since I used $g^\prime(1) = \mathbb{E}(N)$ and $g^{\prime\prime}(1) = \mathbb{E}(N(N-1))$ I should note that they follow from the definition of the probability generating function $g(s) = \sum_{k=0}^\infty s^k \mathbb{P}(N=k)$. Indeed $g^\prime(1) = \sum_{k=0}^\infty k \mathbb{P}(N=k) = \mathbb{E}(N)$, and $g^{\prime\prime}(1) = \sum_{k=0}^\infty k (k-1) \mathbb{P}(N=k) = \mathbb{E}(N (N-1))$.
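For a concrete sanity check of the $\psi = g\circ\phi$ computation, here is a short symbolic sketch; the choices are illustrative assumptions ($N\sim\operatorname{Poisson}(\lambda)$ with pgf $g(s)=e^{\lambda(s-1)}$, $X$ exponential with rate $\mu$, so $\phi(t)=\mu/(\mu-it)$, $\mathbb{E}(X)=1/\mu$, $\mathrm{Var}(X)=1/\mu^2$), and `sympy` is assumed to be available.

```python
import sympy as sp

t, lam, mu = sp.symbols('t lambda mu', positive=True)

phi = mu / (mu - sp.I * t)              # characteristic function of Exponential(rate mu)
g = lambda s: sp.exp(lam * (s - 1))     # probability generating function of Poisson(lam)
psi = g(phi)                            # psi(t) = g(phi(t)), characteristic function of Y

m1 = (-sp.I) * sp.diff(psi, t).subs(t, 0)        # first moment of Y
m2 = (-sp.I)**2 * sp.diff(psi, t, 2).subs(t, 0)  # second moment of Y
var_Y = sp.simplify(m2 - m1**2)

# E[N] Var(X) + E[X]^2 Var(N) with E[N] = Var(N) = lam, E[X] = 1/mu, Var(X) = 1/mu^2
rhs = lam / mu**2 + lam / mu**2
print(sp.simplify(var_Y - rhs))         # prints 0, confirming the identity for these choices
```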

  • 0
Certainly a more complicated answer than one gets by routine application of the law of total variance. (But maybe one should also post a proof of _that_?) (2011-10-03)
  • 0
@MichaelHardy Yes, the machinery is heavy, and leonboy and you gave much neater solutions (+1). I was not aware of the law of total variance, so I'll add it to my bag of tools. (2011-10-03)