
We are looking at the sample variance of $n$ iid random variables:

$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2$, where $\bar{Y}$ is the sample average.

My lecturer has rearranged this to:

$S^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} Y_i - \frac{1}{n} \sum_{i \ne j} Y_i Y_j \right)$

But I cannot understand how he got that. It's talked about like a trivial manipulation but I can't see it.

2 Answers

1

This is simply not true. Let $n=3$, $Y_1=1$, and $Y_2=Y_3=0.$ $\overline Y$ is supposed to be the average, that is $\overline Y=\frac13.$ Then on the one hand

$$S^2 = \frac{1}{2} \sum_{i=1}^{3} \left(Y_i - \frac13 \right)^2 = \frac12\left(\frac49+\frac19+\frac19\right)=\frac{1}{3}.$$

On the other hand,

$$S^2 = \frac12 \left( \sum_{i=1}^{3}Y_i - \frac{1}{3} \sum_{i \ne j} Y_i Y_j\right) = \frac12\left(1-\frac13 \cdot 0\right)=\frac12.$$
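
The two values can also be checked mechanically. The following is a minimal sketch in Python using exact rational arithmetic; the function names `sample_variance` and `lecturer_expression` are made up for this illustration, and the second one encodes the formula exactly as quoted in the question.

```python
from fractions import Fraction


def sample_variance(ys):
    """Definition: 1/(n-1) * sum of squared deviations from the mean."""
    n = len(ys)
    mean = Fraction(sum(ys), n)
    return sum((y - mean) ** 2 for y in ys) / (n - 1)


def lecturer_expression(ys):
    """The expression as quoted in the question (hypothetical helper)."""
    n = len(ys)
    # Sum of Y_i * Y_j over all ordered pairs with i != j.
    cross = sum(ys[i] * ys[j] for i in range(n) for j in range(n) if i != j)
    return (sum(ys) - Fraction(cross, n)) / (n - 1)


ys = [1, 0, 0]
print(sample_variance(ys))      # 1/3
print(lecturer_expression(ys))  # 1/2
```

Since the two results differ for this sample, the quoted formula cannot hold as an identity.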

0

It is possible to end up with an expression similar to your teacher's. Note that

$$\begin{align*} \sum_{i} \left(Y_i - \bar{Y}\right)^2 &= \sum_{i} \left( Y_i^2 -2Y_i\bar{Y} + \bar{Y}^2 \right) \\ &= \sum_{i} Y_i^2 -2\bar{Y}\sum_{i} Y_i + n\bar{Y}^2 \\ &= \sum_{i} Y_i^2 -2\bar{Y}n \bar{Y} + n\bar{Y}^2 \\ &= \sum_{i} Y_i^2 - n\bar{Y}^2 \end{align*}$$

Moreover,

$$\bar{Y}^2 = \left(\frac{1}{n} \sum_{i} Y_i\right)^2 = \frac{1}{n^2}\left( \sum_i Y_i^2 + \sum_{i,j : i\neq j} Y_i Y_j \right)$$

Therefore,

$$\sum_{i} \left(Y_i - \bar{Y}\right)^2 = \sum_{i} Y_i^2 - \frac{1}{n}\left( \sum_i Y_i^2 + \sum_{i,j : i\neq j} Y_i Y_j \right) = \sum_i \left( \frac{n-1}{n}Y_i^2 - \frac{1}{n}\sum_{j : j\neq i} Y_i Y_j \right)$$

Dividing by $n-1$ then gives

$$S^2 = \sum_i \left( \frac{1}{n}Y_i^2 - \frac{1}{n(n-1)} \sum_{j : j\neq i} Y_i Y_j \right)$$
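
As a sanity check, the cross-term form can be compared with the definition of $S^2$ numerically. This is only a sketch: it assumes the factor $\frac{1}{n}$ from $n\bar{Y}^2$ is carried through to the cross terms, so that each product $Y_i Y_j$ with $i \neq j$ ends up with weight $\frac{1}{n(n-1)}$. The function names and the sample values are arbitrary choices for the illustration.

```python
from fractions import Fraction
from itertools import permutations


def sample_variance(ys):
    """Definition: 1/(n-1) * sum of squared deviations from the mean."""
    n = len(ys)
    mean = Fraction(sum(ys), n)
    return sum((y - mean) ** 2 for y in ys) / (n - 1)


def cross_term_form(ys):
    """(1/n) * sum Y_i^2 - 1/(n(n-1)) * sum over i != j of Y_i * Y_j."""
    n = len(ys)
    squares = sum(y * y for y in ys)
    # permutations(ys, 2) yields every ordered pair of distinct positions.
    cross = sum(a * b for a, b in permutations(ys, 2))
    return Fraction(squares, n) - Fraction(cross, n * (n - 1))


for ys in ([1, 0, 0], [2, 5, 11, 3]):
    print(sample_variance(ys), cross_term_form(ys))
```

For both integer samples the two functions return the same fraction, which is consistent with the derivation but of course not a proof of it.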