2

Assume there are $n$ independent random variables $X_1,X_2,\ldots,X_n$. I wonder why the sample variance is defined as $S^2=\frac{\sum_{i=1}^{n} (X_i-X)^2}{n-1}$, where $X=\frac{X_1+X_2+\cdots+X_n}{n}$, instead of $S^2=\frac{\sum_{i=1}^{n} (X_i-X)^2}{n}$.

  • 0
    Did you try to compute the expectation of $S^2$ in both cases? (And please add some squares in your formulas.) 2012-01-02

2 Answers

1

You have said that $X_1, X_2, \ldots, X_n$ are independent, but not added, as most people do, that they are also identically distributed (or at the very least, have the same mean and variance). With this added condition, we have $E\left[\sum_{i=1}^n (X_i-X)^2\right] = (n-1)\sigma^2$ where $\sigma^2$ is the common variance of the $n$ random variables. Thus, defining the sample variance as $S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i-X)^2$ has the nice property that $E[S^2] = \sigma^2$. As Didier Piau has already suggested to you in a comment, you should work out the expectation calculation for yourself.
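
As a numerical sanity check (not a substitute for the calculation), here is a minimal Monte Carlo sketch of the claim $E\left[\sum_{i=1}^n (X_i-X)^2\right] = (n-1)\sigma^2$. The normal distribution, the values $n=5$ and $\sigma=2$, the trial count, and the helper name `mean_sum_of_squares` are all illustrative assumptions, not part of the question.

```python
import random

def mean_sum_of_squares(n, trials, sigma=2.0, seed=0):
    """Average of sum_i (X_i - Xbar)^2 over many simulated i.i.d. samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(trials):
        # Assumption: normal data with mean 0; any distribution with
        # finite variance sigma^2 should behave the same way on average.
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        total += sum((x - xbar) ** 2 for x in xs)
    return total / trials

n, sigma = 5, 2.0
avg_ss = mean_sum_of_squares(n, trials=100_000, sigma=sigma)

print("dividing by n-1:", avg_ss / (n - 1))  # estimate of E[S^2] with n-1
print("dividing by n:  ", avg_ss / n)        # estimate with n instead
```

With these settings the first number should come out close to $\sigma^2 = 4$ and the second close to $\frac{n-1}{n}\sigma^2 = 3.2$, which is the downward bias that dividing by $n-1$ removes.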

You might also want to try and work out what happens when the $X_i$ have the same mean $\mu$ but different variances $\sigma_i^2$, and also when they have different means $\mu_i$ and variances $\sigma_i^2$.

  • 0
    No, since $E[X_i]=\mu=E[X]$, $X_i-X$ is a zero-mean random variable, and so $\text{var}(X_i-X)=E[(X_i-X)^2]$ can be computed as \begin{align*}\text{var}(X_i-X)&=\text{var}(X_i)+\text{var}(X)-2\text{cov}(X_i,X)\\&=\sigma^2+\frac{\sigma^2}{n}-2\text{cov}(X_i,X)\\&=\frac{n-1}{n}\sigma^2.\end{align*} I will leave it to you to work out why $\text{cov}(X_i,X)=\frac{\sigma^2}{n}$. Putting it all together, $E\left[\sum_{i=1}^n (X_i-X)^2\right]=\sum_{i=1}^{n} E\left[(X_i-X)^2\right] = (n-1)\sigma^2.$ Note that _your_ assertion is missing a square in the middle sum. 2012-01-03
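
For readers who want to check the covariance step in the comment above, one way to get it is sketched here, assuming the $X_j$ are independent with common variance $\sigma^2$ and using the bilinearity of covariance:
\begin{align*}\text{cov}(X_i,X)&=\text{cov}\left(X_i,\frac{1}{n}\sum_{j=1}^{n}X_j\right)\\&=\frac{1}{n}\sum_{j=1}^{n}\text{cov}(X_i,X_j)\\&=\frac{1}{n}\text{var}(X_i)=\frac{\sigma^2}{n},\end{align*}
since independence gives $\text{cov}(X_i,X_j)=0$ for $j\neq i$, and only the $j=i$ term survives. Substituting, $\sigma^2+\frac{\sigma^2}{n}-\frac{2\sigma^2}{n}=\frac{n-1}{n}\sigma^2$.
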
1

There is a Wikipedia article about this: Bessel's correction. (I wrote some of it myself, but in this case others did most of it.)

Bessel's correction eliminates bias. Eliminating bias is sometimes a very bad idea, as I explained in this paper: "An Illuminating Counterexample", American Mathematical Monthly, volume 110, number 3 (March 2003), pages 234-238.