
My statistics notes state that the variance of the empirical distribution is $v= \sum_{i=1}^{n}(x_i-\bar x )^2\frac{1}{n}$, which the author then rewrites as $v= \sum_{i=1}^{n}x_i^2 \left(\frac{1}{n}\right) - \bar x^2$. How is the second form obtained from the first?

3 Answers

3

It's just algebra: expand the square, then use $\frac{1}{n}\sum_{i=1}^n x_i=\bar{x}$ and $\sum_{i=1}^n 1=n$.

$$ \begin{aligned} v&=\frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^2=\frac{1}{n}\sum_{i=1}^n\left(x_i^2-2x_i\bar{x}+\bar{x}^2\right)\\ &=\frac{1}{n}\sum_{i=1}^nx_i^2-2\bar{x}\frac{1}{n}\sum_{i=1}^nx_i+\bar{x}^2\frac{1}{n}\sum_{i=1}^n1\\ &=\frac{1}{n}\sum_{i=1}^nx_i^2-2\bar{x}^2+\bar{x}^2\\ &=\frac{1}{n}\sum_{i=1}^nx_i^2-\bar{x}^2 \end{aligned} $$
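As a quick sanity check of the identity, here is a small worked example. The data set $x = (1, 2, 3, 6)$ is made up for illustration and does not come from the question; its mean is $\bar{x} = 3$.

$$ \begin{aligned} \frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^2&=\frac{(-2)^2+(-1)^2+0^2+3^2}{4}=\frac{14}{4}=3.5,\\ \frac{1}{n}\sum_{i=1}^nx_i^2-\bar{x}^2&=\frac{1+4+9+36}{4}-3^2=12.5-9=3.5. \end{aligned} $$

Both forms give the same value, as the algebra predicts.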

  • Ok, thanks a lot. (2017-01-24)
1

$$\frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^2=\frac{1}{n}\left(\sum_{i=1}^n x_i^2-2\sum_{i=1}^n x_i\cdot \bar{x}+\sum_{i=1}^n \bar{x}^2\right)=\frac{1}{n}\left(\sum_{i=1}^nx_i^2-2\bar{x}\sum_{i=1}^nx_i+\bar{x}^2\sum_{i=1}^n 1\right).$$

Hence, since $\sum_{i=1}^n x_i=n\bar{x}$ and $\sum_{i=1}^n 1=n$,

$$ \frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^2=\frac{1}{n}\sum_{i=1}^n x_i^2-2\bar{x}\cdot\bar{x}+\frac{1}{n}\cdot n \bar{x}^2=\frac{1}{n}\sum_{i=1}^n x_i^2-\bar{x}^2. $$
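The identity derived above can also be checked numerically. The following is a minimal sketch in Python; the function names `variance_centered` and `variance_shortcut` and the sample data are illustrative assumptions, not part of the original notes.

```python
import math


def variance_centered(xs):
    """Variance as (1/n) * sum of squared deviations from the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def variance_shortcut(xs):
    """Variance as (1/n) * sum of squares, minus the squared mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum(x * x for x in xs) / n - mean ** 2


# Hypothetical data, chosen only for illustration.
data = [1.0, 2.0, 3.0, 6.0]

print(variance_centered(data))  # 3.5
print(variance_shortcut(data))  # 3.5
print(math.isclose(variance_centered(data), variance_shortcut(data)))  # True
```

In floating point the two forms generally agree only up to rounding error, which is why the comparison uses `math.isclose`. The shortcut form subtracts two potentially large numbers, so it can lose precision when the mean is large relative to the spread of the data.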

1

A useful idea is to regard the sample mean, variance, moments, quantiles, etc. as the mean, variance, etc. with respect to the empirical measure. Namely, the sample mean $\overline x$ is the expectation under the empirical measure, which assigns probability $1/n$ to each of the sampled values $x_1,x_2,\dots,x_n$:

$$ \overline x = \hat{\mathbb{E}}_n X, $$

where $\hat{\mathbb{P}}_n(X = x_k) = 1/n$, $k=1,\dots,n$. Similarly, the sample variance

$$ \hat\sigma^2 = \frac1n \sum_{k=1}^n (x_k-\overline x)^2 = \hat{\mathbb{E}}_n (X -\hat{\mathbb{E}}_n X)^2 = \hat{\operatorname{var}}_n(X) $$

is the variance of $X$ with respect to $\hat{\mathbb{P}}_n$. Thus, the formula

$$ \hat\sigma^2 = \overline{x^2} - (\overline x)^2 = \hat{\mathbb{E}}_n X^2 - (\hat{\mathbb{E}}_n X)^2 $$

is a particular case of the general formula for variance, $\operatorname{var}(X) = \mathbb{E} X^2 - (\mathbb{E} X)^2$.
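The empirical-measure view can be sketched in code by treating the expectation as a weighted sum with weight $1/n$ on each sampled value. This is a minimal illustration; the helper `empirical_expectation` and the sample are hypothetical, and exact rational arithmetic is used so that the two expressions for the variance can be compared with `==`.

```python
from fractions import Fraction


def empirical_expectation(sample, f=lambda x: x):
    """Expectation of f(X) when each sampled value carries mass 1/n."""
    n = len(sample)
    return sum(Fraction(1, n) * f(x) for x in sample)


# Hypothetical sample, chosen only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

mean = empirical_expectation(sample)
second_moment = empirical_expectation(sample, lambda x: x * x)

# General formula: var(X) = E[X^2] - (E[X])^2, under the empirical measure.
var_general = second_moment - mean ** 2

# Definition: var(X) = E[(X - E[X])^2], under the same measure.
var_direct = empirical_expectation(sample, lambda x: (x - mean) ** 2)

print(mean, second_moment, var_general, var_direct)  # 5 29 4 4
```

Passing a different `f` gives other sample moments under the same measure, which is the point of the answer: the sample formulas are the usual population formulas applied to the empirical distribution.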