1

I have a list of values of a random variable $x \in \mathbb R$. Is it possible to find the variance $\overline{(x - \overline x)^2}$ without computing the mean $\overline x$ first? That is, can it be done by processing the list only once?

3 Answers

3

You can use the fact that the variance equals $\overline{x^2} - \overline {x}^2$. This takes only one pass (accumulating the mean and the mean of the squares simultaneously), but it is more prone to roundoff error when the variance is small compared with the square of the mean.
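
As a minimal sketch of this one-pass idea (the function name `one_pass_variance` is made up for illustration), the loop below keeps a running count, sum, and sum of squares, and only combines them at the end. It divides by $n$, matching the $\overline{x^2} - \overline{x}^2$ form, and it inherits the roundoff caveat mentioned above.

```python
def one_pass_variance(values):
    """Return mean(x^2) - mean(x)^2, reading the data exactly once."""
    n = 0
    total = 0.0     # running sum of x
    total_sq = 0.0  # running sum of x^2
    for x in values:
        n += 1
        total += x
        total_sq += x * x
    if n == 0:
        raise ValueError("need at least one value")
    mean = total / n
    return total_sq / n - mean * mean


print(one_pass_variance([1.0, 2.0, 3.0, 4.0]))  # 1.25
```

Because the function only iterates, it also works on a generator or a stream that cannot be replayed.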

  • 0
    And should have stated that it is from Wikipedia: http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance At first I thought it would be impossible and that there would be no such algorithm. 2012-01-24
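
If the roundoff issue is a concern, the linked Wikipedia page also describes Welford's online algorithm, which is a different technique from the sum-of-squares formula in the answer. A rough sketch follows, with the function name `welford_variance` chosen here for illustration. It updates a running mean and a running sum of squared deviations, still in a single pass, and returns the sample variance (divisor $n-1$).

```python
def welford_variance(values):
    """Sample variance in one pass using Welford's update."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / n         # updated mean
        m2 += delta * (x - mean)  # old deviation times new deviation
    if n < 2:
        raise ValueError("need at least two values")
    return m2 / (n - 1)


# Values with a large common offset, where the variance is tiny next to the mean.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(welford_variance(data))  # 30.0
```
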
3

How about the sum of squared pairwise differences? Indeed, you can check by direct computation that

$ v_X = \frac{1}{n(n-1)}\sum_{1 \le i < j \le n}(x_i - x_j)^2. $

Disclaimer: This is not meant to be efficient, just another way to represent the variance, without first computing the mean.
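
For anyone who wants to check the pairwise idea numerically, here is a small sketch (the name `pairwise_variance` is invented for this example). It sums $(x_i - x_j)^2$ over unordered pairs $i < j$, divides by $n(n-1)$, and compares the result with `statistics.variance` from the Python standard library, which computes the sample variance with divisor $n-1$. It needs $O(n^2)$ work and has to hold the whole list in memory, so it is a check, not a practical method.

```python
from itertools import combinations
from statistics import variance  # reference sample variance, for comparison only


def pairwise_variance(values):
    """Sample variance from squared differences over unordered pairs."""
    xs = list(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two values")
    total = sum((a - b) ** 2 for a, b in combinations(xs, 2))
    return total / (n * (n - 1))


data = [1.0, 2.0, 3.0, 4.0]
print(pairwise_variance(data), variance(data))
```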

2

The sample variance, written with the mean, is calculated as:

$ v_{X}=\frac{1}{n-1}\sum_{i=1}^{n}(x_{i}-\overline{x})^{2} $

And the sample variance, written without the mean, as:

$ v_{X}=\frac{1}{n-1}\left [ \sum_{i=1}^{n}x_{i}^{2}-\frac{1}{n}\left ( \sum_{i=1}^{n}x_{i} \right ) ^{2}\right ] $
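
To see why the two expressions agree, here is a short derivation sketch. It uses only the definition $\overline{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, that is, $\sum_{i=1}^{n} x_i = n\overline{x}$:

$ \sum_{i=1}^{n}(x_{i}-\overline{x})^{2} = \sum_{i=1}^{n}x_{i}^{2} - 2\overline{x}\sum_{i=1}^{n}x_{i} + n\overline{x}^{2} = \sum_{i=1}^{n}x_{i}^{2} - n\overline{x}^{2} = \sum_{i=1}^{n}x_{i}^{2} - \frac{1}{n}\left ( \sum_{i=1}^{n}x_{i} \right )^{2} $

Dividing both sides by $n-1$ gives $v_X$ in either form.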