3
$\begingroup$

We've learned this algorithm in class but I'm not sure I've fully understood the correctness of this approach.

It is known as Welford's algorithm for the sum of squares, as described in: Welford, B.P., "Note on a Method for Calculating Corrected Sums of Squares and Products", Technometrics, Vol. 4, No. 3 (Aug., 1962), pp. 419-420

The two-pass algorithm to estimate the sample deviation is simple. We first estimate the mean by one pass and then calculate the standard deviation. In short, it is $s_n^2 = \frac{1}{n-1} \sum_{i=1}^n (Y_i - \hat{Y})^2$.

The one-pass approach has three steps.

1) $v_1 = 0$

2) $v_k = v_{k-1} + \frac{1}{k(k-1)} (\sum_{i=1}^{k-1} Y_i - (k-1) Y_k)^2 (2 \leq k \leq n)$

3) $s_n^2 = \frac{v_n}{n-1}$.

Can someone help me understand how this approach works?

  • 2
    This isn't quite "one-pass" because step (2) contains a sum from $i=1$ to $k-1$. Better to say something like $m_1 = Y_1$, $m_k = m_{k-1} + Y_k$, $v_k = v_{k-1} + \frac{1}{k(k-1)} (m_{k-1} - (k-1)Y_k)^2$2012-02-29

1 Answers 1