3

Possible Duplicate:
Motivation behind standard deviation?

In statistics very often you see something of the sort: $ \textrm{quantity}=\sqrt{\frac {\sum(x-\mu)^2} {N}} $ to measure things like standard deviation ($\mu$ is the mean here).

It seems that simply taking the absolute value of the difference would give a pretty good measure of the same thing: $ \textrm{quantity}=\frac {\sum{\Bigl|x-\mu\Bigr|}} {N} $

How did we end up with those squares?

  • 1
    Be sure to check out http://stats.stackexchange.com/questions/118/why-square-the-difference-instead-of-taking-the-absolute-value-in-standard-devia (2012-12-29)

3 Answers

2

Simply because it is easier to work with analytically.

(One might ask similar questions about why we use least squares instead of least fourth powers.)

Note that when the foundations of statistics and probability were laid, there were no computers (in the modern sense). However, some people now use absolute value approaches in lieu of squares and square roots, because modern computing power finally makes that practical.

1

The mean $\mu$ of a set of values $\{x_1,\ldots,x_n\}$ is the unique value of $y$ which minimizes $\sum_{i=1}^n (x_i-y)^2$ (or equivalently, $\frac{\sum_{i=1}^n (x_i-y)^2}{n}$).

The mean is the value which minimizes the total squared deviation from it, and the standard deviation is the square root of that minimized squared deviation averaged over the $n$ values.
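To see why the minimizer is the mean, here is a short sketch of the calculus, writing $f(y)$ (a name introduced only for this illustration) for the sum of squared deviations:

$$f(y)=\sum_{i=1}^n (x_i-y)^2, \qquad f'(y)=-2\sum_{i=1}^n (x_i-y)=-2\Bigl(\sum_{i=1}^n x_i - ny\Bigr).$$

Setting $f'(y)=0$ gives $y=\frac{1}{n}\sum_{i=1}^n x_i=\mu$, and since $f''(y)=2n>0$ this critical point is the unique minimum. The minimized value then satisfies

$$\sqrt{\frac{f(\mu)}{n}}=\sqrt{\frac{\sum_{i=1}^n (x_i-\mu)^2}{n}},$$

which is the standard deviation formula from the question.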

So the squares are built into the definition of the mean. You could redefine the mean to minimize $\sum_{i=1}^n |x_i-y|$ (the minimizers are then the medians of the data), but one immediate issue is that there is often not a unique solution (for example, with the values $\{1,3,4,5\}$, all $y$-values with $3\le y \le 4$ minimize the sum).
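A quick numerical check of this, as a rough sketch rather than a proof: the snippet below scans a grid of candidate $y$-values for the data $\{1,3,4,5\}$ and compares the two criteria. The helper names `sum_sq` and `sum_abs`, the grid spacing of 0.01, and the tolerance are choices made for this illustration only.

```python
import statistics

def sum_sq(xs, y):
    """Total squared deviation of the data xs from the candidate value y."""
    return sum((x - y) ** 2 for x in xs)

def sum_abs(xs, y):
    """Total absolute deviation of the data xs from the candidate value y."""
    return sum(abs(x - y) for x in xs)

xs = [1, 3, 4, 5]

# Candidate values 0.00, 0.01, ..., 6.00 (an arbitrary grid for illustration).
candidates = [i / 100 for i in range(0, 601)]

# The squared criterion has a single best candidate on the grid.
best_sq = min(candidates, key=lambda y: sum_sq(xs, y))

# The absolute criterion is flat over a whole interval, so collect
# every candidate that attains the minimum (up to rounding error).
abs_values = {y: sum_abs(xs, y) for y in candidates}
min_abs = min(abs_values.values())
abs_minimizers = [y for y, v in abs_values.items() if abs(v - min_abs) < 1e-9]

print("mean of data:           ", statistics.mean(xs))
print("minimizer of squares:   ", best_sq)
print("minimizers of |.| range:", min(abs_minimizers), "to", max(abs_minimizers))
```

On this grid the squared criterion picks out the mean alone, while every candidate between 3 and 4 ties for the absolute criterion.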

1

When you sample $n$ values of a random variable $X$, you get $n$ values $x_1,x_2,\ldots,x_n$. You've just sampled one vector in $\Bbb{R}^n$: $\vec{x}=\langle x_1,x_2,\ldots,x_n\rangle$. The standard formula for standard deviation applies the Euclidean distance metric to this vector and its mid-vector $\langle\overline{x},\overline{x},\ldots,\overline{x}\rangle$, then divides by $\sqrt{n}$ (equivalently, divides the squared distance by $n$ before taking the square root) as a way to account for largeness due simply to large dimension.

What is your favorite distance metric in $\mathbb{R}^n$? If it is the usual Euclidean metric, then the standard formula for standard deviation arises. If it is the taxicab metric, then you could employ the formula that you suggest. But one answer to the question of "why the squares" is that the Euclidean distance metric is generally the natural one.
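As a small illustration of this distance view, the sketch below measures how far a sample vector is from the vector whose coordinates all equal the sample mean, once with the Euclidean metric and once with the taxicab metric. The function names and the sample data are made up for this example, and the normalizations are assumptions stated in the comments: the Euclidean distance is divided by $\sqrt{n}$ (the same as dividing the squared distance by $n$ before the root), and the taxicab distance by $n$.

```python
import math
import statistics

def euclidean_spread(xs):
    """Euclidean distance from xs to its mean vector, divided by sqrt(n)."""
    n = len(xs)
    xbar = sum(xs) / n
    dist = math.sqrt(sum((x - xbar) ** 2 for x in xs))
    return dist / math.sqrt(n)

def taxicab_spread(xs):
    """Taxicab (L1) distance from xs to its mean vector, divided by n."""
    n = len(xs)
    xbar = sum(xs) / n
    dist = sum(abs(x - xbar) for x in xs)
    return dist / n

# Hypothetical sample, chosen only so the numbers come out round.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print("Euclidean spread:  ", euclidean_spread(xs))
print("population std dev:", statistics.pstdev(xs))
print("taxicab spread:    ", taxicab_spread(xs))
```

The Euclidean version agrees with `statistics.pstdev` (the population standard deviation) up to rounding, while the taxicab version is the mean absolute deviation proposed in the question.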