5

I've been working on finding the probability of the event that the sum of $n$ independent random variables is less than $s$, when they are uniformly distributed on $[0,1)$.

I've used the law of total probability to derive the formula

$P(S < s) = \frac{1}{n!}\sum_{k=0}^{\lfloor s\rfloor}(-1)^k\binom{n}{k}(s-k)^{n}$

and, by differentiating with respect to $s$, the density

$f_S(s) = \frac{1}{(n-1)!}\sum_{k=0}^{\lfloor s\rfloor}(-1)^k\binom{n}{k}(s-k)^{n-1}$
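For concreteness, here is a minimal Python sketch of these two formulas; the helper names (`uniform_sum_cdf`, `uniform_sum_pdf`) are mine.

```python
# A minimal numeric sketch of the two formulas above.
from math import comb, factorial, floor

def uniform_sum_cdf(s, n):
    """P(S < s) for S the sum of n independent uniform(0,1) variables."""
    if s <= 0:
        return 0.0
    if s >= n:
        return 1.0
    return sum((-1)**k * comb(n, k) * (s - k)**n
               for k in range(floor(s) + 1)) / factorial(n)

def uniform_sum_pdf(s, n):
    """The density f_S(s), i.e. the derivative of the CDF above."""
    if not 0 < s < n:
        return 0.0
    return sum((-1)**k * comb(n, k) * (s - k)**(n - 1)
               for k in range(floor(s) + 1)) / factorial(n - 1)

# Sanity checks for n = 2, where the density is a triangular "tent" on [0, 2]:
print(uniform_sum_pdf(0.5, 2))   # 0.5
print(uniform_sum_cdf(1.0, 2))   # 0.5
```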

Now, I'm not very strong in probability, but I believe you could express the same thing using the normal distribution, or at least approximate it. Is that so? If so, how would you express it, and how would you normally determine the approximation error?

  • 0
    You can't express the same thing as a normal distribution. One way to see this is to note that it's not a smooth function; another is to note that it takes zero values. What you can do is to find a normal distribution with the same mean and variance, which is easy, since the mean and variance are additive, so are just $n$ times the values for the individual distributions. For large $n$, that normal distribution will become a good approximation of the distribution; see http://en.wikipedia.org/wiki/Central_limit_theorem. (2011-04-06)
  • 0
    @joriki: I think the function has derivatives of all orders, doesn't it? With $n$ of them being nonzero. (2011-04-06)
  • 0
    The normal distribution is about summing an infinite number of random variables, right? So as you say, you should be able to approximate my problem using it. That's what I want to do. (2011-04-06)
  • 0
    In the end I'd like to know some more about the distribution I've found, and whether it has a name. (2011-04-06)
  • 0
    No -- if it had derivatives of all orders and only $n$ of them were nonzero, then it would have to be a polynomial -- whereas whenever $\lfloor s \rfloor$ changes (i.e. at the integers), it becomes a different polynomial. (2011-04-06)
  • 0
    I'm not sure what you mean by "you should be able to approximate". I wrote precisely how you can do so. (2011-04-06)
  • 1
    I don't think it has any name other than "the sum of $n$ independent random variables with uniform distribution on $[0,1]$". (2011-04-06)
  • 1
    For $n=2$, it's called a triangular distribution; see http://en.wikipedia.org/wiki/Triangular_distribution -- the graph there also shows you that it's not a smooth function. (2011-04-06)

2 Answers

4

The distribution of the sum of $n$ IID random variables uniform on $[0,1]$ is also known as the Irwin-Hall distribution or the uniform sum distribution.

The normal approximation is to approximate a distribution with a normal distribution which has the same mean $(\frac{n}{2})$ and standard deviation $(\sqrt{\frac{n}{12}})$.
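For example, a quick Python sketch of this approximation, reusing the `uniform_sum_cdf` helper sketched in the question (the comparison points are arbitrary):

```python
# Compare the exact CDF of the sum with its normal approximation,
# which has mean n/2 and standard deviation sqrt(n/12).
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """CDF of N(mean, sd^2), expressed via the error function."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

n = 12  # with n = 12 the standard deviation is exactly 1
for s in (4.0, 6.0, 8.0):
    exact = uniform_sum_cdf(s, n)
    approx = normal_cdf(s, n / 2, sqrt(n / 12))
    print(f"s = {s}: exact = {exact:.6f}, normal = {approx:.6f}")
```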

One way to bound the error in a normal approximation is to use the Berry-Esseen theorem, which gives effective bounds on the error in terms of the third absolute central moment of the distribution.

$$|P(S<\frac{n}{2}+\sigma \sqrt{n} x) - \Phi(x)| \le \frac {C \rho}{\sigma^3\sqrt{n}}$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution, $C$ can be taken to be $1/2$, $\sigma = \sqrt{\frac{1}{12}}$, and $\rho$ is the third absolute central moment of $U(0,1)$, namely $E\left|U - \frac{1}{2}\right|^3 = \frac{1}{32}$.

I think the extremal cases for the Berry-Esseen estimate do not look much like the uniform distribution, so you may be able to get better estimates for the accuracy of the normal approximation.
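As a rough illustration of how loose the bound is here, a sketch comparing it with the observed worst-case error, again reusing `uniform_sum_cdf` from the question (the grid is arbitrary):

```python
# Berry-Esseen bound C*rho/(sigma^3*sqrt(n)) vs. the observed maximum
# error of the normal approximation, scanned over a grid of x values.
from math import erf, sqrt

def std_normal_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

C, sigma, rho = 0.5, sqrt(1 / 12), 1 / 32
for n in (2, 10, 20):  # the alternating sum loses precision for large n
    bound = C * rho / (sigma**3 * sqrt(n))
    grid = [i / 100 - 4 for i in range(801)]  # x from -4 to 4
    worst = max(abs(uniform_sum_cdf(n / 2 + sigma * sqrt(n) * x, n)
                    - std_normal_cdf(x)) for x in grid)
    print(f"n = {n}: bound = {bound:.4f}, observed max error = {worst:.6f}")
```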

  • 0
    Very interesting. I think the formula on Wikipedia to evaluate `fX` is less efficient than mine? (2011-04-07)
7

For an explicit expression for the density function see MathWorld: Uniform Sum Distribution, for example.

EDIT:

The density function $f_n$ of the sum $U_1 + \cdots + U_n$, where $U_i$ are independent uniform$(0,1)$ variables, is given, according to the MathWorld link above, by $$ f_n (x) = \frac{1}{{2(n - 1)!}}\sum\limits_{k = 0}^n {( - 1)^k {n \choose k}(x - k)^{n - 1} {\mathop{\rm sgn}} (x - k)},\;\; 0 < x < n. $$
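A quick Python sketch of this sgn form, for comparison with the floor form in the question (the function names are mine):

```python
# The MathWorld sgn-form density; unlike the floor-form density in the
# question, the sum here runs over all n+1 terms regardless of x.
from math import comb, factorial

def sgn(t):
    return (t > 0) - (t < 0)

def mathworld_pdf(x, n):
    """f_n(x) for 0 < x < n, per the MathWorld formula above."""
    return sum((-1)**k * comb(n, k) * (x - k)**(n - 1) * sgn(x - k)
               for k in range(n + 1)) / (2 * factorial(n - 1))

print(mathworld_pdf(1.0, 2))  # 1.0, the peak of the triangular density
```

This also makes concrete the efficiency point raised in the comments below: the floor form needs only $\lfloor x \rfloor + 1$ terms, while the sgn form always needs $n+1$.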

Normal approximation (general setting). Suppose that $X_1,X_2,\ldots$ is a sequence of independent and identically distributed random variables with mean $\mu$ and variance $\sigma^2$. Define $S_n = \sum\nolimits_{i = 1}^n {X_i }$. Then, by the central limit theorem, $$ \frac{{S_n - n\mu }}{{\sigma \sqrt n }} \to {\rm N}(0,1), $$ as $n \to \infty$. This means that, for any $x \in \mathbb{R}$, $$ {\rm P}\bigg(\frac{{S_n - n\mu }}{{\sigma \sqrt n }} \le x \bigg) \to \Phi (x), $$ where $\Phi$ denotes the distribution function of the ${\rm N}(0,1)$ distribution. Thus, $$ {\rm P}(S_n \le n\mu + \sigma \sqrt n x) \to \Phi (x). $$ Note that the first convergence above gives rise to the approximation $$ S_n \approx n \mu + \sigma \sqrt{n} Z, $$ where $Z \sim {\rm N}(0,1)$.
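To illustrate the last approximation numerically, a small Monte Carlo sketch (the choices $n = 30$ and $10^5$ trials are arbitrary):

```python
# Monte Carlo check of S_n ≈ n*mu + sigma*sqrt(n)*Z for uniform(0,1)
# summands, where mu = 1/2 and sigma^2 = 1/12.
import random
from math import sqrt

n, trials = 30, 100_000
mu, sigma = 0.5, sqrt(1 / 12)
standardized = [(sum(random.random() for _ in range(n)) - n * mu)
                / (sigma * sqrt(n)) for _ in range(trials)]

# The standardized sums should be roughly standard normal:
m = sum(standardized) / trials
v = sum((z - m)**2 for z in standardized) / trials
print(f"sample mean ≈ {m:.3f} (target 0), sample variance ≈ {v:.3f} (target 1)")
```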

Back to your specific question, you just need to substitute the expectation and variance of a uniform$(0,1)$ random variable. In the context of normal approximation, consider also the four plots given in the MathWorld link above.

  • 0
    Thank you for the link and the general usage of the normal distribution. (2011-04-07)
  • 0
    Are there any advantages of the MathWorld pdf compared to mine? It seems slightly more complex and a bit less efficient. (2011-04-07)
  • 0
    @Thomas Ahle: Indeed, your formula looks more elegant. (2011-04-07)