
Assume that I have $n$ i.i.d. random variables $X_i \sim \text{Beta}(\alpha, 1)$, all with the same $\alpha$.

Is there anything known about the distribution of the sum $Y=\sum_i X_i$? I've tried to understand the works of Pham-Gia, but couldn't entirely grasp them. He seems to use some modified generalized Beta $B^*$.

An approximation is also fine with me, as long as I can compute some quantiles. I'm mostly interested in quantiles.

  • 1
    Yes, in the title I wrote "i.i.d." explicitly. 2011-11-25

2 Answers


I will assume the $X_i$ are independent in this post. To get a feel for the problem, recall the special case of the uniform distribution, corresponding to $\alpha=1$, i.e. $\mathrm{Beta}(1,1) \stackrel{d}{=} U(0,1)$.

The sum of $n$ i.i.d. uniform random variables was studied by J.O. Irwin and P. Hall, and its distribution is known as the Irwin-Hall distribution, also called the uniform sum distribution.

Already for $n=3$, the density of the sum of three standard uniform variables is approximated quite well by a normal density:

[Figure: density of the sum of three standard uniform variables compared with the normal approximation]

The same approximation will work well for larger values of $n$ in your case as well. To write it out, we need the mean and variance of the sum: $ \mu_n = \mathbb{E}\left(\sum_{k=1}^n X_k\right) = n \frac{\alpha}{\alpha+1}, \qquad \sigma_n^2 = \operatorname{Var}\left(\sum_{k=1}^n X_k\right) = \sum_{k=1}^n \operatorname{Var}(X_k) = \frac{n \alpha}{\alpha+2} \frac{1}{(\alpha+1)^2} $
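
For completeness, the single-variable moments behind these formulas can be checked directly from the $\mathrm{Beta}(\alpha,1)$ density $f(x) = \alpha x^{\alpha-1}$ on $(0,1)$. This is a short worked derivation, with $X$ denoting any one of the $X_k$:

$$
\begin{aligned}
\mathbb{E}(X) &= \int_0^1 \alpha x^{\alpha}\, dx = \frac{\alpha}{\alpha+1}, \qquad
\mathbb{E}(X^2) = \int_0^1 \alpha x^{\alpha+1}\, dx = \frac{\alpha}{\alpha+2}, \\
\operatorname{Var}(X) &= \frac{\alpha}{\alpha+2} - \frac{\alpha^2}{(\alpha+1)^2}
= \frac{\alpha\left[(\alpha+1)^2 - \alpha(\alpha+2)\right]}{(\alpha+2)(\alpha+1)^2}
= \frac{\alpha}{(\alpha+2)(\alpha+1)^2}.
\end{aligned}
$$

Multiplying the mean and the variance by $n$ (the variance is additive because of independence) gives the expressions for the sum.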

Thus the quantile function approximation is: $ Q_n(q) \approx n \frac{\alpha}{\alpha+1} + \frac{1}{\alpha+1} \sqrt{ \frac{n \alpha}{\alpha+2} } Q_{\mathcal{N}(0,1)}(q) $
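
As a minimal sketch, the normal approximation above can be evaluated with Python's standard library. The function names `beta_sum_moments` and `normal_quantile` are hypothetical, chosen for illustration, and the values $n=10$, $\alpha=0.5$ in the usage lines are just an example.

```python
from math import sqrt
from statistics import NormalDist


def beta_sum_moments(n, alpha):
    """Mean and standard deviation of the sum of n i.i.d. Beta(alpha, 1) variables."""
    mean = n * alpha / (alpha + 1)
    var = n * alpha / ((alpha + 2) * (alpha + 1) ** 2)
    return mean, sqrt(var)


def normal_quantile(q, n, alpha):
    """Normal approximation to the q-quantile of the sum, for 0 < q < 1."""
    mean, sd = beta_sum_moments(n, alpha)
    # NormalDist() is the standard normal; inv_cdf is its quantile function.
    return mean + sd * NormalDist().inv_cdf(q)


# Example usage with illustrative parameters.
print(normal_quantile(0.50, 10, 0.5))
print(normal_quantile(0.95, 10, 0.5))
```

Note that for small $n$ and extreme $q$ this approximation can return values outside $(0, n)$, since the normal distribution has unbounded support.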

For $n=2$ the CDF can be worked out exactly, and it can be inverted using numerical algorithms:

[Figure: exact result for $n=2$]
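
One possible way to carry out the numerical inversion for $n=2$ is sketched below. It is not the closed form referred to above; instead it assumes the identity $P(X_1+X_2 \le y) = \mathbb{E}\,F(y - X_2)$, where $F(x)=x^\alpha$ is the $\mathrm{Beta}(\alpha,1)$ CDF and $X_2 = U^{1/\alpha}$ with $U$ uniform on $(0,1)$. The expectation is approximated by a midpoint rule and the CDF is inverted by bisection. All function names and the `steps` and `tol` defaults are hypothetical.

```python
def cdf_beta_a1(x, alpha):
    """CDF of Beta(alpha, 1): x**alpha on [0, 1], clipped to 0 and 1 outside."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x ** alpha


def cdf_sum2(y, alpha, steps=4000):
    """Approximate P(X1 + X2 <= y) as the average of F(y - u**(1/alpha)) over u.

    The average over u in (0, 1) is computed with a midpoint rule.
    """
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) / steps
        total += cdf_beta_a1(y - u ** (1.0 / alpha), alpha)
    return total / steps


def quantile_sum2(q, alpha, tol=1e-6):
    """Invert cdf_sum2 by bisection on the support [0, 2]."""
    lo, hi = 0.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf_sum2(mid, alpha) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Example usage with alpha = 0.5 (illustrative).
print(quantile_sum2(0.5, 0.5))
```

A convenient sanity check: for $\alpha = 0.5$ one has $X_i = U_i^2$, so $P(X_1+X_2\le 1) = P(U_1^2+U_2^2 \le 1) = \pi/4$, the area of a quarter disc.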


Added: The normal approximation can be truncated to the interval $(0,n)$ to improve accuracy. With $\Phi$ denoting the standard normal CDF: $ Q_n(q) \approx \mu_n + \sigma_n Q_{\mathcal{N}(0,1)}\left( (1-q)\, \Phi(-\mu_n/\sigma_n) + q\, \Phi((n-\mu_n)/\sigma_n) \right) $
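
The truncated version can be sketched the same way, again using only the standard library. The function name `truncated_normal_quantile` is hypothetical, and the parameters $n=3$, $\alpha=0.5$ in the usage lines are illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is Phi, inv_cdf is its quantile


def truncated_normal_quantile(q, n, alpha):
    """q-quantile (0 < q < 1) of the normal approximation truncated to (0, n)."""
    mean = n * alpha / (alpha + 1)
    sd = sqrt(n * alpha / (alpha + 2)) / (alpha + 1)
    # Probability mass of the untruncated normal below 0 and below n.
    p_low = STD_NORMAL.cdf(-mean / sd)
    p_high = STD_NORMAL.cdf((n - mean) / sd)
    # Map q into the interval (p_low, p_high), then apply the normal quantile.
    return mean + sd * STD_NORMAL.inv_cdf((1 - q) * p_low + q * p_high)


# Example usage: small n, where truncation matters most.
for q in (0.001, 0.5, 0.999):
    print(q, truncated_normal_quantile(q, 3, 0.5))
```

By construction the result always lies in $(0, n)$, which the untruncated formula does not guarantee for small $n$ and small $q$.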

  • 0
    I need $\alpha$ to be different from $1$, and $n$ to be larger than $2$. I'm particularly interested in $\alpha = 0.5$ (this occurs in my situation for Euclidean distance, a prime example of an $L_p$ norm; $1/p$ is actually my formula for $\alpha$), and in how this distribution develops as I increase $n$. Of course I do expect it to approach a Gaussian distribution for large values of $n$; at the same time I know it will never be _exactly_ Gaussian, as negative values are not possible here. 2011-11-25