6
Assume that I have $n$ variables that are each $X_i \sim \text{Beta}(\alpha, 1)$ distributed (with the same $\alpha$, i.i.d.).

Is there anything known about the distribution of the sum $Y=\sum_i X_i$? I've tried to understand the works of Pham-Gia, but couldn't entirely grasp them. He seems to use some modified generalized Beta $B^*$.

An approximation is also fine with me, as long as I can compute some quantiles. I'm mostly interested in quantiles.

  • 0
    Do you assume independent, identically distributed variables $X_i$? 2011-11-25
  • 1
    Yes, in the title I wrote "i.i.d." explicitly. 2011-11-25

2 Answers

3

I will assume the $X_i$ are independent in this post. To get a feel for the problem, recall the special case of the uniform distribution, corresponding to $\alpha=1$, i.e. $\mathrm{Beta}(1,1) \stackrel{d}{=} U(0,1)$.

The sum of $n$ i.i.d. uniform random variables was studied by J.O. Irwin and P. Hall, and the result is known as the Irwin-Hall distribution, also called the uniform sum distribution.

Already for $n=3$, the density of the sum of three standard uniform variables is approximated quite well by a normal density:

[Figure: density of the sum of three standard uniform variables compared with a normal density]

The same approximation will work well for larger values of $n$ in your case as well. To write it out, we need the mean and variance of the sum: $$ \mu_n = \mathbb{E}\left(\sum_{k=1}^n X_k\right) = n \frac{\alpha}{\alpha+1}, \qquad \sigma_n^2 = \operatorname{Var}\left(\sum_{k=1}^n X_k\right) = \sum_{k=1}^n \operatorname{Var}(X_k) = \frac{n \alpha}{\alpha+2} \frac{1}{(\alpha+1)^2} $$

Thus the quantile function approximation is: $$ Q_n(q) \approx n \frac{\alpha}{\alpha+1} + \frac{1}{\alpha+1} \sqrt{ \frac{n \alpha}{\alpha+2} } Q_{\mathcal{N}(0,1)}(q) $$
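The approximation above is easy to evaluate and to check against simulation. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the sampler relies on the fact that $\mathrm{Beta}(\alpha,1)$ has CDF $x^\alpha$ on $[0,1]$, so $U^{1/\alpha}$ with $U$ uniform has the required distribution.

```python
import random
from statistics import NormalDist


def normal_approx_quantile(q, n, alpha):
    """Normal approximation to the q-quantile of a sum of n i.i.d. Beta(alpha, 1)."""
    mu = n * alpha / (alpha + 1)
    sigma = (n * alpha / (alpha + 2)) ** 0.5 / (alpha + 1)
    return mu + sigma * NormalDist().inv_cdf(q)  # requires 0 < q < 1


def simulated_quantile(q, n, alpha, draws=20_000, seed=0):
    """Empirical q-quantile of simulated sums (illustrative helper, not exact)."""
    rng = random.Random(seed)
    # Inverse-CDF sampling: U**(1/alpha) is Beta(alpha, 1) distributed.
    sums = sorted(
        sum(rng.random() ** (1 / alpha) for _ in range(n)) for _ in range(draws)
    )
    return sums[min(int(q * draws), draws - 1)]


if __name__ == "__main__":
    # Compare the two for alpha = 0.5 and n = 10 at a few probability levels.
    for q in (0.05, 0.5, 0.95):
        print(q, normal_approx_quantile(q, 10, 0.5), simulated_quantile(q, 10, 0.5))
```

Running the comparison for several values of $n$ gives a rough picture of how quickly the normal approximation becomes usable for a given $\alpha$.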

For $n=2$, the CDF can be worked out exactly and inverted using numerical algorithms:

[Image: exact result for $n=2$]
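One way to carry out such a numerical inversion for $n=2$ is sketched below. It does not use a closed-form CDF; instead it assumes the representation $P(X_1+X_2 \le y) = \int_0^1 F(y - u^{1/\alpha})\,du$ with $F(t) = \min(\max(t,0),1)^\alpha$, evaluates the integral with the midpoint rule, and inverts by bisection. The function names and the step count are illustrative choices. As a sanity check, for $0 \le y \le 1$ the integral reduces to $y^{2\alpha}\,\Gamma(\alpha+1)^2/\Gamma(2\alpha+1)$.

```python
def cdf_sum_of_two(y, alpha, steps=20_000):
    """P(X1 + X2 <= y) for i.i.d. X1, X2 ~ Beta(alpha, 1), via the midpoint rule."""
    if y <= 0:
        return 0.0
    if y >= 2:
        return 1.0
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) / steps
        # Condition on X1 = u**(1/alpha); then P(X2 <= t) = clip(t, 0, 1)**alpha.
        t = min(max(y - u ** (1 / alpha), 0.0), 1.0)
        total += t ** alpha
    return total / steps


def quantile_sum_of_two(q, alpha, tol=1e-6):
    """Invert the CDF by bisection on the support [0, 2]."""
    lo, hi = 0.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf_sum_of_two(mid, alpha) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for q in (0.05, 0.5, 0.95):
        print(q, quantile_sum_of_two(q, 0.5))
```

Bisection is used here because the CDF is monotone on $[0,2]$, so no derivative information is needed.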


Added: The normal approximation can be truncated to the interval $(0,n)$ to improve accuracy: $$ Q_{Y_n}(q) \approx \mu_n + \sigma_n Q_{\mathcal{N}(0,1)}\big( (1-q) \Phi(-\mu_n/\sigma_n) + q \Phi((n-\mu_n)/\sigma_n) \big) $$ where $\Phi$ is the standard normal CDF.
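The truncated version can be sketched in a few lines with the standard library's `statistics.NormalDist`. The function name is illustrative, and the code assumes $0 < q < 1$.

```python
from statistics import NormalDist


def truncated_normal_quantile(q, n, alpha):
    """Quantile of the normal approximation truncated to (0, n), for 0 < q < 1."""
    mu = n * alpha / (alpha + 1)
    sigma = (n * alpha / (alpha + 2)) ** 0.5 / (alpha + 1)
    std = NormalDist()
    below_zero = std.cdf((0 - mu) / sigma)  # normal mass to the left of 0
    below_n = std.cdf((n - mu) / sigma)     # normal mass to the left of n
    # Map q into the probability range that the interval (0, n) occupies.
    p = (1 - q) * below_zero + q * below_n
    return mu + sigma * std.inv_cdf(p)


if __name__ == "__main__":
    for q in (0.001, 0.5, 0.999):
        print(q, truncated_normal_quantile(q, 3, 0.5))
```

Unlike the plain normal approximation, the returned values always lie inside $(0,n)$, which matters most for small $n$ and extreme $q$.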

  • 0
    "The same approximation will work well for larger values of n in your case as well" But probably not so well if $\alpha$ is very different from 1. I would go this route, though, if $n$ is large. 2011-11-25
  • 0
    I need $\alpha$ to be different from 1, and $n$ to be larger than $2$. I'm particularly interested in $\alpha = .5$ (this occurs in my situation for Euclidean distance, a prime example of an $L_p$ norm; $1/p$ is actually my equation for $\alpha$), and I'm interested in how this distribution develops as I increase $n$. Of course I do expect it to approximate a Gaussian distribution for large values of $n$; at the same time I know it will never be _exactly_ Gaussian, as negative values are not possible here. 2011-11-25