7
$\begingroup$

I have distilled an error analysis problem into the following:

I have a multinomial distribution, $X$, consisting of $n$ independent trials where each trial takes on the values $\{0,1,\ldots,k-1\}$ with uniform probability of $\frac{1}{k}$.

Now I am interested in finding the distribution of the sum of all the outcomes in the $n$ trials, but not sure how to approach the problem. I suspect it has something to do with partition functions. Would be glad if the relevant probability distribution function in MATLAB could also be pointed out.

2 Answers 2

5

Since you mentioned in a comment that $n=4$ in your case, here's a way to derive the distribution for small values of $n$.

The number of ways of writing $m$ as a sum of $n$ values from $0$ to $k-1$ is the coefficient of $x^m$ in

$ (1+x+\dotso+x^{k-1})^n=\left(\frac{1-x^k}{1-x}\right)^n=(1+x+x^2+\dotso)^n\sum_{j=0}^n\binom nj(-x^k)^j\;. $

Thus you just have to place the binomial coefficients in a sequence at distances of $k$ and then sum the sequence $n$ times; e.g. for $n=4$ and $k=3$:

$ \begin{array}{rrrrrrrrr} 1&0&0&-4&0&0&6&0&0&-4&0&0&1\\ 1&1&1&-3&-3&-3&3&3&3&-1&-1&-1\\ 1&2&3&0&-3&-6&-3&0&3&2&1\\ 1&3&6&6&3&-3&-6&-6&-3&-1\\ 1&4&10&16&19&16&10&4&1 \end{array} $

Then dividing through by the total number $k^n=81$ of possibilities gives you the probabilities for the values of the sum.

  • 0
    @RamonCrehuet: I don't see a better solution than what has already been written in response to the question you linked to.2016-02-09
4

If you have $n\geq 30$, you can apply the central limit theorem.

Assume $u_1,u_2,\cdots,u_n\sim {\cal{U}}[0,k-1]$ i.i.d., then we know for each $u_k$ its mean and variance are $\mu={(k-1)\over 2} {\rm\,and\,} \sigma^2 = {k^2-1\over 12}.$ Now construct the summation random variable $X=\sum_{k=1}^n u_k$ Then you can find the corresponding mean and variance of r.v. $X$ as $\mu_X = n\mu {\rm\,and\,} \sigma^2_X = n\sigma^2$ So if $n$ is sufficiently large, then the distribution of $X$ can be considered as a Gaussian, i.e. $X\sim {\cal N}(n\mu,n\sigma^2)$ Based on this information, I think it is easy for your to use Matlab to compute the CDF of $X$. The last step is to get the discrete PMF from this continuous CDF, which can be thought of the following form $Pr(X=m) = CDF(m+.5)-CDF(m-.5)$

  • 0
    Unfortunately $n=4$ for my error analysis. Nevertheless the above-mentioned result is noteworthy.2012-10-31