4
$\begingroup$

Suppose that we have $n$ independent random variables, $x_1,\ldots,x_n$ such that each $x_i$ takes value $a_i$ with success probability $p_i$ and value $0$ with failure probability $1-p_i$ ,i.e.,

\begin{align} P(x_1=a_1) & = p_1,\ P(x_1=0)= 1-p_1 \\ P(x_2=a_2) & = p_2,\ P(x_2=0) = 1-p_2 \\ & \vdots \\ P(x_n=a_n) & = p_n,\ P(x_n=0)=1-p_n \end{align}

where $a_i$'s are positive Real numbers.

What would be the CDF of the sum of these random variables? That is, what would be $P(x_1+\cdots+x_n\le k)$ ? and how can we find it in an efficient way?

  • 0
    I edited the question and deleted the part regarding Poisson binomial distribution since it was not relevant to my question.2012-12-20

3 Answers 3

4

This will be CDF of discrete distribution with certain number of values $s_1,...,s_r$ where $r$ is number of possible distinct sums among all subsets $\{a_{i_1},...,a_{i_m}\}$. It is not possible to evaluate $r$ in general case because this number solely depends on particular values $a_1,...,a_n$ and can vary from $n+1$ to $2^n$. The minimal $r=n+1$ is represented by case of $a_1=a_2=...=a_n$ while maximal $r=2^n$ is always achieved when ascending reordered set $\{a_{(1)},...,a_{(n)}\}$ complies with the condition: $\forall i: \sum\limits_{1\le j. Example of such maximal case could be $a_i=2^{i-1}$. Note, that failing this condition does not necessarily mean $r<2^n$. For example, for $a_i=ln(prime(i))$ the condition above fails, however all possible sums $a_{i_1}+...+a_{i_m}$ are still distinct, causing $r=2^n$.

We can always consider set $\{s_1,...,s_r\}$ as ordered, then $s_1=0$ and $s_r=\sum\limits_i a_i$. Then $P(s_1)=P(0)=\prod\limits_i (1-p_i)$ as all $x_i=0$; $P(s_r)=P\left(\sum\limits_i a_i\right)=\prod\limits_i p_i$ as all $x_i=a_i$; and $P(s_k)=\sum\limits_{i_1,...,i_m|a_{i_1}+...+a_{i_m}=s_k}\left(\prod\limits_{j\in\{i_1,...,i_m\} } p_j\times\prod\limits_{j\in[1;n]\setminus\{i_1,...,i_m\} } (1-p_j)\right)=\\P(0)\times\sum\limits_{i_1,...,i_m|a_{i_1}+...+a_{i_m}=s_k}\left(\prod\limits_{j\in\{i_1,...,i_m\} }\frac{p_j}{1-p_j}\right)$ as we can eliminate the second product of complemented set $[1;n]\setminus\{i_j\}$ by using the fact that $P(0)$ contains all $(1-p_i)$ for the whole set $[1;n]$.

  • 0
    thank you a lot for your clarification.2013-02-07
3

This answer is an attempt at providing an answer to a previous version of the question in which the $x_i$ were independent Bernoulli random variables with parameters $p_i$.

$P\{\sum_{i=1}^n x_i = k\}$ equals the coefficient of $z^k$ in $(1-p_1+p_1z)(1-p_2+p_2z)\cdots(1-p_n+p_nz)$. This can be found by developing the Taylor series for this function. It is not much easier than grinding out the answer by brute force.

0

My solution to this problem is as follows

$P(x_1+⋯+x_n≤k)=∑ ^ n _{m=0}∑ _{A\epsilon T_m } \Pi_{i \epsilon A} p_i \Pi_{i \epsilon A^c } (1-p_i) $

where

$T_m=${{$r_1$,...,$r_m$}| $ a_{r_1}+...+a_{r_m}<=k$ }

and $A^c$ is the complement of $A$.

  • 0
    SE "ate" curly braces where I said "set" and for some unknown reason it refuses any edits after 5 mins. So please use your imagination to put the braces and read $\{T_m\}$ instead of ${T_m}$2013-02-06