Efficient computation of $E\left[\left(1+X_1+\cdots+X_n\right)^{-1}\right]$ with $(X_i)$ independent Bernoulli with varying parameters

Suppose we have independent random variables $X_1, \ldots, X_n$ with Bernoulli distributions and (possibly different) parameters $p_1, \ldots, p_n$; that is, $X_i = 1$ with probability $p_i$ and $X_i = 0$ with probability $1-p_i$. Is there a way to compute $$E\left[\frac{1}{1+\sum_iX_i}\right]$$ in time polynomial in $n$? If not, is there a good approximate solution?
3 Answers
An approach is through generating functions. For every nonnegative random variable $S$, $$ E\left(\frac1{1+S}\right)=\int_0^1E(t^S)\,\mathrm{d}t. $$ If $S=X_1+\cdots+X_n$ and the random variables $X_i$ are independent, $E(t^S)$ is the product of the $E(t^{X_i})$. If furthermore $X_i$ is Bernoulli $p_i$, $$ E\left(\frac1{1+S}\right)=\int_0^1\prod_{i=1}^n(1-p_i+p_it)\,\mathrm{d}t. $$ This is an exact formula. I do not know how best to use it to compute the LHS efficiently. Of course one can expand the integrand on the RHS, getting a sum of $2^n$ terms indexed by the subsets $I$ of $\{1,2,\ldots,n\}$: $$ E\left(\frac1{1+S}\right)=\sum_I\frac1{|I|+1}\prod_{i\in I}p_i\cdot\prod_{j\notin I}(1-p_j). $$ But it might be more useful to notice that, since $1-p_i+p_it=1-p_i(1-t)$, $$ \prod_{i=1}^n(1-p_i+p_it)=\sum_{k=0}^n(-1)^k\sigma_k(\mathbf{p})(1-t)^k, $$ where $\sigma_0(\mathbf{p})=1$ and $(\sigma_k(\mathbf{p}))_{1\le k\le n}$ are the elementary symmetric polynomials of the family $\mathbf{p}=\{p_i\}_i$. Integrating with respect to $t$, one gets $$ E\left(\frac1{1+S}\right)=\sum_{k=0}^n(-1)^k\frac{\sigma_k(\mathbf{p})}{k+1}. $$ The computational burden is thus reduced to the determination of the sequence $(\sigma_k(\mathbf{p}))_{1\le k\le n}$.
Note 1. The last formula is an integrated version of the algebraic identity stating that, for every family $\mathbf{x}=\{x_i\}_i$ of zeroes and ones, $$ \frac1{1+\sigma_1(\mathbf{x})}=\sum_{k\ge0}(-1)^k\frac{\sigma_k(\mathbf{x})}{k+1}, $$ which truncates at $k=n$ since, when at most $n$ of the $x_i$ are nonzero, $\sigma_k(\mathbf{x})=0$ for every $k\ge n+1$. To prove the algebraic identity, note that, for every $k\ge0$, $$ \sigma_1(\mathbf{x})\sigma_k(\mathbf{x})=k\sigma_k(\mathbf{x})+(k+1)\sigma_{k+1}(\mathbf{x}), $$ and compute the product of $1+\sigma_1(\mathbf{x})$ with the series on the RHS. To apply this identity to our setting, introduce $\mathbf{X}=\{X_i\}_i$ and note that, by independence, for every $k\ge0$, $$ E(\sigma_k(\mathbf{X}))=\sigma_k(\mathbf{p}). $$

Note 2. More generally, for every suitable complex number $z$, $$ \frac1{z+\sigma_1(\mathbf{x})}=\sum_{k\ge0}(-1)^k\frac{\Gamma(k+1)\Gamma(z)}{\Gamma(k+1+z)}\sigma_k(\mathbf{x}). $$
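For concreteness, here is a minimal C sketch (an illustration, not code from this thread; the name expect_inverse is made up) that builds the $\sigma_k(\mathbf{p})$ by the standard $O(n^2)$ one-parameter-at-a-time recurrence and then forms the alternating sum:

    #include <stdio.h>
    #include <stdlib.h>

    /* E[1/(1+S)] = sum_{k=0}^{n} (-1)^k sigma_k(p) / (k+1), where sigma_k
       are the elementary symmetric polynomials of p[0..n-1]. After folding
       in p[i], sigma[k] holds sigma_k of {p[0], ..., p[i]}; k runs
       downwards so that sigma[k-1] is still the previous value. */
    double expect_inverse(const double *p, int n) {
        double *sigma = calloc(n + 1, sizeof *sigma);
        sigma[0] = 1.0;
        for (int i = 0; i < n; i++)
            for (int k = i + 1; k >= 1; k--)
                sigma[k] += p[i] * sigma[k - 1];
        double e = 0.0;
        for (int k = 0; k <= n; k++)
            e += (k % 2 ? -1.0 : 1.0) * sigma[k] / (k + 1);
        free(sigma);
        return e;
    }

    int main(void) {
        double p[] = {0.1, 0.5, 0.9};
        printf("%.10f\n", expect_inverse(p, 3));
        return 0;
    }

One caveat: $\sigma_k(\mathbf{p})$ can grow as large as $\binom{n}{k}$, so for large $n$ the alternating sum may lose precision to cancellation in floating point; the pdf recursion in the last answer below avoids this, since all of its intermediate quantities are probabilities in $[0,1]$.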
- I have the feeling the OP can't do better than just multiplying out that polynomial integrand and integrating termwise... (barring "special" values of $p_i$, of course). – 2011-05-12
- Naively, the polynomial can be evaluated in $O(n^2)$ elementary operations (perhaps this can be done in $O(n \log^2(n))$ or even $O(n \log(n))$?); if $p_i = \frac{a_i}{b_i}$ with $a_i$ and $b_i$ relatively prime, and $d$ is the maximum number of bits needed to represent $a_i$ or $b_i$, this gives a total cost of $O(n^2 d \log(d))$ ($O(n \log^{\alpha}(n) d \log(d))$?). @Didier, is this a standard result of generating functions? – 2011-05-12
- @user4143: Yes, this is simply the integrated version of the expression of $1/(s+1)$ as the integral from $0$ to $1$ of $t^s$. – 2011-05-12
- Thanks, so it looks like the computation can be done in time polynomial in $n$ after all. – 2011-05-12
- @user4143: Actually, how do you compute the polynomial in $O(n^2)$ time? For instance, if $n = 3$, then we need to compute $\frac{p_1p_2p_3}{4} + \frac{(1-p_1)p_2p_3 + p_1(1-p_2)p_3 + p_1p_2(1-p_3)}{3} + \frac{(1-p_1)(1-p_2)p_3 + (1-p_1)p_2(1-p_3) + p_1(1-p_2)(1-p_3)}{2} + (1-p_1)(1-p_2)(1-p_3)$, which seems to require $O(2^n)$ elementary operations. – 2011-05-13
- Ah, I see your point: we can use dynamic programming and generate the coefficients (similar to generating Pascal's triangle for binomial coefficients). In my example, I would start from $(p_1,\; 1-p_1)$, then generate $(p_1p_2,\; (1-p_1)p_2 + p_1(1-p_2),\; (1-p_1)(1-p_2))$ from $(p_1,\; 1-p_1)$ in a linear fashion, and so on. – 2011-05-13
- @Steven: See the expanded version. Now one simply computes the elementary symmetric polynomials of the parameters $p_i$. – 2011-06-07
Regarding approximations for large $N$ (say, $N>10$), though perhaps this is not what you are looking for... A general approach for approximating the expectation of a function $y=g(x)$ of a random variable $x$ is to make a Taylor expansion around the mean of $x$ and take expectations:

$ \displaystyle y \approx g(\mu_x) + g'(\mu_x) (x-\mu_x) + \frac{1}{2}g''(\mu_x) (x-\mu_x)^2 +\cdots \Rightarrow$

$ \displaystyle \mu_y \approx g(\mu_x) + \frac{1}{2!}g''(\mu_x) \, m_{2,x} + \frac{1}{3!} g'''(\mu_x) \, m_{3,x} + \cdots $

where $m_{k,x}$ is the $k$-th central moment of $x$ (the first-order term vanishes because $E(x-\mu_x)=0$).

In our case, the second-order approximation gives

$ \displaystyle E(y)= E \left[ \frac{1}{1+\sum_i x_i} \right] \approx y_0 + y_0^3 \sum_i \sigma^2_i,$

where $ \displaystyle y_0 = \frac{1}{1 + \sum_i E(x_i)} = \frac{1}{1 + \sum_i p_i}$

(the first-order approximation) and

$\sigma^2_i = p_i (1-p_i).$

Experimentally, I get a relative error that rarely exceeds 0.01 with random (uniform) $p_i$ and $N=15$; with $N=30$, it is about 0.001.
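A tiny C sketch of this approximation (my illustration; the function name is made up):

    #include <stdio.h>

    /* Second-order approximation of E[1/(1+X_1+...+X_n)]:
       with g(s) = 1/(1+s) and g''(s) = 2/(1+s)^3, the correction term
       (1/2) g''(mu) Var(S) equals y0^3 * sum_i p_i (1 - p_i). */
    double approx_expect_inverse(const double *p, int n) {
        double mean = 0.0, var = 0.0;
        for (int i = 0; i < n; i++) {
            mean += p[i];
            var  += p[i] * (1.0 - p[i]);   /* Bernoulli variance */
        }
        double y0 = 1.0 / (1.0 + mean);    /* first-order approximation */
        return y0 + y0 * y0 * y0 * var;
    }

    int main(void) {
        double p[] = {0.1, 0.5, 0.9};
        printf("%.10f\n", approx_expect_inverse(p, 3));
        return 0;
    }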
- I am actually very interested in large $n$ values. I wish I could choose this post as an answer as well. – 2011-05-12
If you have numerical values for the $p_i$ and want a numerical value for the expectation, a simple $O(n^2)$ approach is to compute the pdf of the sum. If $S_k$ is the sum of the first $k$ of the $X$'s, then $S_k$ takes values in $\{0,\ldots,k\}$; if its pdf is held in the array c, then the pdf of $S_{k+1}$ can be computed in place like this (where p is the parameter of $X_{k+1}$):
    c[k+1] = p*c[k]
    for j = k .. 1
        c[j] = p*c[j-1] + (1-p)*c[j]
    c[0] = (1-p)*c[0]
A wee C program based on this takes around 3.25 seconds (on an ordinary PC) to compute the expectation for $n = 30000$.
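A minimal reconstruction of such a program (assumed, since the original code is not shown): it folds in one $X_k$ at a time with the recurrence above, then averages $1/(1+s)$ against the resulting pdf.

    #include <stdio.h>
    #include <stdlib.h>

    /* Builds the pdf of S = X_1 + ... + X_n in O(n^2) time and O(n) space,
       then returns E[1/(1+S)] = sum_{s=0}^{n} c[s] / (1+s). */
    double expect_inverse_pdf(const double *p, int n) {
        double *c = calloc(n + 1, sizeof *c);
        c[0] = 1.0;                           /* P(S_0 = 0) = 1 */
        for (int k = 0; k < n; k++) {         /* fold in X_{k+1}, parameter p[k] */
            c[k + 1] = p[k] * c[k];
            for (int j = k; j >= 1; j--)
                c[j] = p[k] * c[j - 1] + (1.0 - p[k]) * c[j];
            c[0] = (1.0 - p[k]) * c[0];
        }
        double e = 0.0;
        for (int s = 0; s <= n; s++)
            e += c[s] / (1.0 + s);
        free(c);
        return e;
    }

    int main(void) {
        enum { N = 30000 };
        double *p = malloc(N * sizeof *p);
        for (int i = 0; i < N; i++)
            p[i] = (double)rand() / RAND_MAX;  /* arbitrary test parameters */
        printf("%.10f\n", expect_inverse_pdf(p, N));
        free(p);
        return 0;
    }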