Say the $N$ people in Springfield either have 2 corgis (with probability $p_1$), a corgi and a shizu (with probability $p_2$), or 2 shizus (with probability $p_3$). Let $J$ be the number of people with 2 corgis, $K$ be the number of people with a corgi and a shizu, and $L$ be the number of people with 2 shizus. What is the distribution of the number of corgis ($C$) out of a total of $2N$ dogs in Springfield?

$C = 2J + K$. The number of people with 2 corgis is distributed as $\mathrm{Binomial}(N, p_1)$, and the number of people with a corgi and a shizu is distributed as $\mathrm{Binomial}(N, p_2)$. The sum of binomial random variables is also binomial, so I have a hunch that $C$ should be binomial.

However, how do I find the probability associated with $C$?

  • "The sum of binomial random variables is also binomial" is true for constant $p$. You don't have that. (2012-11-04)

1 Answer

There are at least two reasons to guess that $C$ is not binomial. First, since $J+K\leqslant N$ almost surely while $J$ and $K$ both can take every value between $0$ and $N$, $J$ and $K$ are not independent. Second, the result that the sum of two binomials is binomial holds only for sums of independent binomials with the same success probability, not for linear combinations such as $2J+K$.

To compute the distribution of $C$, call $C_n$ the number of corgis owned by individual $n$, then
$$ C=\sum\limits_{n=1}^NC_n. $$
The random variables $(C_n)_{1\leqslant n\leqslant N}$ are independent and $\mathbb P(C_n=i)=p_{3-i}$ for $i$ in $\{0,1,2\}$. The generating function of each $C_n$ is
$$ \mathbb E(s^{C_n})=p_3+p_2s+p_1s^2, $$
hence the distribution of $C$ is characterized by the identity
$$ \mathbb E(s^C)=(p_3+p_2s+p_1s^2)^N. $$

In the simple case when $p_1=p_2=p_3=\frac13$, one gets $\mathbb E(s^{C_n})=\frac13(1+s+s^2)$, hence
$$ \mathbb E(s^C)=\left(\frac{1+s+s^2}3\right)^N. $$
If $N=2$, for example, the distribution of $C$ puts weight $\frac19$, $\frac29$, $\frac39$, $\frac29$ and $\frac19$ on $0$, $1$, $2$, $3$ and $4$ respectively, which is not binomial.
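For readers who want the actual probabilities, the coefficients of $(p_3+p_2s+p_1s^2)^N$ can be extracted numerically by multiplying the polynomial out one owner at a time. The following is only a minimal sketch: the function name `corgi_pmf` and its argument names are illustrative choices, not part of the question, and exact fractions are used merely to make the $N=2$ example easy to check.

```python
from fractions import Fraction


def corgi_pmf(n_people, p1, p2, p3):
    """Return [P(C=0), ..., P(C=2N)] by expanding (p3 + p2*s + p1*s^2)^N.

    Hypothetical helper: p1, p2, p3 are the probabilities of owning
    2 corgis, 1 corgi and 0 corgis respectively.
    """
    single = [p3, p2, p1]  # coefficients of s^0, s^1, s^2 for one owner
    pmf = [Fraction(1)]    # generating function of an empty sum is 1
    for _ in range(n_people):
        # Multiply the current polynomial by the single-owner polynomial.
        new = [Fraction(0)] * (len(pmf) + 2)
        for i, a in enumerate(pmf):
            for j, b in enumerate(single):
                new[i + j] += a * b
        pmf = new
    return pmf


third = Fraction(1, 3)
pmf = corgi_pmf(2, third, third, third)
print(pmf)  # weights 1/9, 2/9, 3/9, 2/9, 1/9 on 0, 1, 2, 3, 4
```

The entry at index $c$ of the returned list is $\mathbb P(C=c)$, so the list always has length $2N+1$.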

On the other hand, expectations and variances are easy since $\mathbb E(C)=N\mu$ where $\mu=\mathbb E(C_1)$, hence $\mu=p_2+2p_1$, and, by independence, $\mathrm{var}(C)=N\sigma^2$ where $\sigma^2=\mathrm{var}(C_1)$, hence $\sigma^2=p_2+4p_1-\mu^2$.
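As a sanity check, the formulas $\mathbb E(C)=N\mu$ and $\mathrm{var}(C)=N\sigma^2$ can be compared with a brute-force enumeration over all ownership patterns for a small $N$. This is a sketch under assumed values: the function names and the choice $p_1=\frac12$, $p_2=\frac13$, $p_3=\frac16$, $N=4$ are arbitrary and serve only as an illustration.

```python
from fractions import Fraction
from itertools import product


def moments_by_formula(n_people, p1, p2, p3):
    """Mean and variance of C from the closed-form expressions."""
    mu = p2 + 2 * p1                 # E(C_1)
    sigma2 = p2 + 4 * p1 - mu ** 2   # var(C_1) = E(C_1^2) - mu^2
    return n_people * mu, n_people * sigma2


def moments_by_enumeration(n_people, p1, p2, p3):
    """Mean and variance of C by summing over all 3^N ownership patterns."""
    prob = {2: p1, 1: p2, 0: p3}     # P(C_n = i) for i = 2, 1, 0
    mean = Fraction(0)
    second = Fraction(0)
    for counts in product((0, 1, 2), repeat=n_people):
        weight = Fraction(1)
        for c in counts:
            weight *= prob[c]        # independence across owners
        total = sum(counts)
        mean += weight * total
        second += weight * total ** 2
    return mean, second - mean ** 2


# Arbitrary illustrative parameters (assumed, not from the question).
p1, p2, p3 = Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)
formula = moments_by_formula(4, p1, p2, p3)
brute = moments_by_enumeration(4, p1, p2, p3)
print(formula, brute)
```

The enumeration costs $3^N$ terms, so it is only practical for small $N$; the closed-form expressions are what one would use in general.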