1
$\begingroup$

There are $K$ items indexed $X_1, X_2, \ldots, X_K$ in the pool. Person A first randomly take $K_A$ out of these $K$ items and put them back to the pool. Person B then randomly take $K_B$ out of these $K$ items. What is the expectation of items that was picked by B but not taken by A before?

Assuming $K_A \geq K_B$, the formula I get is,

\begin{equation} E = \sum_{i=1}^{K_B} i \frac{{{K}\choose{K_A}}{{K_A}\choose{K_B - i}}{{K - K_A}\choose{i}}}{{{K}\choose{K_A}}{{K}\choose{K_B}}} \end{equation}

When $K_B > K_A$, I can derive similar formulas. I am wondering if there is a way to simplify this formula? Thanks.

2 Answers 2

3

I would prefer to index the objects by the numbers $1,2,\dots,K$.

Define the indicator random variable $W_i$ by $W_i=1$ if item $i$ was taken by $B$ but not taken by $A$ before, and by $W_i=0$ otherwise. Let $p=\Pr(W_i=1)$. Then $p=\frac{K_B}{K}\left(1-\frac{K_A}{K}\right).$ This is because the events "item labelled $i$ was taken by $B$" and "item $i$ was taken by $A$" are independent. Let random variable $Y$ be the number of objects taken by $B$ that had not been taken by $A$. Then $Y=W_1+W_2+\cdots+W_K$, and therefore by the linearity of expectation $E(Y)=E(W_1+W_2+\cdots+W_K)=E(W_1)+E(W_2)+\cdots+E(W_K)=Kp.$ Note that the $W_i$ are in general not independent. But the linearity of expectation does not require independence.

Remark: There is undoubtedly a way to obtain the expectation by manipulating a suitable expression that involves binomial coefficients. It is, however, very much the inefficient way to compute the expectation.

  • 0
    Thanks for this simpler solution. This is very helpful for me.2012-10-01
1

André's solution is the best one, of course.

But for the sheer fun of it, let's calculate the sum \begin{equation} E = \sum_{i=1}^{K_B} i \frac{{{K}\choose{K_A}}{{K_A}\choose{K_B - i}}{{K - K_A}\choose{i}}}{{{K}\choose{K_A}}{{K}\choose{K_B}}} \end{equation} First, cancel the common factor $E = \sum_{i} i \frac{{K_A\choose K_B - i}{{K - K_A}\choose{i}}}{{{K}\choose{K_B}}}.$

The absorption identity (4) lets us get rid of the factor "$i$"
$E = \sum_{i} \frac{{K_A\choose K_B - i}(K-K_A) {{K - K_A-1}\choose{i-1}}}{{{K}\choose{K_B}}},$ so that
$E = {(K-K_A)\over {K\choose K_B}} \sum_{i} {K_A\choose K_B - i} {K - K_A-1 \choose i-1}.$

Using Vandermonde's convolution (7a) we get $E = {(K-K_A)\over {K\choose K_B}} {K-1 \choose K_B - 1},$ and using the absorption identity once more we arrive at $E = (K-K_A)\,{K_B\over K}.$

  • 0
    Thank you very much for the whole solution process. Both Andre's and your solution are very helpful.2012-10-01