0
$\begingroup$

I have the following problem:

"Suppose you have a universe of $N$ distinct objects, and you observe $k$ of them, possibly with repetition. The order in which the objects are observed does not matter. $k < \frac{1}{10}N \ll 2^{128}$. What is the expected number of repetitions, counting pairs only? In other words, do not count triples, quadruples, etc."

I get $E(R) = \sum_{i=1}^{\lfloor\frac{k}{2}\rfloor}i\cdot P(R=i) = \sum_{i=1}^{\lfloor\frac{k}{2}\rfloor}i\cdot\binom{N}{i}\cdot\left(\sum_{l=0}^{\lfloor\frac{k}{2}\rfloor-i+1}(-1)^l\frac{\binom{N+k-3i-2l-1}{k-2i-2l}\binom{N-i}{l}}{\binom{N+k-1}{k}}\right)$

If it helps, I can provide the derivation for this; I referred to R. Stanley's "Enumerative Combinatorics" for a hint on how to count weak compositions of an integer with exactly $N$ parts, each part less than some integer $j$. I re-indexed the sum to go over a single index $l$ as well. The outer sum only goes up to $k/2$ because it's impossible to get any more repetitions than that.

Assuming all that is correct, I am trying to get an approximation for this expectation. So far, I've tried to split the inner sum into even and odd indices, and see if I can use some alternating series tricks, but I haven't been able to do that quite yet. Since calculating binomial coefficients pretty much amounts to more arithmetic than computers can handle (for moderately large $N, k$ naive evaluation is out of the question).

I also want to use an approximation to find (a lower bound on) $N$ given $k$ and $R$, but at this point I don't know how to do this.

I appreciate any and all help. Thanks

  • 0
    The command for $\ll$ is `\ll`.2012-08-10

1 Answers 1

2

This is way too complicated. You should use linearity of expectation.

The probability for one object to be selected exactly twice is $\binom k2(N-1)^{k-2}/N^k$, so the expected number of pairs is $N$ times that, $\binom k2(N-1)^{k-2}/N^{k-1}$.

  • 0
    Thanks for your reply joriki. I see how you got this answer. Perhaps I was overthinking this problem.2012-08-10