4
$\begingroup$

I have 25000 numbers and I randomly pick one by ony until I get one that I've already picked. I want to know the expected number of picks that need to be made.

The expected value in my opinion should be calculated as 1/25000*1 + (24999/25000)*(2/25000)2 + (24999/25000)(24998/25000)*(3/25000)*3 + ...

Is this formula correct? What would be the solution?

Best, Will

  • 1
    yes, distinct numbers for simplicity assume its 1 - 25000 and yes I sample with replacement, otherwise it would not be possible to get the same value twice.2011-06-13

2 Answers 2

4

Your expression is close, but $1$ less than the true value. For example $\frac{1}{25000}$ is the probability that the first duplicate comes when two numbers have been drawn, not one; and $\frac{24999}{25000} \cdot \frac{24998}{25000} \cdot \frac{3}{25000}$ is the probability that the first duplicate comes when four numbers have been drawn, not three.

This is a variant of the birthday problem, studied by Ramanujan, with a solution for large population $M$ about $ \sqrt{\dfrac{\pi M}{2} } +\dfrac{2}{3} + \sqrt{\dfrac{\pi}{288 M}}$. In this case with $M=25000$ it is about $198.83$.

0

Let $\{0,1,\dots,a\}$ be the interval in which you search for the numbers.

The first number may be chosen arbitrary. For the second number we can say, that there is a probability of $1\over a$ to choose a number already chosen (the number of picks becomes $2$) and a probability of $a-1\over a$ to choose a new number.

In the second case, there are $2$ number already chosen, so the probability of chosing on of them again becomes $2\over a$ and the probability of chosing a new number becomes $a-2\over a.$ Finally, we get an equation like this, where $E$ is your expected value:

$E = \frac1a\times1 + \frac{a-1}a\times a\left(\frac2a\times2+\frac{a-2}a\dots\left(\frac{a-1}a\times(a-1)+\frac1a\times a\right)\dots\right)$

The term of $E$ can be generated using a simple recurrence relation:

$\begin{eqnarray} E\hphantom{_n} &=& E_1\\ E_a &=& a\\ E_n &=& \frac{n^2}a+\frac{a-n}aE_{n+1} \end{eqnarray}$

Using a simple Haskell program I could calculate that $E\approx 197.833.$

  • 0
    Oh, yes. You're right.2011-06-13