I have a set of $N$ elements and I iterate the following procedure (all the elements at the beginning are not marked):
- I pick an element at random.
- If it is marked, I stop.
- If it is not marked, I mark it and I repeat the procedure.
How many times on average do I iterate the procedure before stopping because I am picking something I have already picked?
My best shot at solving it
The probability of picking one more, having already picked one is
\begin{equation*} p_i = \frac{N - i}{N} \end{equation*}
The probability of picking at least $i$ is therefore:
\begin{equation*} s_i = \frac{N!}{(N - i)! N^i} \end{equation*}
The probability of picking exactly $i$ is
\begin{equation*} f_i = s_i - s_{i + 1} = \frac{N!}{(N - i)!N^i}\frac{i}{N} \end{equation*}
So far so good. Now, I use Stirling's approximation ($N$ is very large and the probability to have a large $i$ is very small) and (if I didnt' get any calculation wrong) I get
\begin{equation*} f_i = \frac{i}{N} \left(\frac{N}{N - i}\right)^{(N - i + 1/2)} \end{equation*}
The average number of picks $$ is therefore:
\begin{equation*} = \sum_j^N \frac{j^2}{N} e^{-j} \left(\frac{N}{N - j}\right)^{(N - j + 1/2)} \end{equation*}
But here I have no clue on how to go on. I can imagine this is a fairly known problem, but I wouldn't know where to look for, does it have a commonly known name?