Given a randomness source $D$ with min-entropy $k$, we should expect that we will have to take more than $2^{k/2}$ samples before we see a repetition (claim from http://people.csail.mit.edu/madhu/FT04/scribe/lect20.pdf). The exact quote is:
So if a source has min-entropy $k$, the probability that the source gives a particular string is at most $2^{-k}$ . This denition is fairly general. If a source has min-entropy $k$, then we expect that we will have to take more than $2^{k/2}$ samples from the source before we see a repetition. Such a source must also have a support that contains more than $2^k$ elements.
I've been trying to prove this and here is my try:
Define a random variable $M$, the number of samples taken until a repetition is observed. The PMF at $M=m$ should be the probability of observing a sequence of $m-1$ distinct strings followed by an $m$-th sample that equals one of them. Since the min-entropy is $k$, no string appears with probability greater than $2^{-k}$.
Assume all strings have probability $p=2^{-k}$ (I am not sure whether this assumption is a correct strategy, but I don't know what else to do). Then in a sequence of $m$ independent samples, the first string can be anything, the second differs from the first with probability $1-p$, the third differs from the first two with probability $1-2p$ (given that the first two are distinct), and so forth.
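If I'm not mistaken, under this uniform assumption the product of those factors is exactly the probability of seeing no repetition in the first $m$ samples, and the tail-sum identity for nonnegative integer random variables lets one avoid writing the PMF explicitly:

$$P(M > m) = \prod_{i=0}^{m-1} (1 - ip), \qquad E[M] = \sum_{m=0}^{\infty} P(M > m).$$

The standard birthday-problem approximation $\prod_{i=0}^{m-1}(1-ip) \approx e^{-m^2 p/2}$ then gives $E[M] \approx \sqrt{\pi/(2p)} = \sqrt{\pi/2}\;2^{k/2}$, which would explain where the $2^{k/2}$ scale in the quoted claim comes from.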
So the probability of no repetition in the first $m$ samples should be the product of all these factors. This is where I am having difficulty: writing a general expression for the PMF $f_M$, in order to use it afterwards to calculate the expectation of $M$, which would be $E[M]=\sum\limits_{m=1}^\infty m\, f_M(m)$.
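To sanity-check the expectation numerically, here is a minimal Monte Carlo sketch (my own illustration, assuming the uniform case $p = 2^{-k}$, i.e. $n = 2^k$ equally likely strings): it draws samples until the first repetition and averages the stopping time over many trials.

```python
import random

def samples_until_repeat(n, rng):
    """Draw uniform samples from {0, ..., n-1} until a value repeats;
    return the number of draws, counting the repeated one."""
    seen = set()
    draws = 0
    while True:
        x = rng.randrange(n)
        draws += 1
        if x in seen:
            return draws
        seen.add(x)

# Monte Carlo estimate of E[M] for k = 10, so n = 2**10 equally
# likely strings (an assumption chosen just for illustration).
rng = random.Random(0)
n = 2 ** 10
trials = 2000
avg = sum(samples_until_repeat(n, rng) for _ in range(trials)) / trials
print(avg)  # birthday heuristic predicts roughly sqrt(pi * n / 2) ~ 40
```

The estimate should land near $\sqrt{\pi n/2} \approx \sqrt{\pi/2}\,2^{k/2}$, i.e. on the order of $2^{k/2}$ as the quoted claim suggests.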
Am I on the right track, or am I going in the wrong direction?