1
$\begingroup$

I have a set of N random integers between A and B.

Assuming that my random number generator is equally likely to return any integer between A and B, how can I calculate the probability that the next random integer is already present in my set?

I want to estimate how many random numbers I should generate in a batch such that I can say with probability P that atleast one of the new integers does not already exist in the set.

Thanks

2 Answers 2

1

Assuming the $N$ integers you have already are known to be distinct, there are $B-A+1-N$ left to pick (assuming $A$ and $B$ are in your range). The chance the next one will be different is then $\frac {B-A+1-N}{B-A+1}$

  • 0
    Thanks. I forgot to ensure that the N integers were unique in my "experiment" code so I was getting really low collision rates and couldn't figure out what was going on.2012-10-02
1

Assuming $N$ numbers were drawn independently and without replacement, the problem can be rephrased as follows:

Suppose $X_1, X_2, \ldots, X_n, X_{n+1}$ are independent identically distributed discrete random variables, uniformly distributed on $[a,b]$. You seek to determine the probability: $\begin{eqnarray} \mathbb{P}\left(X_{n+1} \not=X_1,X_{n+1} \not=X_2,\ldots,X_{n+1} \not=X_n\right) &=& \mathbb{E}\left(\mathbb{P}\left(X_{n+1} \not=X_1,\ldots,X_{n+1} \not=X_n |X_{n+1}\right) \right) \\ &\stackrel{\text{indep.}}{=}& \mathbb{E}\left( \mathbb{P}\left(X_{n+1} \not= X_n \right)^n\right) \\ &=& \mathbb{E}\left( \left(\frac{b-a}{b-a+1} \right)^n\right) = \left(\frac{b-a}{b-a+1} \right)^n \end{eqnarray} $

Recasting the answer using capitalized variables in the question: $ p = \left(\frac{B-A}{B-A+1} \right)^N $