7
$\begingroup$

Let $\{X_1,\ldots,X_n\}$ be independent identically distributed discrete random variables. I am interested in computing the probability of the event that the sample is duplicate free: $$ \mathbb{P}\left( \bigcap_{i

Special case
If $X_k$ are uniformly distributed with the size of the sample space being $d$, this is a classic birthday problem with the answer: $$ \mathbb{P}\left( \bigcap_{i

Motivation
Consider IEEE floating point number with mantissa $m$ encoded as $d$-tuple of significant binary digits (i.e. the first bit is always 1), and integer binary exponent $e$. For a random real $0

My approach
Applying inclusion-exclusion principle, the complementary probability is $$ \sum_{i

Solutions, ideas, references are welcome.

1 Answers 1

2

The reference solving the birthday problem for arbitrary distribution of birthdays is

Shigeru Mase, "Approximations to the birthday problem with unequal occurrence probabilities and their application to the surname problem in Japan," Ann. Inst. Statist. Math., vol. 44, no. 3, pp. 379-499 (1992).

It uses quite a neat argument involving generating functions. First, note that $$ r_n := \mathbb{P}\left(\bigcap_{1\leqslant icomplete Bell polynomial: $$ r_n = B_n\left(1, -p_2, \ldots, (-1)^{n-1} (n-1)! p_n\right) \tag{1} $$ In particular, assuming $r_0 =1$ the above result implies the recurrence equation for the non-occurrence probabilities $r_n$: $$ r_n = \sum_{k=1}^n (-1)^{k-1} \frac{(n-1)!}{(n-k)!} p_k r_{n-k} \tag{2} $$

Large $n$ asymptotics (with $n \cdot \max\limits_{k \geqslant 1} \pi_k$ being small) is also worked out in the article: $$ r_n \approx \exp\left(-\binom{n}{2} p_2 - \binom{n}{3}\left( 3 p_2^2 - 2 p_3 \right) + \cdots \right) $$