1
$\begingroup$

Take the expression $(1-(1-\frac{1}{n})^x)$, which is the expected fraction of elements one samples from a set of $n$ elements, sampling with replacement (and uniform probability) $x$ times.

Provided some $n$, how do we analytically solve for $x$ s.t. $(1-(1-\frac{1}{n})^x) \geq P$? It appears that $\frac{x}{n}$ converges to a fixed value as $n \to \inf$. Is this true? If so, how do we find it without relying on numerical methods?

  • 0
    @Marc von Leeuwen Sorry, I meant that the expression was the fraction of a set of n elements covered after sampling with replacement$x$times.2012-10-23

2 Answers 2

0

We have \begin{align*} 1 - \left(1 - \frac 1n\right)^x \ge P\\ \iff \left(1 - \frac 1n\right)^x \le 1-P\\ \iff x \cdot \log\left(1- \frac 1n\right) \le \log(1-P)\\ \iff x \ge \frac{\log(1-P)}{\log(1 - \frac 1n)} \end{align*} So the smallest such $x$ is $x(n) = \left\lceil\frac{\log(1-P)}{\log(1 - \frac 1n)}\right\rceil$ and $x(n) \to \infty$ as $n \to \infty$ (as $\log$ is continuous and $\log 1 = 0$).

1

The inequality can be rewritten as $\left(1-\frac{1}{n}\right)^x\le 1-P.$ We first solve the equation $\left(1-\frac{1}{n}\right)^t= 1-P.$ Take the (natural) logarithm of both sides. We get $t\log\left(1-\frac{1}{n}\right)=\log(1-P).$ Thus $t=\frac{\log(1-P)}{\log\left(1-\frac{1}{n}\right)}.$ Presumably we want $x$ to be an integer. The smallest integer $x$ for which our inequality holds is $\lceil t\rceil$.

It is clear that $x\to \infty$ as $n\to\infty$. However, something interesting happens if we look at $\frac{x}{n}$. We look instead at the closely related $\frac{t}{n}$, which has the same limit. We have $\frac{t}{n}=\frac{\log(1-P)}{n\log\left(1-\frac{1}{n}\right)}.$ Note that the denominator $n\log\left(1-\frac{1}{n}\right)$ is equal to $\log\left(\left(1-\frac{1}{n}\right)^n\right)$. But $(1-1/n)^n\to e^{-1}$ as $n\to\infty$, so the denominator approaches $-1$. It follows that as $n\to\infty$, $\dfrac{t}{n}$ (and therefore $\dfrac{x}{n}$) approaches $-\log(1-P)$.

  • 0
    Nicolas Sorry, I meant that the expression was the fraction of a set of n elements covered after sampling with replacement x times.2012-10-23