4
$\begingroup$

Consider the following algorithm:

function Rand():     return a uniformly random real between 0.0 and 1.0  function Sieve(n):      assert(n >= 2)      for i = 2 to n         X[i] = true      for i = 2 to n         if (X[i])             for j = i+1 to n                 if (Rand() < 1/i)                     X[j] = false      return X[n] 

What is the probability that Sieve(k) returns true as a function of k ? What is the limit of this probability as k goes to infinity (if it has one) ?

  • 0
    There is an empirical estimate in the answer on the link, and also a DP solution. I'm after a closed form and limit.2012-07-16

1 Answers 1

5

This is difficult to do exactly because the probabilities are all correlated: a small number of small "primes" will lead to more big "primes", and vice versa. The answer at stackoverflow ignores these correlations, yet gets quite close to the value for $k=1000$ ($0.1227$ instead of $0.1199\pm0.0003$ as determined by computer simulation), so it seems it's a reasonable approximation to model the probabilities as independent.

Doing so actually yields a much simpler recursion than the one used in the stackoverflow answer: $k+1$ has exactly the same chances of getting marked "composite" as $k$, and if $k$ is "prime" it has an additional chance during the $k$-th iteration, with probability $1/k$. Thus the probabilities $p_k$ of $k$ being "prime" satisfy the recursion relation

$p_{k+1}=p_k\left(1-\frac{p_k}k\right)\;.$

We can rewrite this as

$p_{k+1}-p_k=-\frac{p_k^2}k$

and approximate it by a differential equation:

$p'=-\frac{p^2}k\;,$

whose general solution is

$p=\frac1{c + \log k}$

with an arbitrary constant $c$, which is influenced by the approximation errors for small $k$. From $p_{1000}\approx0.12$ we get $c\approx1.43$, so a good approximation is given by

$p_k\approx\frac1{1.43+\log k}\;.$

Here's a plot of the values up to $p_{1000}$, which shows rather satisfactory agreement (red: model, green: simulations):

plot showing agreeement between model and simulations

Thus, the probabilistic sieve asymptotically behaves like the actual prime sieve.