4

I found a statement in the article "Good Practice in (Pseudo) Random Number Generation for Bioinformatics Applications" that you should not use too many random values in a single simulation. The authors say that the maximum number of random values taken from a PRNG should be $\frac{p}{1000}$, or even better $\frac{1}{200}\sqrt{p}$, where $p$ is the period.

But I cannot find this recommendation in any other articles.

Do you know of any reasons not to use more values?

  • 0
    @user1729: that's conceptually a good start. But the output of a PRNG can be much smaller than its internal state (which is what determines the period), so it can return consecutive repeated values (think of a PRNG that gives you one bit on each call). 2012-05-02

1 Answer

4

Assuming $p$ is the period of the PRNG, this is good advice, because after $p$ values are taken the PRNG will repeat.

To avoid the issue, just use a PRNG with a very large period. Extracting each pseudorandom bit takes only about $O(\log p)$ time, so you can make $p$ much larger than the number of values you will ever need to extract.
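To see the repetition concretely, here is a minimal sketch using a toy linear congruential generator. The constants `a=5`, `c=3`, `m=16` and the helper names `lcg` and `take` are my own choices for illustration, not anything from the article; they are picked so that the period is exactly $p = 16$.

```python
def lcg(seed, a=5, c=3, m=16):
    """Toy linear congruential generator (illustrative constants only).

    With a=5, c=3, m=16 the generator has full period p = 16.
    """
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def take(gen, n):
    """Draw the next n values from a generator."""
    return [next(gen) for _ in range(n)]


g = lcg(seed=0)
first = take(g, 16)   # one full period
second = take(g, 16)  # the next 16 draws

# After p draws the sequence starts over exactly.
print(first == second)
```

A simulation that draws more than 16 values from this generator is just reusing the same numbers, so its later samples are not independent of its earlier ones. A large period pushes that point far beyond anything you will actually draw.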

  • 0
    With MT19937 you have nothing to worry about, since $p = 2^{19937}-1$ and you can't possibly be iterating it anywhere near $\sqrt{p}$ times. (If you want to be _really_ safe, use [SDLH](http://www.senderek.com/SDLH/).) 2012-05-02
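The arithmetic behind this can be checked with exact integers. The sketch below takes the MT19937 period as $2^{19937}-1$ and evaluates both limits quoted in the question; the variable names and the figure of $10^{18}$ draws (a deliberately generous simulation size) are assumptions for illustration.

```python
import math

# Period of MT19937 (a Mersenne prime).
p = 2**19937 - 1

# The two limits quoted in the question.
budget_linear = p // 1000           # p / 1000
budget_sqrt = math.isqrt(p) // 200  # sqrt(p) / 200, rounded down

# A deliberately generous number of draws for one simulation (assumed).
draws = 10**18

print(draws < budget_sqrt < budget_linear)
# Size of the stricter limit in bits, to avoid printing thousands of digits.
print(budget_sqrt.bit_length())
```

Even the stricter $\frac{1}{200}\sqrt{p}$ limit is a number thousands of bits long, so no realistic simulation comes close to it.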