3
$\begingroup$

I tried testing random integers for compliance with Benford's law, which they are apparently supposed to do. However, when I try doing this with Python,

map(lambda x:str(x)[0], [random.randint(0, 10000) for a in range(100000)]).count('1')

I get approximately equal frequencies for all leading digits. Why is this the case? Might it have something to do with how the pseudorandom number generator, the Mersenne twister, works?

  • 0
    Therefore, this is the distribution expected if the mantissae of the logarithms of the numbers (but not the numbers themselves) are uniformly and randomly distributed. (Wikipedia)2012-10-21

5 Answers 5

1

Thank you for the answers, but I think I found a more explicit source of the error by reading the paper more carefully.

Benford's law does apply to random integers, but only as the upper and lower bounds go to infinity. The limit $\lim_{N\to \infty} P_N^1(1)$ (the proportion of integers from 1 to $N$ which have a leading digit of 1, as $N$ goes to infinity) diverges, and equals 1/9 at every $10^n, n\geq 2$, which explains my result. If I set the upper bound on the random integers to 12000, for example, I get different results.

  • 0
    Yes, I meant as the lower bound goes to infinity. Thank you!2012-10-21
6

The linked paper is titled unfortunately, at least as regards the current conception of the word "random." The whole point of Benford's law is precisely that it doesn't hold when integers are drawn uniformly from a range that, like yours, ends at a power of $10$: a well-designed pseudorandom number generator should give numbers with asymptotically exactly a $\frac{1}{10}$ chance of each leading digit $0,1,...,9$ in decimal notation.

Benford's law applies not to properly random sources of data, but to data produced by certain real-life random processes. One simple characterization of data that obey Benford's law is as those produced by processes that exhibit more-or-less exponential growth, such as populations. Such processes produce uniformly distributed logarithms of data points, but this uniform distribution gets splayed out into the characteristic Benford's law upon exponentiation.

  • 0
    Yep, that sounds right.2016-08-30
4

Benford's law applies only to distributions that are scale-invariant and thus applies approximately to many real-life data sources, especially when we measure with arbitrary units: If the leading-digit distribution of a sample is essentially the same whether we measure in inches or centimeters, this is only possible if the logarithm is equidistributed (or approximately so over a range wide enough to cover several orders of magnitudes).

2

The simplest solution to approach the logarithmical distribution using mersenne twister is by using two dimensions of randomness. First the bandwidth for the amount of digits, say 15, secondly the bandwidth, say 10-100000. With more than 10,000 iterations the results shows an almost exact benford distribution.

Hope this helps.

Regards,

Dennes Spek