Suppose $N$ random digits $$x_1,x_2,\cdots ,x_N$$ with $0\le x_j\le 9$ for $j=1,2,\cdots , N$ and a positive integer $L$ are given.

What is the probability that some block of $L$ consecutive digits appears at least twice in the above sequence?

To be precise about what "appearing at least twice" means: the two occurrences must be separated, i.e. they must not overlap. So "$456456456$" does NOT count as a double occurrence of $456456$, and "$111$" does NOT count as a double occurrence of $11$, but "$351351$" counts as a double occurrence of $351$.
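The non-overlap rule is easy to get wrong, so here is a minimal Python sketch of it, together with a crude Monte Carlo estimate of the probability in question. The function names `has_repeated_block` and `estimate_probability`, the trial count, and the seed are all illustrative assumptions, not part of the question; the simulation is only practical for small $N$.

```python
import random


def has_repeated_block(digits: str, L: int) -> bool:
    """Return True if some block of L digits occurs twice without overlap."""
    first_seen = {}  # block -> start index of its first occurrence
    for i in range(len(digits) - L + 1):
        block = digits[i:i + L]
        if block not in first_seen:
            first_seen[block] = i
        elif i - first_seen[block] >= L:
            # Comparing with the FIRST occurrence is enough: if any earlier
            # occurrence is at distance >= L, the first one is as well.
            return True
    return False


def estimate_probability(N: int, L: int, trials: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo estimate (hypothetical helper) for N uniform random digits."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        digits = "".join(rng.choices("0123456789", k=N))
        if has_repeated_block(digits, L):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(has_repeated_block("456456456", 6))  # overlapping copies only
    print(has_repeated_block("351351", 3))     # adjacent but not overlapping
    print(estimate_probability(N=100, L=3))
```

Keeping only the first start index of each block means the check needs a single pass and one dictionary entry per distinct block.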

Motivation: I want to estimate the length of the longest string occurring at least twice in the first $2\cdot 10^9$ digits of $\pi$, based on the assumption that they behave like random digits.

1 Answer

Given a string of length $N$ and a positive integer $L$, there are $N-L+1$ possible blocks of length $L$. Also, given $L$, the fraction of strings of length $L$ whose digits are consecutive numbers equals $p = \frac{\binom{10}{1}}{10^L}$ (choose the first digit, which then fixes the next $L-1$ digits). This assumes wrap-around: if the first digit is $9$, then the next $L-1$ digits are $0, 1, \ldots, (L-2) \bmod 10$.

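Finally, treating the $N-L+1$ blocks as independent trials (an approximation, since overlapping blocks share digits),

$$\begin{aligned} P(\text{at least two occurrences}) &= 1 - P(\text{zero occurrences}) - P(\text{exactly one occurrence}) \\ &= 1 - (1-p)^{N-L+1} - (N-L+1)\,p\,(1-p)^{N-L}. \end{aligned}$$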
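As a quick numerical check, the binomial expression can be evaluated directly. This is a minimal sketch under the same independence assumption, with $p = 10/10^L$ as defined in this answer and the zero-occurrence probability taken in the standard binomial form $(1-p)^{N-L+1}$; the function name `at_least_twice_binomial` is a hypothetical label chosen for illustration.

```python
def at_least_twice_binomial(N: int, L: int) -> float:
    """Evaluate 1 - (1-p)^n - n*p*(1-p)^(n-1) with n = N-L+1 and p = 10/10^L.

    Assumes the n blocks behave like independent trials, which is only an
    approximation because neighbouring blocks share digits.
    """
    n = N - L + 1  # number of length-L blocks in a string of length N
    if n < 2:
        return 0.0  # fewer than two blocks: two occurrences are impossible
    p = 10 / 10 ** L  # per-block success probability used in this answer
    p_zero = (1 - p) ** n
    p_one = n * p * (1 - p) ** (n - 1)
    return 1 - p_zero - p_one


if __name__ == "__main__":
    for L in (2, 3, 4):
        print(L, at_least_twice_binomial(N=1000, L=L))
```

For very small $p$ and very large $N$, plain floating point loses precision in `(1 - p) ** n`; evaluating it as `math.exp(n * math.log1p(-p))` would be a safer choice there.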

P.S. Wrote the answer in a hurry. I'll try to write things properly later.