1
$\begingroup$

I got a problem which is (I think) similar to the monkey typing problem:

Suppose we flip a coin $n$ times to obtain a sequence of flips $X_1, X_2, \dots, X_n$. A streak of flips is a consecutive sub-sequence of flips that are all the same. E.g., if $X_3, X_4$ and $X_5$ are all heads, there is a streak of length 3 at the third flip (If $X_6$ is also head, then there is ALSO a streak of length 4 starting at the third flip). Show that, for sufficiently large $n$, the probability that there is no streak of length at least $\lfloor \log_2{n}-2\log_2{\log_2{n}} \rfloor$ is less than $1/n$.

Here's my thought:

We can divide $n$ flips into $\frac{n}{\log_2{n}-2\log_2{\log_2{n}}}$ blocks (I ignore floor function for simplicity). For each of this block, the probability that we obtain a streak of length $(\log_2{n}-2\log_2{\log_2{n}})$ is

$2*(\frac{1}{2})^{\log_2{n}-2\log_2{\log_2{n}}}$

here, I multiply 2 because (I think) both consecutive heads or tails are considered to be streak.

This expression can be simplified to

$2*[\frac{1}{n}*(\log_2{n})^2]$

Hence, the probability that none of these blocks obtain a streak we want is (each block is independent with each other)

$(1-\frac{2}{n}(\log_2{n})^2)^{\frac{n}{{\log_2{n}-2\log_2{\log_2{n}}}}} \approx e^{-\frac{2(\log_2{n})^2}{log_2{n}-2log_2{log_2{n}}}} \le e^{-\frac{2(\log_2{n})^2}{log_2{n}}} = e^{-2\log_2{n}} =\Theta(\frac{1}{n^2})$

My analysis is actually quite loose as there are more possible blocks, but these additional blocks are not independent so I just ignore them. Now, here comes my problem, now that I'm able to show (even in a loose analysis) that the probability the problem asked is less than $1/n^2$, why the problem bother ask us to reason about a answer of $1/n$. Did I miscalculate anything?

1 Answers 1

2

Fix some $L$ and $n$, and consider a sequence of length $n$. Call $A$ the event that no block of length $L$ is a streak, $A_k$ the event that no block of length $L$ which starts at a position equal to $k\pmod{L}$ is a streak, and $B_k^i$ the event that the sequence from $k+iL$ to $k+(i+1)L-1$ is not a streak.

Then $A$ is the intersection over $k$ in $\mathbb Z_L$ of the events $A_k$, each $A_k$ is the intersection over $i\leqslant n/L$ of the events $B_k^i$ and, for every fixed $k$ in $\mathbb Z_L$, the events $(B_k^i)_i$ are independent and equiprobable with common probability $1-2\cdot2^{-L}$.

This yields $ \mathrm P(A_k)=\mathrm P\left(\bigcap\limits_{i\leqslant n/L}B_k^i\right)=\prod\limits_{i\leqslant n/L}\mathrm P\left(B_k^i\right)=\mathrm P(B_k^1)^{n/L}=(1-2\cdot2^{-L})^{n/L}, $ for every $k$ in $\mathbb Z_L$, and $ \mathrm P(A)=\mathrm P\left(\bigcap\limits_{k\in\mathbb Z_L}A_k\right)\leqslant\mathrm P(A_1). $ If $L=L(n)$ with $ L(n)=\log_2(n)-2\log_2\log_2(n)+o(\log_2\log_2(n)), $ then $ \log\mathrm P(A_1)\sim2n/(L2^L)\sim-2\log(n), $ hence $\log\mathrm P(A_1)\leqslant-\log(n)$ for $n$ large enough, which implies $\mathrm P(A)\leqslant \mathrm P(A_1)\leqslant1/n$.

In the end, this upper bound can be strengthened to $\mathrm P(A)\leqslant1/n^{u}$ for every $n$ large enough, for every $u<2$, and these upper bound $1/n$ rely only on the streaks of length at least $L$ which begin at a position equal to $1\pmod{L}$.