56
$\begingroup$

Suppose a biased coin (probability of head being $p$) was flipped $n$ times. I would like to find the probability that the length of the longest run of heads, say $\ell_n$, exceeds a given number $m$, i.e. $\mathbb{P}(\ell_n > m)$.

It suffices to find the probability that length of any run of heads exceeds $m$. I was trying to approach the problem by fixing a run of $m+1$ heads, and counting the number of such configurations, but did not get anywhere.

It is easy to simulate it:

Distribution of the length of the longest run of head in a sequence of 1000 Bernoulli trials with 60% change of getting a head

I would appreciate any advice on how to analytically solve this problem, i.e. express an answer in terms of a sum or an integral.

Thank you.

  • 2
    This paper may be useful: [_The Longest Run of Heads_, M.K. Schilling](http://www.utktqyz.amc8.org/sites/default/files/pdf/upload_library/22/Polya/07468342.di020742.02p0021g.pdf)2014-07-23

5 Answers 5

58

This problem was solved using generating functions by de Moivre in 1738. The formula you want is $\mathbb{P}(\ell_n \geq m)=\sum_{j=1}^{\lfloor n/m\rfloor} (-1)^{j+1}\left(p+\left({n-jm+1\over j}\right)(1-p)\right){n-jm\choose j-1}p^{jm}(1-p)^{j-1}.$

References

  1. Section 14.1 Problems and Snapshots from the World of Probability by Blom, Holst, and Sandell

  2. Chapter V, Section 3 Introduction to Mathematical Probability by Uspensky

  3. Section 22.6 A History of Probability and Statistics and Their Applications before 1750 by Hald gives solutions by de Moivre (1738), Simpson (1740), Laplace (1812), and Todhunter (1865)


Added: The combinatorial class of all coin toss sequences without a run of $ m $ heads in a row is $\sum_{k\geq 0}(\mbox{seq}_{< m }(H)\,T)^k \,\mbox{seq}_{< m }(H), $ with corresponding counting generating function $H(h,t)={\sum_{0\leq j< m }h^j\over 1-(\sum_{0\leq j< m }h^j)t}={1-h^ m \over 1-h-(1-h^ m )t}.$ We introduce probability by replacing $h$ with $ps$ and $t$ by $qs$, where $q=1-p$: $G(s)={1-p^ m s^ m \over1-s+p^ m s^{ m +1}q}.$ The coefficient of $s^n$ in $G(s)$ is $\mathbb{P}(\ell_n

The function $1/(1-s(1-p^ m s^ m q ))$ can be rewritten as \begin{eqnarray*} \sum_{k\geq 0}s^k(1-p^ m s^ m q )^k &=&\sum_{k\geq 0}\sum_{j\geq 0} {k\choose j} (-p^ m q)^js^{k+j m }\\ %&=&\sum_{j\geq 0}\sum_{k\geq 0} {k\choose j} (-p^ m q )^js^{k+j m }. \end{eqnarray*} The coefficient of $s^n$ in this function is $c(n)=\sum_{j\geq 0}{n-j m \choose j}(-p^ m q)^j$. Therefore the coefficient of $s^n$ in $G(s)$ is $c(n)-p^ m c(n- m ).$ Finally, \begin{eqnarray*} \mathbb{P}(\ell_n\geq m)&=&1-\mathbb{P}(\ell_n

  • 2
    @DaveNeville I have added a derivation of the formula. Hope you find it useful!2014-03-08
12

Define a Markov chain with states $0, 1, \ldots m$ so that with probability $1$ the chain moves from $m$ to $m$ and for $i with probability $p$ the chain moves from $i$ to $i+1$ and with probability $1-p$ the chain moves from $i$ to $0$. If you look at the $n$th power of the transition matrix for this chain you can read off the probability that in $n$ flips you have a sequence of at least $m$ consecutive heads.

  • 0
    There was an error in my comment above. State $(R,r)$ should go, with probability $p$, to $(R, r+1)$ if r, or $(R+1,r+1)$ if r=R, or itself if $r=R=m$.2011-08-25
8

You can find a limiting distribution, otherwise it's a difficult problem and the closed form solution won't have much practical value. See this for an elementary approach. [Update] Previous link moved to this new address. "Longest Run of Heads", M.F.Schilling.

  • 0
    @KaiSikorski updated the link, thanks for the warning.2014-04-14
7

I don't think you'll get a simple analytic formula. The problem is essentially equivalent to this one, see my answer there: it involves the $n$-power of a $m \times m$ stochastic matrix (notice that there we are interested in the runs equal or greater than $m$), using a Markov chain (as suggested in the comments by Shawn). You can find also there some asymptotics.

5

Remark. The following answer uses the same methodology as the accepted leading answer. It is self-contained, includes Maple code and some asymptotics. Perhaps it can serve a didactic purpose and show that the basic computation is very simple.

With $z$ for successes and $w$ for failures we get the generating function

$\left(1+z+\cdots+u\frac{z^m}{1-z}\right) \left(\sum_{q\ge 0} \frac{w^q}{(1-w)^q} \left(z+\cdots+u\frac{z^m}{1-z}\right)^q \right)\frac{1}{1-w}.$

Now as a sanity check when we put $w=z$ and $u=1$ we should get all binary strings. Doing the calculation we find

$\frac{1}{1-z} \frac{1}{1-wz/(1-w)/(1-z)} \frac{1}{1-w} \\ = \frac{1}{(1-w)(1-z)-wz} = \frac{1}{1-w-z}.$

Putting $z=w$ yields

$\frac{1}{(1-z)^2} \frac{1}{1-z^2/(1-z)^2} = \frac{1}{(1-z)^2-z^2} = \frac{1}{1-2z}$

and the sanity check goes through.

Now for the probabilities we must subtract the value for $u=0$ from the generating function and then set $u=1.$ We obtain for the zero term

$\frac{1-z^m}{1-z} \left(\sum_{q\ge 0} \frac{w^q}{(1-w)^q} z^q \frac{(1-z^{m-1})^q}{(1-z)^q}\right) \frac{1}{1-w} \\ = \frac{1-z^m}{1-z} \frac{1}{1-wz(1-z^{m-1})/(1-w)/(1-z)} \frac{1}{1-w} = \frac{1-z^m}{(1-w)(1-z)-wz(1-z^{m-1})} = \frac{1-z^m}{1-w-z+wz^{m}}.$

To obtain the probabilities set $z=pv$ and $w=(1-p)v$ to get for the one term

$\frac{1}{1-w-z} = \frac{1}{1-(1-p)v-pv} = \frac{1}{1-v}$

so that $[v^n] \frac{1}{1-v} = 1.$

The subtracted contribution from the zero term is

$\frac{1- p^m v^m}{1-v+(1-p)p^m v^{m+1}}.$

Now for coefficient extraction we write

$[v^Q] \frac{1}{1-v+(1-p)p^m v^{m+1}} \\ = [v^Q] \sum_{q\ge 0} (v-(1-p)p^m v^{m+1})^q \\ = [v^Q] \sum_{q\ge 0} \sum_{r=0}^q {q\choose r} v^{q-r} (-1)^r (1-p)^r p^{mr} v^{(m+1)r}. \\ = [v^Q] \sum_{q\ge 0} \sum_{r=0}^q {q\choose r} v^{mr+q} (-1)^r (1-p)^r p^{mr} \\ = \sum_{r=0}^{\lfloor Q/m\rfloor} {Q-mr\choose r} (-1)^r (1-p)^r p^{mr}.$

We thus get the closed form

$\bbox[5px,border:2px solid #00A000]{ 1 - a_N + p^m a_{N-m} \quad \text{where} \quad a_Q = \sum_{r=0}^{\lfloor Q/m\rfloor} {Q-mr\choose r} (-1)^r (1-p)^r p^{mr}.}$

Note that for $N\lt m$ we get

$1- {N\choose 0} \times 1 \times 1\times 1 = 0$

which is the correct result. Also for $N=m$ we obtain

$1 - 1 - {0\choose 1} \times -1 \times (1-p) \times p^m + p^m \times 1 = p^m $

which is correct as well.

Now for an asymptotic we note that the root $\rho$ with the smalltest modulus of $1-v+(1-p)p^m v^{m+1}$ is the one close to $v_0=1$ and we get from Newton-Raphson the first approximation

$v_1 = v_0 - \frac{1-v_0 + (1-p)p^m v_0^{m+1}}{-1+(m+1)(1-p)p^m v_0^{m}}.$

This works out to

$1+ \frac{(1-p)p^m}{1-(m+1)(1-p)p^m} = \frac{1-m(1-p)p^m}{1-(m+1)(1-p)p^m} = \rho.$

The corresponding term from the partial fraction decomposition is

$\frac{1}{v-\rho} \mathrm{Res}_{v=\rho} \frac{1}{1-v+(1-p)p^m v^{m+1}} \\ = \frac{1}{v-\rho} \times \left.\frac{1}{-1+(m+1)(1-p)p^m v^{m}}\right|_{v=\rho} \\ = - \frac{1}{\rho} \frac{1}{1-v/\rho} \frac{1}{-1+(m+1)(1-p)p^m \rho^{m}}.$

We thus obtain

$\bbox[5px,border:2px solid #00A000]{ a_Q \sim \frac{1}{\rho^{Q+1}} \frac{1}{1-(m+1)(1-p)p^m \rho^{m}}.}$

We may replace $\rho$ by a better approximation from Newton-Raphson in a setting where we require numerics, which is done in the sample code which now follows.

 MRUNPROB := proc(N, m) option remember; local ind, d, pos, cur, run, runs, prob,     zcnt, wcnt;      prob := 0;      for ind from 2^N to 2*2^N-1 do         d := convert(ind, base, 2);          cur := -1; pos := 1;         run := []; runs := [];           while pos <= N do             if d[pos] <> cur then                 if nops(run) > 0 then                     runs :=                     [op(runs), [run[1], nops(run)]];                 fi;                  cur := d[pos];                 run := [cur];             else                 run := [op(run), cur];             fi;              pos := pos + 1;         od;          runs := [op(runs), [run[1], nops(run)]];          if nops(select(r -> (r[1] = 1 and r[2] >= m), runs)) > 0         then             wcnt := add(`if`(r[1] = 0, r[2], 0), r in runs);             zcnt := add(`if`(r[1] = 1, r[2], 0), r in runs);              prob := prob + p^zcnt * (1-p)^wcnt;         fi;     od;      expand(prob); end;  V1 := proc(N, m) option remember; local gf;      gf := (1-p^m*v^m)/(1-v+(1-p)*p^m*v^(m+1));     expand(1-coeftayl(gf, v=0, N)); end;  a := (Q, m) -> add(binomial(Q-m*r,r)*(-1)^r*(1-p)^r*p^(m*r),     r = 0 .. floor(Q/m));   V2 := proc(N, m) option remember;     expand(1-a(N,m)+p^m*a(N-m,m)); end;   a2 := proc(Q, m) local rho;      rho := (1-m*(1-p)*p^m)/(1-(m+1)*(1-p)*p^m);      1/rho^(Q+1)*1/(1-(m+1)*(1-p)*p^m*rho^m); end;  a3 := proc(Q, m, p) local rho;      rho :=     sort([fsolve(1-v + (1-p)*p^m*v^(m+1), v)],          (r1, r2) -> abs(r1-1) < abs(r2-1));     rho := op(1, rho);       1/rho^(Q+1)*1/(1-(m+1)*(1-p)*p^m*rho^m); end;