If I have a fair die and throw it until I get a run of 1,2,3,4,5,6 in order, how many times on average must I throw the dice?
What is the expected number of dice one needs to roll to get 1,2,3,4,5,6 in order?
-
0@DidierPiau: Of course... though I borrowed the power of Mathematica. Let $\Omega = \{ 0, 1, 12, 123, 1234, 12345, 123456 \}$ be the sample space of possible intermediate states, and let $A$ be the corresponding matrix. Then for the initial state $\mathrm{x}_0 = (1, 0, 0, 0, 0, 0, 0)$, we can calculate the expected trial from $(A + 2A^2 + 3A^3 + \cdots)\mathrm{x}_0 = A (I - A)^{-2} \mathrm{x}_0.$ But inverting a $7\times 7$ matrix is somewhat demanding, so I used Mathematica for this step. – 2012-03-18
2 Answers
Let's write
- $a = $ expected throws to succeed with no useful throws recently
- $b = $ expected throws to succeed with 1 recently
- $c = $ expected throws to succeed with 1,2 recently
- $d = $ expected throws to succeed with 1,2,3 recently
- $e = $ expected throws to succeed with 1,2,3,4 recently
- $f = $ expected throws to succeed with 1,2,3,4,5 recently
- $g = $ expected throws to succeed with 1,2,3,4,5,6 recently
then
- $g = 0$
- $f = 1 + \frac16 g + \frac16 b + \frac46 a$
- $e = 1 + \frac16 f + \frac16 b + \frac46 a$
- $d = 1 + \frac16 e + \frac16 b + \frac46 a$
- $c = 1 + \frac16 d + \frac16 b + \frac46 a$
- $b = 1 + \frac16 c + \frac16 b + \frac46 a$
- $a = 1 + \frac16 b + \frac56 a$
so
- $a = 6 + b$
- $\frac16 b + \frac46 a = 4 + \frac56 b $
- $b = 30 + c$
- $\frac16 b + \frac46 a = 29 + \frac56 c $
- $c = 180 + d$
- $\frac16 b + \frac46 a = 179 + \frac56 d $
- $d = 1080 + e$
- $\frac16 b + \frac46 a = 1079 + \frac56 e $
- $e = 6480 + f$
- $\frac16 b + \frac46 a = 6479 + \frac56 f $
- $f = 38880 + g$
i.e.
- $a = 6 + 30 + 180 + 1080 + 6480 + 38880 + 0 = 46656$
So the answer is $46656$.
This is $6^6$ so the "obvious solution" turns out to be correct and there is probably a quicker way.
-
0@Peter: There are other questions to which it is not the correct solution, but it was the solution put forward in the comments to the question at the time I gave the answer. – 2012-03-19
For each $k\geqslant6$, call $x_k$ the six last positions occupied at time $k$. Introduce $t_0=0$ and, for each $n\geqslant1$, $t_n$ the $n$th time when the word 123456
got completed, that is, $t_n$ denotes the $n$th visit to 123456
of the process $(x_k)_{k\geqslant6}$. The question asks for $\mathrm E(t_1)$.
Note that the sequence $(t_{n+1}-t_{n})_{n\geqslant0}$ is i.i.d. and distributed like $t_1$. This step uses the fact that after a completion of 123456
one must start to build a new occurrence entirely anew, that is, that no nontrivial suffix of 123456
is also a prefix of 123456
. (Otherwise, $(t_{n+1}-t_{n})_{n\geqslant1}$ would be i.i.d. but with a different distribution than $t_1$, more precisely, the distribution of $t_{n+1}-t_n$ for every $n\geqslant1$ would be strictly dominated by the distribution of $t_1$, in particular, for every $n\geqslant1$, one would have $\mathrm E(t_1)\gt\mathrm E(t_{n+1}-t_n)$. In the present case, $\mathrm E(t_1)=\mathrm E(t_{n+1}-t_n)$.)
The process $(x_k)_k$ performs a random walk on the regular directed graph with $N=6^6$ vertices where every vertex 1u
is linked to the six vertices u1
, u2
, ..., to u6
, for every given 5-letters word u
. The out-degree and the in-degree of every vertex being all equal to 6, the stationary distribution of the Markov chain is uniform, in particular the weight of 123456
is $w=1/N$.
By the law of large numbers for ergodic Markov chains, the asymptotic proportion of time spent at 123456
is almost surely $w$, that is, $t_n=(n/w)+o(n)$ almost surely. By the law of large numbers for i.i.d. sequences, $t_n=n\mathrm E(t_1)+o(n)$ almost surely.
Comparing these two expressions, one gets $\mathrm E(t_1)=1/w=N=6^6=46656$.