1
$\begingroup$

Consider a 12-sided fair die. What is the distribution of the number T of rolls required to roll a 1, a 2, a 3, and a 4?

Taking inspiration from the Coupon Collector's Problem, I believe that the expected number of rolls to achieve the goal would be

$E[T] = \sum\limits_{i=0}^3 \frac{12}{4-i} = 25$

Similarly, the variance would be

$Var[T] = \sum\limits_{i=0}^3 \frac{1-\frac{4-i}{12}}{(\frac{4-i}{12})^2} = 180$

But applying Chebyshev here does not yield very useful bounds. My question is therefore, how would you compute, for example, $P(T=16)$ or $P(T<30)$?

Ideally this could be generalized to a set of k required numbers, not just 4 as in the example.

3 Answers 3

1

Personally I would call $P(n,b,k,j)$ the probability that after rolling $n$ dice with $b$ sides, $j$ of the target $k$ sides had been found, with the formula

$P(n+1,b,k,j)= \frac{b-k+j}{b} P(n,b,k,j) + \frac{k+1-j}{b} P(n,b,k,j-1)$

starting from $P(0,b,k,j)=0$ for $j \not = 0$ and $P(0,b,k,0)=1$.

Then $\Pr(T \le t) = P(t,b,k,k)$ and $\Pr(T = t) = P(t,b,k,k)-P(t-1,b,k,k).$

So in your example it is not difficult to calculate $\Pr(T = 16) \approx 0.0380722$ and $\Pr(T \le 30) \approx 0.7305294$. Your expected value of $T$ and variance appear to be correct.

  • 0
    @joriki: Fair enough, I get $\Pr(T \lt 30)= P(29,12,4,4)\approx 0.708618$ as you do2016-05-18
1

You can get this distribution by conditioning on the number $N$ of rolls up to $4$ required:

\begin{align} \textsf{Pr}(T\le t) &=\left(\frac23\right)^t\sum_{n=0}^t\binom tn2^{-n}\textsf{Pr}(N\le n) \\ &=\left(\frac23\right)^t\sum_{n=0}^t\binom tn2^{-n}\frac{4!}{4^n}\left\{n\atop4\right\} \\ &=\left(\frac23\right)^t\sum_{n=0}^t\binom tn8^{-n}\sum_{j=0}^4(-1)^j\binom4jj^n \\ &=\left(\frac23\right)^t\sum_{j=0}^4(-1)^j\binom4j\sum_{n=0}^t\binom tn\left(\frac j8\right)^n \\ &=\left(\frac23\right)^t\sum_{j=0}^4(-1)^j\binom4j\left(1+\frac j8\right)^t \\ &=\sum_{j=0}^4(-1)^j\binom4j\left(\frac23+\frac j{12}\right)^t \\ &=1-4\left(\frac{11}{12}\right)^t+6\left(\frac56\right)^t-4\left(\frac34\right)^t+\left(\frac23\right)^t\;, \end{align}

where $\left\{n\atop4\right\}$ is a Stirling number of the second kind and the distribution $P(N\le n)$ is given at Probability distribution in the coupon collector's problem.

The expectation comes out right as

\begin{align} E[T]&=\sum_{t=0}^\infty\textsf{Pr}(T\gt t)\\ &=\sum_{t=0}^\infty\left(4\left(\frac{11}{12}\right)^t-6\left(\frac56\right)^t+4\left(\frac34\right)^t-\left(\frac23\right)^t\right)\\ &=12\left(4\cdot\frac11-6\cdot\frac12+4\cdot\frac13-1\cdot\frac14\right)\\ &=25\;. \end{align}

Here are plots of the cumulative distribution function and the probability mass function.

In your examples,

$ \textsf{Pr}(T=16)=\textsf{Pr}(T\gt15)-\textsf{Pr}(T\gt16)=\frac{293289532268461}{7703510787293184}\approx3.8\% $

and

$ \textsf{Pr}(T\lt30)=\textsf{Pr}(T\le29)=\frac{292029927548835623394780280045}{412111655902378135987760922624}\approx70.9\%\;. $

For general $k$ instead of $4$ numbers required and general $m$ instead of $12$ numbers on the die, the result is

$ \textsf{Pr}(T\le t)=\sum_{j=0}^k(-1)^j\binom kj\left(1-\frac{k-j}m\right)^t\;. $

0

I think it's the sum of four different geometric distributions.

For example, you start off and you haven't rolled any of the numbers. The probability of rolling one of them is 4/12 so you start a geometric distribution with that parameter. Then you suddenly roll one of them. Now you need to roll one of the other three, each roll has a probability of 3/12 of getting one of them, so it's another geometric distribution.

By this logic I think T ~ Geo(4/12) + Geo(3/12) + Geo(2/12) + Geo(1/12) which of course is probably some hideous ditribution define in terms of convolutions (or you know, could miraculously be nice and of closed-form, I don't know what.)

  • 0
    This is sort-of what I was thinking, but what are the next steps to get to the point when you can actually compute the probability densities?2012-12-27