
Given that a die is tossed $N$ times, what is the average ratio between the count of the most frequent number and the count of the least frequent number (assuming I repeat these $N$ tosses a very large number of times)?

This ratio tends to 1 as the number of throws N tends to infinity.

But given N how can I get the average ratio of most occurred / least occurred?

Here is an experimental graph of this relation (the tossing of N dice was not repeated, hence the "wobbly" nature of the graph):

Graph

Code:

import random

import matplotlib.pyplot as plt

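# Each face count starts at 1 so that min(x) is never zero.
x = [ 1 for _ in range(6) ]

def update_results(results):
    # Roll once and increment the count of the face that came up.
    die_toss = random.randint(0, 5)
    return [x + (1 if i == die_toss else 0) for i, x in enumerate(results)]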

def plot_points(ps):
    plt.scatter(*zip(*ps))

ps = []
for i in range(1,10000):
    x = update_results(x)
    ps.append( (i, max(x) / float(min(x)) ) )
    #ps.append( (i, max(x) / (i*(1./6)) ) )

plot_points(ps)
plt.xlabel('Number of dice throws')
plt.ylabel('Most occurred / least occurred ratio')
plt.show()
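
The plot above comes from a single run. A minimal sketch of the averaging described in the question, assuming the same convention as the code above (every face count starts at 1 so the minimum is never zero), could look like this. The function name `average_ratio` and its parameters are illustrative and not part of the original code.

```python
import random


def average_ratio(n_tosses, n_trials, sides=6, seed=None):
    """Estimate the mean of max(count) / min(count) after n_tosses rolls.

    Assumption carried over from the question's code: every face count
    starts at 1, which avoids division by zero but biases small cases.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        counts = [1] * sides
        for _ in range(n_tosses):
            counts[rng.randrange(sides)] += 1
        total += max(counts) / min(counts)
    return total / n_trials


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, average_ratio(n, n_trials=2000, seed=1))
```

The estimate is only as good as the number of trials, so the printed values will still fluctuate from seed to seed.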
  • 0
    how did u generate this graph? 2017-01-24
  • 0
    Won't the distribution depend on your pseudo-random generator? 2017-01-24
  • 0
    @Abhijith The pseudo random generator is assumed to be perfectly random for this purpose. 2017-01-24
  • 1
    @Kiran code added. 2017-01-24

1 Answer

2

This turns out to be more difficult than it looks, but we can at least provide an algorithm that computes the exact values rather than relying on simulations.

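Suppose that the die has $m$ faces and is rolled $n$ times. Faces that never appear are allowed (the term $\mathfrak{P}_{=0}(\mathcal{Z})$ below) and do not count toward the minimum. Rolling the die so that the least frequent value occurs $p$ times and the most frequent value occurs $q$ times, with both of them marked, yields the species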

$$\mathfrak{S}_{=m} (\mathfrak{P}_{=0}(\mathcal{Z}) + \mathcal{U}\mathfrak{P}_{=p}(\mathcal{Z}) + \mathfrak{P}_{=p+1}(\mathcal{Z}) + \cdots + \mathfrak{P}_{=q-1}(\mathcal{Z}) + \mathcal{V}\mathfrak{P}_{=q}(\mathcal{Z})).$$

This has generating function

$$G(z,u,v) = \left(1+u\frac{z^p}{p!} + \sum_{r=p+1}^{q-1} \frac{z^r}{r!} + v\frac{z^q}{q!}\right)^m.$$

Using inclusion-exclusion to remove the configurations in which no face occurs exactly $p$ times or no face occurs exactly $q$ times, we obtain the generating function

$$H_{p,q}(z) = \left(1+ \sum_{r=p}^{q} \frac{z^r}{r!}\right)^m \\ - \left(1+ \sum_{r=p+1}^{q} \frac{z^r}{r!}\right)^m - \left(1+ \sum_{r=p}^{q-1} \frac{z^r}{r!}\right)^m + \left(1+ \sum_{r=p+1}^{q-1} \frac{z^r}{r!}\right)^m.$$

We then obtain for the desired quantity the closed form

$$\bbox[5px,border:2px solid #00A000]{ \frac{n!}{m^n} [z^n] \sum_{p=1}^n \sum_{q=p}^n \frac{q}{p} H_{p,q}(z).}$$
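
As a sanity check, the boxed formula can be compared with direct enumeration for small $n$ and $m$. The following is a minimal Python sketch using exact rational arithmetic; the names `brute_force`, `poly_mul`, `L_coeff` and `exact_average` are chosen here for illustration, and the code assumes $n \ge 1$ and the convention that faces which never occur are ignored.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial


def brute_force(n, m):
    """Average of max/min count over all m**n sequences (n >= 1)."""
    total = Fraction(0)
    for rolls in product(range(m), repeat=n):
        counts = Counter(rolls).values()  # only faces that occurred
        total += Fraction(max(counts), min(counts))
    return total / m**n


def poly_mul(a, b, n):
    """Multiply two coefficient lists, dropping terms above degree n."""
    out = [Fraction(0)] * (n + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > n:
                break
            out[i + j] += ai * bj
    return out


def L_coeff(n, m, p, q):
    """[z^n] (1 + sum_{r=p}^{q} z^r / r!)^m; an empty sum (p > q) is 0."""
    base = [Fraction(1)] + [Fraction(0)] * n
    for r in range(p, min(q, n) + 1):
        base[r] += Fraction(1, factorial(r))
    result = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(m):
        result = poly_mul(result, base, n)
    return result[n]


def exact_average(n, m):
    """n!/m^n [z^n] sum_{p<=q} (q/p) H_{p,q}(z), with H by inclusion-exclusion."""
    total = Fraction(0)
    for p in range(1, n + 1):
        for q in range(p, n + 1):
            h = (L_coeff(n, m, p, q) - L_coeff(n, m, p + 1, q)
                 - L_coeff(n, m, p, q - 1) + L_coeff(n, m, p + 1, q - 1))
            total += Fraction(q, p) * h
    return Fraction(factorial(n), m**n) * total


if __name__ == "__main__":
    print(exact_average(3, 2), brute_force(3, 2))
```

For $n=3$, $m=2$ both routes give $7/4$: two of the eight sequences have ratio $1$ and the other six have ratio $2$.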

Introducing

$$L_{p,q}(z) = \left(1+ \sum_{r=p}^{q} \frac{z^r}{r!}\right)^m$$

we thus have

$$\frac{n!}{m^n} [z^n] \sum_{p=1}^n \sum_{q=p}^n \frac{q}{p} (L_{p,q}(z) -L_{p+1,q}(z)-L_{p,q-1}(z)+L_{p+1, q-1}(z)).$$

This is

$$\frac{n!}{m^n} [z^n] \sum_{p=1}^n \sum_{q=p}^n L_{p,q}(z)\left(\frac{q}{p} -\frac{q}{p-1}-\frac{q+1}{p}+\frac{q+1}{p-1}\right)^*$$

where the star indicates that those terms with $p-1=0$ or $q+1=n+1$ do not contribute. We also have for $p\lt q$

$$[z^n] L_{p,q}(z) = [z^n] \sum_{k=0}^m {m\choose k} \left(1+ \sum_{r=p+1}^{q} \frac{z^r}{r!}\right)^{m-k} \left(\frac{z^p}{p!}\right)^k \\ = \sum_{k=0}^m {m\choose k} [z^{n-pk}] \frac{1}{p!^k} \left(1+ \sum_{r=p+1}^{q} \frac{z^r}{r!}\right)^{m-k}.$$

Furthermore we obtain for $p=q$

$$[z^n] L_{q,q}(z) = [z^n] \left(1+ \frac{z^q}{q!}\right)^m = [[n \bmod q\equiv 0]] \times {m\choose n/q} \frac{1}{q!^{n/q}}.$$
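
The two coefficient formulas above translate directly into a memoized recursion. Here is a minimal Python sketch, assuming $n \ge 1$; the names `lcf` and `xx` are illustrative, and the coefficient `cf` follows the starred formula, dropping the terms with $p-1=0$ or $q+1=n+1$.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def lcf(n, m, p, q):
    """[z^n] (1 + sum_{r=p}^{q} z^r / r!)^m for 1 <= p <= q."""
    if n < 0:
        return Fraction(0)
    if p == q:
        # (1 + z^q/q!)^m only has powers of z that are multiples of q.
        if n % q != 0:
            return Fraction(0)
        k = n // q
        return Fraction(comb(m, k), factorial(q) ** k)
    # Choose k of the m factors to contribute z^p/p!; the rest use r > p.
    return sum(
        (comb(m, k) * lcf(n - p * k, m - k, p + 1, q) / factorial(p) ** k
         for k in range(m + 1)),
        Fraction(0),
    )


def xx(n, m):
    """Exact average ratio for n rolls of an m-sided die (n >= 1)."""
    total = Fraction(0)
    for p in range(1, n + 1):
        for q in range(p, n + 1):
            cf = Fraction(q, p)
            if p > 1:
                cf -= Fraction(q, p - 1)
            if q < n:
                cf -= Fraction(q + 1, p)
            if p > 1 and q < n:
                cf += Fraction(q + 1, p - 1)
            total += cf * lcf(n, m, p, q)
    return Fraction(factorial(n), m**n) * total


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, float(xx(n, 6)))
```

Exact fractions keep the alternating signs in `cf` from causing cancellation error, at the cost of speed and memory as $n$ grows.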

With these we can implement a recursion, which in fact is not all that much faster than working with $H_{p,q}(z).$ This yields the following graph where we have used a six-sided die. We reach resource limits fairly quickly (esp. space) but we do have the exact values for up to $120$ rolls, which is completely impossible by enumeration ($94$ digits total case count). The convergence is very slow which tells us that we must deploy probabilistic methods to make progress with this problem.

4.5+
   |              HH
   +             H  HH
   +            H     H
   +            H      H
   |           H        H
   +                     H
  4+          H          H
   |                      H
   +         H             H
   +                        H
   +         H              H
   |                         H
   +                          H
3.5+        H                  H
   +                           HH
   |                             H
   +       H                      H
   +                              H
   +                               HH
   |      H                          H
  3+                                 HH
   +                                   HH
   +      H                             HH
   |                                      HH
   +                                       HH
   +     H                                   HH
2.5+                                          HHH
   |                                             HH
   +                                               HHH
   +    H                                             HHH
   |                                                     HHHH
   +                                                         HHHH
   +   H                                                         HHHHHH
  2+                                                                  HHHHHHH
   |                                                                        HHHHHHHH
   +                                                                                HHHHHHHHHH
   +                                                                                          H
   +  H
   |
   +
1.5+
   +  H
   |
   +
   +
   +
   |
  1+HH
  -+--+---+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--
   0              20             40             60             80             100            120

This was the Maple code.

with(combinat);

ENUM :=
proc(n, m)
option remember;
local rolls, res, ind, counts;
    res := 0;

    for ind from m^n to 2*m^n-1 do
        rolls := convert(ind, base, m);

        counts := map(mel->op(2, mel), 
                      convert(rolls[1..n], `multiset`));

        res := res + max(counts)/min(counts);
    od;

    res/m^n;
end;

L := (m, rmin, rmax) -> (1 + add(z^r/r!, r=rmin..rmax))^m;

X :=
proc(n, m)
    option remember;
    local H;

    H := (p, q) ->
    expand(L(m, p,q)
           - L(m, p+1,q) - L(m, p,q-1)
           + L(m, p+1,q-1));

    n!/m^n*
    coeff(add(add(q/p*H(p,q), q=p..n), p=1..n),
         z, n);
end;

LCF :=
proc(n,m,p,q)
    option remember;

    if n < 0 then return 0 fi;

    if p = q then
        if n mod q <> 0 then return 0 fi;
        return binomial(m,n/q)/q!^(n/q);
    fi;

    add(binomial(m,k)*1/p!^k*LCF(n-p*k, m-k, p+1, q),
        k=0..m);
end;

LVERIF :=
(m, p,q)  -> add(LCF(n, m, p, q)*z^n, n=0..q*m);

XX :=
proc(n, m)
    option remember;
    local res, p, q, cf;

    res := 0;

    for p to n do
        for q from p to n do
            cf := q/p
            - `if`(p>1, q/(p-1), 0)
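            # remaining terms follow the starred formula above
            - `if`(q<n, (q+1)/p, 0)
            + `if`(p>1 and q<n, (q+1)/(p-1), 0);

            res := res + cf*LCF(n, m, p, q);
        od;
    od;

    n!/m^n*res;
end;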

Addendum. A rather fascinating sequence appears when we compute, for a die with $m$ sides, the number of rolls $n$ that maximizes the average ratio between the most and least frequent counts. The sequence might well be linear, or it might not. Here it is for $m = 1, 2, \ldots, 18$:

$$1, 5, 9, 13, 16, 20, 24, 28, 33, 37, 41, 46, 50, \\ 55, 60, 64, 69, 74, \ldots $$

A linear fit to this sequence is given by

$$- 4.82352941176471806+ 4.27966976264189913\,m.$$

This pattern does seem to suggest the problem merits additional investigation. I hope these data and the conjecture as to suspected linearity will be a start.
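
The linear fit quoted above can be reproduced with an ordinary least-squares line. This is a minimal Python sketch, assuming the listed terms correspond to $m = 1, 2, \ldots, 18$; the helper `least_squares_line` is written here for illustration and is not part of the Maple session.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Maximizing number of rolls, copied from the sequence above.
maximizers = [1, 5, 9, 13, 16, 20, 24, 28, 33, 37, 41, 46, 50,
              55, 60, 64, 69, 74]
# Assumption: the k-th term belongs to a die with m = k sides.
sides = list(range(1, len(maximizers) + 1))

slope, intercept = least_squares_line(sides, maximizers)

if __name__ == "__main__":
    print(slope, intercept)
```

Under that indexing the slope and intercept agree with the quoted fit, which supports reading the sequence as starting at $m=1$.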

MX :=
proc(m)
    option remember;
    local n, cur, nxt;

    if m = 1 then return 1 fi;

    n := 1;
    cur := XX(1, m);

    do
        nxt := XX(n+1, m);

        if cur > nxt then
            break;
        fi;

        n := n+1;
        cur := nxt;
    od;

    n;
end;

Remark. There is a better recurrence at the following MSE link.

  • 1
    Very nice! (+1) 2017-01-27