
Let's say we have an unfair die whose initial (estimated) probability of rolling a specific number is 1/6. How should that estimate be updated each time the experiment produces this specific number? And how should it change when a different number comes up (a failure)?

Thanks.

2 Answers

3

You should not start with a point estimate but with a prior distribution, which we might call $f(p)$ where $p$ is the unknown probability.

For example, suppose you are trying to roll a 6. You might get $x$ successes and $y$ failures. The likelihood of this is proportional to $p^x (1-p)^y$. So your posterior distribution is

$f(p|x,y) = \frac{p^x (1-p)^y f(p)}{\int p^x (1-p)^y f(p) \, dp}$ with the integral taken over the possible range of $p$, which is typically $[0,1]$ for a probability.
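For a prior with no convenient closed form, the posterior above can be approximated numerically. The following is only a minimal sketch: the names `grid_posterior` and `flat_prior`, the grid size, and the counts $x=3$, $y=7$ are all assumptions made for illustration. It replaces the integral in the denominator with a sum over an evenly spaced grid on $[0,1]$.

```python
def grid_posterior(prior, x, y, n=1001):
    """Approximate f(p | x, y) on an evenly spaced grid over [0, 1].

    Returns the grid and the posterior probability mass at each grid point.
    """
    grid = [i / (n - 1) for i in range(n)]
    # Unnormalised posterior: likelihood p^x (1-p)^y times the prior f(p).
    weights = [p**x * (1 - p)**y * prior(p) for p in grid]
    # The sum stands in for the integral in the denominator.
    total = sum(weights)
    return grid, [w / total for w in weights]


def flat_prior(p):
    """Illustrative prior: every p in [0, 1] is equally plausible."""
    return 1.0


# Hypothetical data: 3 successes and 7 failures.
grid, post = grid_posterior(flat_prior, x=3, y=7)
posterior_mean = sum(p * w for p, w in zip(grid, post))
print(round(posterior_mean, 3))
```

Any non-negative function of $p$ can be passed as `prior`; a finer grid gives a closer approximation at the cost of more arithmetic.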

If you start with a prior distribution $f(p)=\text{Beta}(\alpha,\beta)$, then the posterior distribution is $f(p|x,y)=\text{Beta}(\alpha+x,\beta+y)$. This is called a conjugate prior, and assuming it makes the mathematics easier. The expected value of a Beta distribution $\text{Beta}(\alpha,\beta)$ is $\frac{\alpha}{\alpha+\beta}$, so you might decide to start with $\beta = 5 \alpha$, which gives a prior mean of $\frac{1}{6}$, and set $\alpha$ according to how confident you are at the start that $p$ is close to $\frac{1}{6}$: the larger $\alpha$ is, the more slowly the data move the estimate. The expected value of your posterior distribution is then $\frac{\alpha+x}{\alpha+x+\beta+y}$.
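To see how the estimate moves roll by roll, here is a small sketch of the conjugate update. The function names and the starting values $\alpha=2$, $\beta=10$ (chosen so that $\beta = 5\alpha$) are assumptions for illustration, as is the sequence of outcomes.

```python
def beta_update(alpha, beta, success):
    """Return the Beta parameters after observing one roll."""
    if success:
        return alpha + 1, beta  # one more success: x grows by 1
    return alpha, beta + 1      # one more failure: y grows by 1


def beta_mean(alpha, beta):
    """Expected value of Beta(alpha, beta)."""
    return alpha / (alpha + beta)


# Assumed prior: beta = 5 * alpha, so the prior mean is 1/6.
alpha, beta = 2.0, 10.0
means = [beta_mean(alpha, beta)]

# Hypothetical outcomes: two successes, then one failure.
for success in [True, True, False]:
    alpha, beta = beta_update(alpha, beta, success)
    means.append(beta_mean(alpha, beta))

print([round(m, 4) for m in means])  # [0.1667, 0.2308, 0.2857, 0.2667]
```

Each success pulls the expected value up and each failure pulls it down. With a larger starting $\alpha$ (say $\alpha=20$, $\beta=100$) the same three rolls would move it much less.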

0

The answer provided by Henry highlights some fascinating statistical theory; for another pertinent theoretical link, see Wikipedia's article Categorical distribution.

Here we want to proceed pragmatically, but in the end the inquisitive student of statistics should be motivated to study the theoretical underpinnings.

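OK, how much deviation do we think is possible? Now $\frac{1}{6} = 0.1666...$, and if we imagine that each face has already come up $1$, $2$ or $3$ times, then the face shown by one further roll gets a frequency of

$\frac{2}{7} = 0.2857...$
$\frac{3}{13} = 0.2307...$
$\frac{4}{19} = 0.2105...$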

Stop! We don't have that much confidence in our die - if a face came up $21$% of the time we would not be surprised.

OK, now imagine throwing a die $18$ times, with each side coming up exactly $3$ times. On the $19$th throw, whichever face comes up would be assigned the statistical frequency $\frac{4}{19}$.

So there you have the simple idea of how to update the discrete uniform distribution as each outcome is observed. You make believe that the first real observation is the $19$th reading, the second real observation is the $20$th, etc., where the first eighteen readings are imagined to have shown each of the six sides exactly $3$ times.

Example: The first (real) roll of the die shows a $2$. Updates:

$p(1) = \frac{3}{19}$
$p(2) = \frac{4}{19}$
$p(3) = \frac{3}{19}$
$p(4) = \frac{3}{19}$
$p(5) = \frac{3}{19}$
$p(6) = \frac{3}{19}$
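The bookkeeping above can be sketched in a few lines. The helper names `update_counts` and `frequencies` are made up for this illustration; the $3$ imagined occurrences per face come from the reasoning above. Exact fractions are used so the output matches the table.

```python
from fractions import Fraction


def update_counts(counts, face):
    """Return a copy of the counts with one more occurrence of `face`."""
    new_counts = dict(counts)
    new_counts[face] += 1
    return new_counts


def frequencies(counts):
    """Convert counts into relative frequencies as exact fractions."""
    total = sum(counts.values())
    return {face: Fraction(n, total) for face, n in counts.items()}


# Eighteen imagined throws: each of the six faces shown 3 times.
counts = {face: 3 for face in range(1, 7)}

# The first real roll shows a 2.
counts = update_counts(counts, 2)
probs = frequencies(counts)

for face, p in probs.items():
    print(f"p({face}) = {p}")
```

Note that `Fraction` reduces automatically, so after a later roll a value such as $\frac{4}{20}$ would print as `1/5`. Starting from more imagined throws per face expresses more confidence that the die is fair, and starting from fewer lets the real data dominate sooner.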