I am sometimes puzzled by the fact that the same result can have two different probabilities depending on how the experiment was designed. A typical example is the well-known problem of drawing a white marble in 5 draws. Let $p$ be the probability of drawing a white marble. If I decide to draw 5 marbles, the number of white marbles is binomial and the probability of drawing exactly one white is $P_{\text{bin}}(\text{one white}) = 5p(1-p)^4$. If, instead, I decide to draw marbles until I get a white one, the number of draws is geometric and the probability of getting the white marble at the fifth draw is $P_{\text{geo}}(\text{one white}) = p(1-p)^4$, hence it is 5 times smaller than the binomial probability.

I understand the mathematics behind it, and sometimes I convince myself that it is just a matter of which possible results are considered (in the binomial case one has to draw 5 marbles, in the geometric case not, and so on), but other times I am not convinced that the results are consistent.

My question: does anybody know where I can find a scholarly discussion on this topic?

2 Answers


Funny you should ask that because these issues were just discussed in my statistics class this morning. Let's think about what is happening which will help us perceive the subtle difference between the binomial and the geometric distributions.

Binomial

We have a sequence $A = (A_1, \dots, A_5)$ of five draws, where each $A_i \in \{\text{W}, \text{Q} \}$; here $\text{W}$ denotes a white marble and $\text{Q}$ denotes a marble of any other color.

In the first case, the random variable is the number of white marbles in a draw of $n = 5$ marbles. There are $\binom{5}{1}=5$ outcomes with exactly one white marble, i.e. the white marble can fall in the first, second, third, fourth, or fifth draw. Each of these five outcomes has probability $(1-p)^4p$, where $p$ is the probability of drawing a white marble. Thus, the probability of exactly one white marble is the sum of the probabilities of the outcomes with exactly one white marble among the five.

$$P(\text{one white marble in five}) = \sum_{k=1}^{5}{(1-p)^4p} = 5(1-p)^4p$$
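As a sanity check on the sum above, here is a minimal sketch that enumerates every sequence of five draws and adds up the probabilities of those with exactly one white marble. The function names and the value $p = 3/10$ are illustrative assumptions, not part of the original problem.

```python
from fractions import Fraction
from itertools import product


def sequence_probability(seq, p):
    """Probability of one specific sequence of independent draws."""
    prob = Fraction(1)
    for marble in seq:
        prob *= p if marble == "W" else 1 - p
    return prob


def prob_exactly_one_white(p, n=5):
    """Sum over all length-n sequences that contain exactly one 'W'."""
    return sum(
        sequence_probability(seq, p)
        for seq in product("WQ", repeat=n)
        if seq.count("W") == 1
    )


p = Fraction(3, 10)  # assumed value, chosen only for illustration
enumerated = prob_exactly_one_white(p)
closed_form = 5 * (1 - p) ** 4 * p
print(enumerated, closed_form)
```

Exact fractions are used so that the enumerated sum and the closed form can be compared without floating-point noise.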

Geometric

Now let's analyze the second case you describe, which is subtly different. Here, we are not interested in the probability that, given $n$ trials, I will find exactly one white marble. Instead, the random variable $X$ is the number of trials it takes until a white marble is found. More concretely, we are measuring the probability that there are no white marbles in the first $x-1$ picks and then a white marble in the $x$th pick. The probability of there being no white marbles in the first $x-1$ picks is $(1-p)^{x-1}$, and the probability of there being a white marble in the $x$th pick is $p$. Thus

$$P(X=x) = (1-p)^{x-1}p$$
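The formula can be evaluated directly. The sketch below (the name `geometric_pmf` and the value $p = 3/10$ are assumptions made for illustration) computes $P(X=5)$ and also checks that the probabilities for $x = 1, \dots, 5$ add up to $1-(1-p)^5$, the probability of seeing at least one white marble in five picks.

```python
from fractions import Fraction


def geometric_pmf(x, p):
    """P(X = x): no white marble in the first x-1 picks, then a white one."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    return (1 - p) ** (x - 1) * p


p = Fraction(3, 10)  # assumed value, chosen only for illustration
first_white_on_fifth = geometric_pmf(5, p)
white_within_five = sum(geometric_pmf(x, p) for x in range(1, 6))
print(first_white_on_fifth, white_within_five)
```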

What is the difference?

The random variables are simply different. We are answering two different questions.

The Punch Line

We know the probability of the first white marble appearing on the $n$th trial. We also know the probability of observing exactly $1$ white marble in $n$ trials.

With $n = 5$, the first white marble is five times less likely to appear on the 5th trial than exactly one white marble is to appear somewhere in the five trials. This makes sense.

The white marble can appear in any of our five picks. The binomial distribution cares about the likelihood that the single white marble appears in ANY of the five picks, i.e. any of the configurations in which exactly one white marble can appear in a sequence of five marbles. The geometric distribution cares about the likelihood that the white marble appears in ONLY ONE of the five configurations that the binomial distribution accounts for, namely, the last one.
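One way to make the factor of five explicit is a short conditional-probability calculation. The labels $G$ (first white marble on the fifth pick) and $B$ (exactly one white marble in five picks) are introduced here only for illustration. Since $G$ is one of the configurations making up $B$, we have $G \cap B = G$, so

$$P(G \mid B) = \frac{P(G \cap B)}{P(B)} = \frac{P(G)}{P(B)} = \frac{(1-p)^4p}{5(1-p)^4p} = \frac{1}{5}, \qquad 0 < p < 1.$$

In words: given that exactly one white marble appeared in five picks, it is equally likely to have been in any of the five positions, whatever the value of $p$.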

I hope this made it a little clearer to you.

  • Thank you. As I said, I understand this. I was thinking about the fact that the experiment and result are identical, and to an external observer it would look quite wrong. The only difference is the stopping rule adopted. (2017-02-08)
  • Hence see the bottom of my answer. (2017-02-08)

It's not how the experiment is designed.   Those are simply two different events in the same process.

The process is drawing marbles from an urn with replacement and without bias (stirring between draws too), so that each draw independently yields a white marble with the same probability $p$, for an indefinite number of draws.

The geometric event is that the fifth draw is the first white marble.

The binomial event is that only one white marble will be among the first five draws.

These are just not the same thing. The geometric event covers one fifth as many equally probable outcomes as the binomial event; it is a subset.

$$\begin{aligned} &\bullet\bullet\bullet\bullet\circ\cdots \gets\\ &\bullet\bullet\bullet\circ\bullet\cdots\\ &\bullet\bullet\circ\bullet\bullet\cdots\\ &\bullet\circ\bullet\bullet\bullet\cdots \\ &\circ\bullet\bullet\bullet\bullet\cdots \end{aligned}$$
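The subset relation can also be checked empirically. The following is a rough simulation sketch, not a proof: it looks at the first five draws of the same process many times and counts how often each event occurs. The function name, the seed, the number of runs, and the value $p = 0.3$ are all assumptions chosen for illustration.

```python
import random


def simulate_first_five_draws(p, runs, seed=0):
    """Count both events over the first five draws of the same process.

    Returns (frequency of geometric event, frequency of binomial event,
    number of runs where the geometric event occurred without the binomial one).
    """
    rng = random.Random(seed)
    geometric_hits = 0
    binomial_hits = 0
    subset_violations = 0
    for _ in range(runs):
        draws = [rng.random() < p for _ in range(5)]  # True means white
        first_white_is_fifth = draws == [False, False, False, False, True]
        exactly_one_white = sum(draws) == 1
        geometric_hits += first_white_is_fifth
        binomial_hits += exactly_one_white
        if first_white_is_fifth and not exactly_one_white:
            subset_violations += 1
    return geometric_hits / runs, binomial_hits / runs, subset_violations


geo_freq, bin_freq, violations = simulate_first_five_draws(0.3, 200_000)
print(geo_freq, bin_freq, violations)
```

Under these assumptions the two frequencies should land near $p(1-p)^4$ and $5p(1-p)^4$ respectively, and the violation count should be zero, since every run in which the fifth draw is the first white marble also has exactly one white marble among the first five.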

That is all.

  • Thanks Graham, see my comment above. (2017-02-08)