5

Consider a candy bag that contains $N=100$ candies. There are only two types of candy in the bag, say caramel and chocolate. Nothing more is known about the contents of the bag.
Now, you draw one candy at a time at random from the bag until the first caramel appears. Suppose that the first caramel appears on the $k=7$th draw.

At this moment, what can we say about the number of caramel candies in the bag?

3 Answers

5

The question you are asking here is the classical question of inferential statistics: "Given the outcome of an experiment, what can be said about the underlying probability distribution?"

You could, for example, give an estimator for the unknown quantity "number of caramels" (called $a$ from here on). The one most often used (since it's easy to calculate) would be the maximum likelihood estimator, where you estimate $a$ to be the value that maximizes the probability of the observed outcome.

In this case, you'd choose $a$ to maximize $P_a(7)$ (the probability of drawing the first caramel on the seventh draw, assuming there are $a$ of them). A little Excel calculation, using Isaac's formula for $P_a(7)$, gives the estimate $a=14$.
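
For anyone who would rather not use a spreadsheet, here is a minimal Python sketch of the same maximization. It assumes draws without replacement, as in Isaac's answer; the names `prob_first_caramel_at`, `likelihood` and `mle` are my own and purely illustrative.

```python
from fractions import Fraction

N = 100  # candies in the bag
K = 7    # draw on which the first caramel appeared


def prob_first_caramel_at(k, a, n=N):
    """P(first caramel on draw k | a caramels among n candies), no replacement."""
    prob = Fraction(1)
    # The first k-1 draws must all be chocolate.
    for i in range(k - 1):
        prob *= Fraction(n - a - i, n - i)
    # Draw k must be a caramel; n-(k-1) candies are left at that point.
    return prob * Fraction(a, n - (k - 1))


# Likelihood of the observation for every possible caramel count a = 0..100.
likelihood = {a: prob_first_caramel_at(K, a) for a in range(N + 1)}

# The maximum likelihood estimate is the a with the largest likelihood.
mle = max(likelihood, key=likelihood.get)
print(mle)  # 14
```

Exact fractions are used only to avoid any doubt about rounding; floats would give the same maximizer here.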

To judge what this result is worth, you'd need to calculate the mean squared error of this estimator, which is not as easily done.

If you already had a hypothesis about $a$ (say $a<20$), you could also use your experimental result to test it with statistical hypothesis testing.

  • 0
    I, too, don't know the exact figure, which is why I put "correct" in quotes, but it should be less than 100/7. I estimated because the exact figures got messy. 2010-10-13
1

If the number of caramel candies is $a$, then the probability that the first 6 drawn will not be caramel and the 7th drawn will be caramel (assuming that we do not put back the drawn candies) is $P(\text{7th}|a)=\frac{93!(100-a)(99-a)(98-a)(97-a)(96-a)(95-a)a}{100!}$. Now, given that this has occurred, the probability $P(a|\text{7th})$ of any particular value of $a$ given that the first caramel is the 7th drawn should be $P(\text{7th}|a)$ for that particular $a$ divided by the sum of all possible $P(\text{7th}|a)$. $\sum_{a=0}^{100}P(\text{7th}|a)=\frac{101}{56}$, so $P(a|\text{7th})=\frac{P(\text{7th}|a)}{\frac{101}{56}}=\frac{56\cdot 93!(100-a)(99-a)(98-a)(97-a)(96-a)(95-a)a}{101!}.$

Bashing out some values and adding things up, the probability that $a\le 19$ is slightly less than 50% (49.673%) and the expected value of $a$ is $\frac{65}{3}=21\frac{2}{3}$.
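
The "bashing out" can be reproduced with a short script. This is only a sketch under the same assumptions as above: drawing without replacement, and every caramel count from 0 to 100 equally likely beforehand. The names `likelihood`, `posterior` and `mean` are illustrative.

```python
from fractions import Fraction

N, K = 100, 7  # bag size, draw on which the first caramel appeared


def likelihood(a):
    """P(first caramel on draw K | a caramels), drawing without replacement."""
    prob = Fraction(a, N - K + 1)  # draw K is a caramel
    for i in range(K - 1):         # draws 1..K-1 are chocolate
        prob *= Fraction(N - a - i, N - i)
    return prob


# With a uniform prior the prior cancels, so the posterior is the
# likelihood divided by its sum over all a.
total = sum(likelihood(a) for a in range(N + 1))
posterior = [likelihood(a) / total for a in range(N + 1)]

mean = sum(a * p for a, p in enumerate(posterior))

print(total)                       # 101/56
print(float(sum(posterior[:20])))  # P(a <= 19), reported above as about 49.673%
print(mean)                        # 65/3
```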


edit: (I've slightly altered my original answer above, mostly in the notation, to better accommodate the work below; I believe that the work above assumed that, without knowing how long it took to draw the first caramel, each possible number of caramels was equally likely.)

Suppose that $P(a)$ is the probability that there are $a$ caramels. As above, for any particular value of $a$, the probability $P(\text{7th}|a)$ that the first caramel drawn is the 7th candy drawn is $P(\text{7th}|a)=\frac{93!(100-a)(99-a)(98-a)(97-a)(96-a)(95-a)a}{100!}.$ So, the probability that there are $a$ caramels and that the first caramel drawn is the 7th candy drawn is $P(a\text{ and 7th})=P(a)\cdot P(\text{7th}|a)$. By Bayes's Theorem: $\begin{align} P(a|\text{7th})&=\frac{P(a\text{ and 7th})}{P(\text{7th})}=\frac{P(a\text{ and 7th})}{\sum_{k=0}^{100}P(k\text{ and 7th})} \\ &=\frac{P(a)P(\text{7th}|a)}{\sum_{k=0}^{100}P(k)P(\text{7th}|k)} \end{align}$

Now, if $P(a)=\frac{1}{101}$ for all $a$ from 0 to 100, this yields the results in my original answer. If $P(a)={100 \choose a}\frac{1}{2^{100}}$ (a binomial distribution, with caramel and chocolate equally likely for each individual candy when the bag is originally filled), the expected value of $a$ is 47.5.

If $P(a)={100\choose a}p^a(1-p)^{100-a}$ (a binomial distribution where the probability of each single candy being caramel is $p$ when the bag is originally filled), the expected value of $a$ is $1+93p$. If this expected value of $a$ given that the first caramel drawn was the 7th candy drawn is to equal the expected value of $a$ without having drawn any candies, which is $100p$, then $p=\frac{1}{7}$, so the expected value of $a$ is $\frac{100}{7}=14\frac{2}{7}$.
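
As a check on the $1+93p$ formula, here is a small sketch that evaluates the Bayes sum directly for a binomial prior. The function names `likelihood` and `posterior_mean` are illustrative, and exact fractions are used so the comparison is not blurred by rounding.

```python
from fractions import Fraction
from math import comb

N, K = 100, 7  # bag size, draw on which the first caramel appeared


def likelihood(a):
    """P(first caramel on draw K | a caramels), drawing without replacement."""
    prob = Fraction(a, N - K + 1)  # draw K is a caramel
    for i in range(K - 1):         # draws 1..K-1 are chocolate
        prob *= Fraction(N - a - i, N - i)
    return prob


def posterior_mean(p):
    """E[a | first caramel on draw K] under a Binomial(N, p) prior on a."""
    weights = [
        comb(N, a) * p**a * (1 - p) ** (N - a) * likelihood(a)
        for a in range(N + 1)
    ]
    return sum(a * w for a, w in enumerate(weights)) / sum(weights)


print(posterior_mean(Fraction(1, 2)))  # 95/2
print(posterior_mean(Fraction(1, 7)))  # 100/7
```

Both values agree with $1+93p$, which makes sense: under this prior the 93 undrawn candies are still independent, each caramel with probability $p$, and exactly one caramel has been drawn.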

  • 0
    @Jens: Fair enough. I've added a substantial bit more to my answer to try to address the issue. 2010-10-13
0

Since 100 is quite a bit larger than 7, we will assume for the purposes of calculation that the probability of drawing a given type of sweet is not affected by the previous draws (of course it is altered a little, but not by much).

Let the probability of drawing a caramel be $p$ and the probability of drawing a chocolate be $q,$ where $p+q=1.$

Then the expected number of draws to draw a caramel is

$E= p+2pq+3pq^2+4pq^3+ \cdots = \frac{1}{p}.$
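
In case the closed form is not familiar, it follows from differentiating the geometric series term by term (valid for $0\le q<1$):

$\begin{align} E&=p\sum_{n=1}^{\infty}nq^{n-1}=p\,\frac{d}{dq}\sum_{n=0}^{\infty}q^{n}=p\,\frac{d}{dq}\left(\frac{1}{1-q}\right) \\ &=\frac{p}{(1-q)^2}=\frac{p}{p^2}=\frac{1}{p} \end{align}$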

Since we drew the first caramel on the seventh draw, set $7=1/p$, so $p=1/7$. Thus we estimate that about $100/7 \approx 14$ of the sweets are caramel.

If we used the correct figures the probability of drawing a caramel at the next draw would go up with each chocolate drawn, which will push the estimated number of caramels down, perhaps to 13, but I don't expect it will alter much since we started with 100 sweets.
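
To see roughly how much the correction matters, here is a sketch that computes the exact expected draw number of the first caramel without replacement, then picks the caramel count whose expectation is closest to 7. This is the same "match the expected value" idea as above, just with the exact probabilities; the names `expected_first_caramel_draw` and `best` are illustrative.

```python
from fractions import Fraction

N, TARGET = 100, 7  # bag size, observed draw number of the first caramel


def expected_first_caramel_draw(a, n=N):
    """Exact mean draw number of the first caramel, drawing without replacement."""
    mean = Fraction(0)
    no_caramel_yet = Fraction(1)  # P(the first k-1 draws are all chocolate)
    for k in range(1, n - a + 2):  # the first caramel comes by draw n-a+1 at the latest
        remaining = n - (k - 1)    # candies still in the bag before draw k
        mean += k * no_caramel_yet * Fraction(a, remaining)
        no_caramel_yet *= Fraction(remaining - a, remaining)
    return mean


# Caramel count whose expected first-caramel draw is closest to the observed 7.
best = min(
    range(1, N + 1),
    key=lambda a: abs(expected_first_caramel_draw(a) - TARGET),
)
print(best)  # 13
```

With 13 caramels the expected draw is $101/14\approx 7.21$, and with 14 it is $101/15\approx 6.73$, so the estimate does drop slightly, to about 13.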

The earlier the first caramel appears, the less can be reliably said. Imagine that you drew a caramel on the first draw: that could suggest they are all caramels, which could be wildly wrong.