11

This question was asked on a forum, and while many argued that the answer is 14 (since the probability of seeing bus 14 is maximal in that case), I argued that they were working backwards. My claim is that the question is invalid, as there is no way to determine the probability of any given number of bus routes.

I'm looking for clarification as to the right answer (with proof, obviously).

  • 1
    As a school prank you release 3 greased pigs into the school with the numbers 1, 2, and 4 painted in bold letters on their backs. What is the most likely number of pigs from the point of view of the school administrator? 2011-07-06

2 Answers

20

Your question touches on fundamental issues of the interpretation of probabilities. You won't get a proof, at least not in the mathematical sense, since this is not a mathematical question but an interpretational question.

Basically, there are two popular interpretations of probability theory, the frequentist one and the Bayesian one. In the frequentist interpretation, probabilities specify the relative frequencies of an event if you perform the "same" experiment many times. Obviously you can't create a large number of moons and count the number of bus routes on each, or, as you put it, "there is no method to determine the probability of number of bus routes", and thus in this interpretation there is no such thing as that probability. As William Feller put it:

There is no place in our system for speculations concerning the probability that the sun will rise tomorrow.

In the Bayesian interpretation, probability theory allows us to reason about uncertain events, and more specifically to rationally update our assessments of how likely events are when new information comes in. In this framework, you always need some prior assessment of likelihoods, and then the theory tells you how to adjust that using the data you observe.

For some sorts of events, such as rolling a die, there are rational grounds for choosing prior probabilities (the same probability for each number). In other cases, such as bus routes on the moon, there isn't one obvious set of prior probabilities, but still reasoning about how prior assessments of likelihood should rationally be modified by incoming data can be useful.

In the present case, any prior assessment of the likelihood of various numbers of bus routes on the moon would presumably have exhibited a very dominant spike at $0$ and then a very low and rather flat tail for all other numbers. The Bayesian probability update requires us to multiply the a priori probability for each possible number of bus routes by the conditional probability that you would have observed a route $14$ bus if there were that many bus routes, and then normalize the resulting probabilities to $1$ to obtain the a posteriori probabilities. If we follow the assumption you seem to be making in the question, that the bus routes are numbered sequentially beginning with $1$ and we have an equal probability of encountering a bus from any one of the existing bus routes, then the conditional probability of observing a route $14$ bus given $n$ bus routes is zero for $0$ to $13$ bus routes and $1/n$ for $n$ bus routes if $n\ge14$.
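The update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the prior (a spike of 99% at $0$ and a flat tail cut off at a hypothetical `N_MAX = 200`) is made up for the example, and the names `posterior`, `SPIKE` and `N_MAX` are not from the question or the answer.

```python
from fractions import Fraction

def posterior(prior, observed):
    """Bayesian update after seeing one bus on route `observed`.

    prior: dict mapping a number of routes n to its prior probability.
    Likelihood: 0 if n < observed, else 1/n (routes numbered 1..n,
    each equally likely to be the one encountered).
    """
    unnormalized = {}
    for n, p in prior.items():
        likelihood = Fraction(1, n) if n >= observed else Fraction(0)
        unnormalized[n] = p * likelihood
    total = sum(unnormalized.values())
    return {n: w / total for n, w in unnormalized.items()}

# Assumed prior, for illustration only: dominant spike at 0, flat tail.
N_MAX = 200                    # hypothetical cutoff for the tail
SPIKE = Fraction(99, 100)      # hypothetical prior mass on "no bus routes"
tail = (1 - SPIKE) / N_MAX
prior = {0: SPIKE, **{n: tail for n in range(1, N_MAX + 1)}}

post = posterior(prior, 14)
most_likely = max(post, key=post.get)
print(most_likely)             # number of routes with the highest posterior
```

With a flat tail the posterior is proportional to $1/n$ for $n\ge14$, so the choice of cutoff changes the normalization but not which $n$ comes out on top.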

Now although there is no way to agree on any particular prior in the present case, it seems rather plausible that apart from the dominant spike at $0$, the prior would have been relatively flat. That is, there was no strong a priori reason to favour, say, the number $15$ over the number $14$, and so the ratio of the prior probability for $15$ bus routes to the prior probability for $14$ bus routes would not have exceeded $15/14$. That's the ratio by which the observation of a route $14$ bus raises the probability for $14$ bus routes relative to the probability for $15$ bus routes. So although there's no proof and no one right answer, we can nevertheless plausibly argue that most sensible priors would not favour any number $n>14$ over the number $14$ by a factor of more than $n/14$, and thus, in a Bayesian framework, the number of bus routes with the highest a posteriori probability would be $14$.
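The ratio argument above can be written out explicitly. Writing $P(n)$ for the prior probability of $n$ bus routes (notation assumed here for illustration) and using the $1/n$ likelihood for $n\ge14$:

$$\frac{P(n\mid\text{route }14)}{P(14\mid\text{route }14)}=\frac{P(n)\cdot\frac{1}{n}}{P(14)\cdot\frac{1}{14}}=\frac{P(n)}{P(14)}\cdot\frac{14}{n},\qquad n\ge14.$$

So $14$ remains the most probable value unless some $n>14$ has $P(n)/P(14)>n/14$; for $n=15$ that threshold is $15/14\approx1.07$.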

  • 0
    @joriki: Wikipedia suggests that there is a valid (and successful) frequentist interpretation of the [German tank problem](http://en.wikipedia.org/wiki/German_tank_problem) mentioned in ShreevatsaR's comment, of which I also see this question as a special case. Can you comment on this apparent discrepancy? 2011-07-06
-5

I would have thought that on average your best guess is 28. I think the question does have an answer. Maybe it depends on the spread of which amounts of buses could be used (probably exponential - so the same likelihood in between 10 and 20 as between 100 and 200). But actually, I don't think that matters.

I think it's easy to overcomplicate the question. You don't question how someone flips a coin or how many sides a coin has or what surface it's landing on. Just because they mentioned the Moon doesn't mean the whole thing is completely hypothetical and ridiculous. Like is an even number of buses more likely?!

The game is this: I pick a thousand billion secret numbers. For each number N, I have to give you (the guesser) a random number between 1 and N. Then you have to guess N. What's the best strategy for you to use? We can measure your success based on number of correct guesses or based on average closeness.
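For anyone who wants to try the game numerically, here is a minimal sketch. It assumes the secret numbers are drawn uniformly from 1 to 100, which is purely an assumption for the example since the spread is left open above; the helper names `play`, `guess_same` and `guess_double` are made up. It scores a strategy both ways mentioned: number of exact hits and average closeness.

```python
import random

def play(secrets, strategy, rng):
    """Play one round per secret N: draw m uniformly from 1..N, guess N from m.

    Returns (number of exact hits, mean absolute error of the guesses).
    """
    hits = 0
    total_error = 0
    for n in secrets:
        m = rng.randint(1, n)          # the number shown to the guesser
        guess = strategy(m)
        hits += guess == n             # True counts as 1
        total_error += abs(guess - n)
    return hits, total_error / len(secrets)

def guess_same(m):
    """Guess that the number shown is the secret itself."""
    return m

def guess_double(m):
    """The doubling strategy: guess twice the number shown."""
    return 2 * m

# Assumed spread of secrets (uniform on 1..100); other spreads can be swapped in.
secrets = [random.Random(0).randint(1, 100) for _ in range(1)]  # placeholder, replaced below
rng = random.Random(0)
secrets = [rng.randint(1, 100) for _ in range(10_000)]

for name, strategy in [("same", guess_same), ("double", guess_double)]:
    hits, mean_error = play(secrets, strategy, random.Random(1))
    print(f"{name:>6}: exact hits = {hits}, mean |guess - N| = {mean_error:.2f}")
```

Swapping in a different list of `secrets` is a direct way to check whether the spread matters for either score.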

An interesting question would be whether it matters what secret numbers I pick (what spread, etc)? If you use my strategy of doubling, I don't think it does. Anyone want to play?

  • 0
    Please use answers to answer questions *only*. Notice that in this case, what you wrote does *not* answer the OP's question. 2011-09-05