3

This sounds like the kind of ethereal question that generally gets dropped from Stack Exchange sites, but I don't know of a better venue to ask, so I'm hoping this question will help other folks with a similar dilemma.

I recently posted this question: Probability of selecting a combination of two variables.

I have a bag of toys. 10% of the toys are balls. 10% of the toys are blue.

If I draw one toy at random, what're the odds I'll draw a blue ball?

One person provided an answer immediately and others suggested that more details were required before an answer could even be considered. But, there was a reason I asked this question the way that I did.

I was thinking about probabilities and I was coming up with a way to ask a more complicated question on math.stackexchange.com. I needed a basic example so I came up with the toys problem I posted here.

I wanted to run it by a friend of mine and I started by asking the above question the same way. When I thought of the problem, it seemed very clear to me that the question was "what is $\mathbb{P}(\text{blue} \cap \text{ball})$." I thought the calculation was generally accepted to be $$\mathbb{P}(\text{blue} \cap \text{ball}) = \mathbb{P}(\text{blue}) \cdot \mathbb{P}(\text{ball})$$

When I asked my friend, he said, "it's impossible to know without more information." I was baffled because I thought this is what one would call "a priori probability."

I remember taking statistics tests in high school with questions like "if you roll two dice, what're the odds of rolling a 7," "what is the probability of flipping a coin 3 times and getting three heads," or "if you discard one card from the top of the deck, what is the probability that the next card is an ace?"

Then, I met math.stackexchange.com and found that people tend to talk about "fair dice," "fair coins," and "standard decks." I always thought that was pedantic so I tested my theory with the question above and it appears you really need to specify that "the toys are randomly painted blue."

It's clear now that I don't know how to ask a question about probability.

  • Why do you need to specify that a coin is fair?
  • Why would a problem like this be "unsolvable?"
  • If this isn't an example of a priori probability, can you give one or explain why?
  • Why doesn't the Principle of Indifference allow you to assume that the toys were randomly painted blue?
  • Why is it that on math tests, you don't have to specify that the coin is fair or ideal but in real life you do?
  • Why doesn't anybody at the craps table ask, "are these dice fair?"
  • If this were a casino game that paid out 100 to 1, would you play?

This comment has continued being relevant so I'll put it in the post:

Here's a probability question I found online on a math education site: "A city survey found that 47% of teenagers have a part time job. The same survey found that 78% plan to attend college. If a teenager is chosen at random, what is the probability that the teenager has a part time job and plans to attend college?" If that was on your test, would you answer "none of the above" because you know the coincident rate between part time job holders and kids with college aspirations is probably not negligible or would you answer, "about 37%?"
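
For what it's worth, the two survey figures on their own only pin the joint probability down to a range. A sketch of the bounds (the Fréchet inequalities), writing $J$ for "has a part time job" and $C$ for "plans to attend college" (both symbols are my own labels, not the quiz's):

$$\max\bigl(0,\ \mathbb{P}(J) + \mathbb{P}(C) - 1\bigr) \le \mathbb{P}(J \cap C) \le \min\bigl(\mathbb{P}(J),\ \mathbb{P}(C)\bigr)$$

$$\max(0,\ 0.47 + 0.78 - 1) = 0.25 \le \mathbb{P}(J \cap C) \le 0.47$$

The "about 37%" answer, $0.47 \times 0.78 \approx 0.367$, is the one value in that range that corresponds to assuming the two traits are independent.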

  • 11
    At a Texas toy store, $10\%$ of the toys are guns, and $10\%$ of the toys are pink. What is the probability that if we draw a toy at random, we will draw a pink gun?2012-11-28
  • 1
    That's my point. I'd say that without further information you have a 1 in 100 chance of running across a pink gun: http://is.gd/wFED46 2012-11-28
  • 0
    Is this really a question? For example, in a math test "coin toss" always means fair unless otherwise stated, because it is tiresome to keep stating that the coin is fair throughout the section on probability. If you are trying to "make a point," as you say, this is not a discussion site, but a Q&A site.2012-11-28
  • 0
    Heh, yeah, I'm serious. I have a daughter to whom I'll eventually need to give sample statistics questions to help with her homework. In her math book, it'll say, "you flip a coin." On Stack Exchange, folks want to know, "is the coin fair?" I want to know why. I seriously don't understand how my original question is any different than "You have two 10 sided dice. One die is for shape and the other for color. There is one ball shape and one blue color. What're the odds of rolling a blue ball?" 2012-11-28
  • 0
    When we are trying to solve a "real" problem using tools from probability theory, the first step is to produce a mathematical *model* of the situation. The assumption of independence is physically reasonable when we model successive tosses of a coin. But assuming independence is not *always* reasonable.2012-11-28
  • 1
    In $P(\text{blue} \cap \text{ball}) = P(\text{blue}) \cdot P(\text{ball})$, you assume independence. It's not always true. 2012-11-28
  • 0
    @D.Patrick Your question is not the same as two die rolls, unless it is unknown whether the dice are somehow linked magnetically. If you asked: There are 100 toys in a bag, and 10 of them are blue and ten of them are balls, how many are blue balls? Would it be possible to answer definitively, or do you not have enough knowledge to know? But the probability of picking a blue ball is just that number divided by $100$. 2012-11-28
  • 0
    Also, your subject should probably match the introduction to your question. The question about "fair coins" only recurs much later, deep in the body of your question, which makes it very hard to figure out what your actual goal is. In particular, it would have been very nice to know up front that you were concerned about it essentially as a potential teacher, i.e., that it is a pedagogical question. 2012-11-28
  • 0
    @ThomasAndrews, I think you're getting at my confusion. I don't understand why you wouldn't just assume that the toys were randomly assigned a color. It strikes me as odd how fundamentally different people think the question is as written vs. "10% are randomly blue." If you were writing a math test for 9th grade math students, how would you write it? What about for college seniors? Grad students? At what age would you be annoyed that they kept asking, "are the toys randomly painted or did you paint them with some strategy?"2012-11-28
  • 0
    You wouldn't assume it because, unlike die rolls, which are independent by all measures that we have experience with, the toys in the bag are a fixed finite set of objects, and the conditions that you have placed on the problem allow the number of blue balls to be anywhere from $0$ to $10$. We can easily imagine examples where there are no blue balls in the bag. We can't easily imagine examples where the die rolls are similarly dependent; I suppose magnetically entangled dice might be an example. The contents of the bag are not "random." The random process is just picking a toy from the bag. 2012-11-28
  • 0
    I edited my original post appending a real life sample question that's similar to mine that I found on a math education website quiz. Perhaps it's the way we think that changes so dramatically as we get older?2012-11-28
  • 0
    By the way, I think it's worth noting that almost universally, everybody hates my original question but nobody has upvoted a single answer to this question. In fact, the only upvotes are one on the question itself and 4 for the comment, "At a Texas toy store, 10% of the toys are guns, and 10% of the toys are pink. What is the probability that if we draw a toy at random, we will draw a pink gun?" Of course, almost certainly this comment will change all of that, but it was true at the time. :)2012-11-28
  • 0
    The survey question is simply a bad question. Change *None of the above* to *Insufficient information*, and it becomes an adequate question, though not a particularly good one. Change it from multiple choice to free answer, and it becomes a perfectly reasonable question.2012-11-28
  • 0
    @AndréNicolas, +1 for the Texas store example... Yet a similar fraction of theoretical statistics results remain valid if independence or i.i.d assumption are dropped.2012-11-30

5 Answers

3

Being precise and specifying the conditions under which you are asking for a probability avoids ambiguity, and impacts the result!

In mathematics, it is important to avoid ambiguity in general, when that is possible, if and when you want an answer to the question you "meant" to ask.

Your blue ball question asked:

I have a bag of toys.
10% of the toys are balls.
10% of the toys are blue.

If I draw one toy at random, what're the odds I'll draw a blue ball?

I commented, asking for clarification, and posted my qualified answer, part of which stated:

"Note: we are assuming that "blueness" is uniformly distributed over all the toys. Otherwise, it may be the case that 10% of the toys are red balls, and 10% of the toys are blue blocks, in which case you have 0% probability that you'll draw a blue ball."

Note also that the answers to your question, in order to give a number at all, rely on the assumption that the qualities of "blueness" and of "ball" are independent of one another. This is rarely the case.
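
To make the range of possibilities concrete, here is a minimal sketch in Python, assuming a hypothetical bag of exactly 100 toys (the question only gives percentages, so the bag size and all function names are my own assumptions). It builds a bag with 10 balls and 10 blue toys for every feasible overlap and computes the chance of drawing a blue ball.

```python
from fractions import Fraction

BAG_SIZE = 100   # assumed; the question only gives percentages
NUM_BALLS = 10   # 10% of the toys are balls
NUM_BLUE = 10    # 10% of the toys are blue


def build_bag(blue_balls):
    """Return a list of (shape, color) toys containing `blue_balls` blue balls."""
    bag = [("ball", "blue")] * blue_balls
    bag += [("ball", "other")] * (NUM_BALLS - blue_balls)
    bag += [("other", "blue")] * (NUM_BLUE - blue_balls)
    # Fill the rest of the bag with toys that are neither balls nor blue.
    bag += [("other", "other")] * (BAG_SIZE - len(bag))
    return bag


def prob_blue_ball(bag):
    """Probability that one uniformly drawn toy is a blue ball."""
    hits = sum(1 for toy in bag if toy == ("ball", "blue"))
    return Fraction(hits, len(bag))


def all_feasible_probs():
    """Map each feasible number of blue balls to the resulting probability."""
    max_overlap = min(NUM_BALLS, NUM_BLUE)
    return {k: prob_blue_ball(build_bag(k)) for k in range(max_overlap + 1)}


if __name__ == "__main__":
    for k, p in all_feasible_probs().items():
        print(f"{k:2d} blue balls -> P(blue ball) = {p}")
```

Every value from 0 to 1/10 is consistent with the question as posed; only the bag with exactly one blue ball matches the $0.1 \times 0.1$ product.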


Without being precise, there are many possible answers, depending on the conditions for which an event is being "probabilized". Stipulating "fair die", "standard deck", and "uniformly distributed" all rule out the "but what if...?" questions.

  • 0
    That makes sense. So, on a math test, it's OK to just ask it the way I did because you don't have to assume the person asking the question has any additional information (i.e., there's no subsequent what if). Similarly in a casino, you assume that the gaming commission is regulating the dice. I thought your answer was perfectly reasonable and I expected exactly that. I didn't expect to be asked for clarification because I didn't consider the potential for the asker to say, "Yeah, well what if there's no such thing as a blue ball?" So, what about the casino scenario? Would you play?2012-11-28
  • 0
    I'd say that that problem is a horrible math test problem, since it is not well-defined. A math test probably doesn't have to specify that a coin toss is fair because in all probability, "coin toss" is just a metaphor for an event with two results of equal probability, unless the problem specifically uses the term "unfair coin."2012-11-28
  • 0
    (Or rather, I'd say it was a horrible math test question unless it was multiple choice and one of the choices was "not enough information to decide." It's certainly a valid part of learning probability to learn when you don't have enough information to determine a probability.)2012-11-28
  • 0
    Here's a probability question I found online on a math education site: "A city survey found that 47% of teenagers have a part time job. The same survey found that 78% plan to attend college. If a teenager is chosen at random, what is the probability that the teenager has a part time job and plans to attend college?" If that was on your test, would you answer "none of the above" because you know the coincident rate between part time job holders and kids with college aspirations is probably not negligible or would you answer, "about 37%?"2012-11-28
  • 0
    With survey results one should *always* assume dependence, because the same individual answers a few questions, and all of the answers are an expression of that person's complex inner state, opinions, and life history. In fact, most such surveys *intend* to detect such correlations. With probability devices such as coin tosses and dice rolls, one may *always* assume independence, because the physical interaction between two experiments (unless performed by a precision coin-flip robot) is hyper-sub-negligible. Note that independence is not the same as fairness. Unfair coin tosses are still independent. 2012-11-28
  • 0
    (cont) Of course I meant only that different die rolls etc. are independent, not arbitrary events related to a single roll. $P(X \text{ odd})=\frac 12$, $P(X\le 3)=\frac12$, but $P(X \text{ odd and }X\le 3)\ne \frac14$. 2012-11-28
  • 0
    @HagenvonEitzen, first, I never really thought about your last point about the dependent probabilities of a die roll. Regarding your initial point, are you saying that painting toys blue would be dependent on the shape of the toy and that's why it's not simply .1 * .1?2012-11-28
  • 0
    @D.Patrick I think the toy problem is closer to the survey situation than to independent coin tosses. There is more reason to think they are dependent (e.g. toy ponies are more likely to be pink) than not. 2012-11-28
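
The single-die example from the comments above can be checked by direct enumeration. A small sketch, assuming a fair six-sided die (the helper names are mine, purely for illustration):

```python
from fractions import Fraction

FACES = range(1, 7)  # assumed: a fair six-sided die


def prob(event):
    """Probability of an event (a predicate on a face) under a fair die."""
    favorable = sum(1 for face in FACES if event(face))
    return Fraction(favorable, len(FACES))


def is_odd(face):
    return face % 2 == 1


def at_most_three(face):
    return face <= 3


p_odd = prob(is_odd)
p_low = prob(at_most_three)
# Both events refer to the SAME roll, so they need not be independent.
p_both = prob(lambda face: is_odd(face) and at_most_three(face))

if __name__ == "__main__":
    print("P(odd)            =", p_odd)
    print("P(X <= 3)         =", p_low)
    print("P(odd and X <= 3) =", p_both)
    print("product           =", p_odd * p_low)
```

The joint probability counts the faces 1 and 3, so it comes out to 1/3 rather than the 1/4 that multiplying the two marginals would suggest.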
2

I think you're missing the point here. If a maths problem doesn't specify one way or the other whether a coin is fair, then it's OK to assume that it's a fair coin, because that's the way things usually are. Coins are fair, by and large.

But it certainly is not the case that the colour of a toy is unrelated to what kind of toy it is. I would say that the probability of a ball being blue is much greater than the probability of a toy in general being blue.

So the question is fundamentally flawed.

  • 0
    I posted this on another answer, I'd like your thoughts too. I found this on a math education site: "A city survey found that 47% of teenagers have a part time job. The same survey found that 78% plan to attend college. If a teenager is chosen at random, what is the probability that the teenager has a part time job and plans to attend college?" If that was on your test, would you answer "none of the above" because you know the coincident rate between part time job holders and kids with college aspirations is probably not negligible or would you answer, "about 37%?"2012-11-28
  • 0
    That's just a stupid question. I mean the one about the teenagers, not yours! In a maths test I would answer "about 37%", unless there was a box for "Insufficient information". And I would feel smugly superior to the examiner.2012-11-28
  • 0
    Hahaha. That's funny. I guess people smarter than me look at the world differently. :) To me it just made sense. I think it's much like Sir Ken Robinson talks about the paperclip experiment. "How many things can you do with a paperclip?" The older we get, the fewer things we think of. The younger we are, the more likely we are to say, "can it be 100 feet tall and made of rubber?" Kids hear "paperclip" and think, "that's my only restriction." Adults hear "paperclip" and think, "can you restrict it further?"2012-11-28
  • 0
    @D.Patrick: The problem is that erroneous statistical thinking is not about lack of creativity, it leads to wrong conclusions that lead to wrong decisions. And the "wrong" is not an ethical or political category here. It's a political decision to, say, want to improve literacy, but it is not a political decision to tell from statistical data whether a particular action improved literacy.2012-12-04
0

The principle of indifference is needed to compensate for a lack of precision, but it can lead to paradoxes.

The principle of indifference does not imply any independence; it only assigns uniform probabilities to disjoint events.

Suppose we apply the principle of indifference to these three sets:

{it will rain in Rome tomorrow, it will not}

{it will rain in Paris tomorrow, it will not}

{it will rain in both Paris and Rome tomorrow, it will not}

We end up with a clear paradox: applying indifference plus independence to the first two sets gives a probability of 1/4 that it rains in both cities tomorrow, whereas applying indifference directly to the third set gives 1/2.

0

$1$- You need to specify that a coin is fair so that you have a well-defined model, namely a completely specified probability distribution.

$2$- Such problems are even solvable in a well-defined sense, for example minimax. However, you still need to know the range of possible bias of the coin, i.e., how unfair it can be.

$3$- A priori probability is the probability of occurrence of your hypothesis. One of the best examples is binary communication, where the transmitter sends either $0$ or $1$ from a random source. This means the a priori probability of sending $0$ or sending $1$ at each channel use is roughly $0.5$. However, the channel introduces noise, and the probability that a sent $0$ is received as $0$ can differ from the probability that a sent $1$ is received as $1$. The a posteriori probability is proportional to the product of the a priori probability and a conditional probability such as the "probability that a sent $0$ is received as $0$" just mentioned. Clearly, if the a priori probability of sending $0$ is $0$, then $1$ is sent all the time and, whatever the channel condition is, you get a perfect decision at the receiver. But of course the entropy of this source is $0$, and such a communication doesn't make sense.

$4$- If a math test does not mention fairness, then fairness is an implicit assumption of the test or, in a broad sense, an incompleteness in the question.

$5$- The fairness of a die, or correlations among events, is not only of mathematical importance. Both appear in nature, and there has been a lot of research on the matter. A small example: if you have two sensors measuring room temperature and you place them very close together, they will report similar values because their observations are correlated. How is this linked to a fair die? If you throw an unfair die, some numbers will occur with probability greater than $1/6$ and some with less. One could then deduce that, say, the number $5$ comes up very frequently and bet on that number. The relation to the temperature sensors is usually through correlations in time. If at time $t$ you observe the number $5$, then at time $t+1$ you might observe $4$ with probability, say, $90\%$; in the same way, a sensor at time $t$ observes almost the same thing at time $t+1$, and a second sensor close to the first has similar observations. Note that if the observations are correlated, the entropy is much lower, the fairness is reduced, and the diversity is also much lower. As a result, according to Shannon, the source carries less information.
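
The a priori / a posteriori distinction in point 3 can be sketched numerically. The snippet below is only an illustration under stated assumptions: the channel probabilities (0.9 and 0.8) are made-up values, and the function name is hypothetical. It applies Bayes' rule to get the probability that a $0$ was sent given that a $0$ was received.

```python
def posterior_zero_sent(prior_zero, p_recv0_given_sent0, p_recv1_given_sent1):
    """Posterior probability that 0 was sent, given that 0 was received."""
    prior_one = 1.0 - prior_zero
    # Probability that a transmitted 1 is flipped and received as 0.
    p_recv0_given_sent1 = 1.0 - p_recv1_given_sent1
    joint_zero = prior_zero * p_recv0_given_sent0  # sent 0 and received 0
    joint_one = prior_one * p_recv0_given_sent1    # sent 1 and received 0
    evidence = joint_zero + joint_one              # total probability of receiving 0
    if evidence == 0:
        raise ValueError("receiving 0 has probability zero under these inputs")
    return joint_zero / evidence


# Assumed channel: a sent 0 arrives intact 90% of the time, a sent 1 only 80%.
balanced = posterior_zero_sent(0.5, 0.9, 0.8)
# Degenerate source from the answer: 0 is never sent, so the posterior stays 0.
never_zero = posterior_zero_sent(0.0, 0.9, 0.8)

if __name__ == "__main__":
    print(f"prior 0.5 -> posterior {balanced:.4f}")
    print(f"prior 0.0 -> posterior {never_zero:.4f}")
```

With a balanced source, receiving a $0$ moves the probability that $0$ was sent from $0.5$ to roughly $0.82$ under these assumed channel values; with a source that never sends $0$, no observation can move it off zero.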

  • 0
    Hey, thanks for editing that. I read the entire answer a few times, but it was kind of hard to follow without the whitespace. 2012-11-28
  • 0
    @D.Patrick I edited it once again. It was sent from a mobile; that was the reason, actually. 2012-11-28
  • 0
    You typed all of that on a mobile device!?2012-11-28
  • 0
    yes I was coming from the univ to home. Decided to spend the time for posting something..2012-11-28
  • 0
    Well, thanks! :)2012-11-28
0

Probability theory is the study of theoretical random processes that have no essential unknowns, i.e., all aspects are determined. The "unknowns" of problems are merely aspects that must be determined from others that are given. This is precisely why a coin or die etc. is said to be fair. If it is NOT fair, sufficient other characteristics MUST be given to DETERMINE its probabilities; or what is asked for must NOT depend on its probabilities, which is unlikely.

This is in stark contrast to statistics which is the study of random processes that HAVE essential unknowns. So statistics is essentially the study of estimation. There is no such concept in classical probability theory.

Modern theoreticians have attempted to blur (they call it "generalize") probability theory to include what really is part of statistics; but it waddles like statistics, quacks like statistics, looks like statistics, and IS statistics.

Probability theory has no concept of "prior" or "posterior" probabilities or distributions. These are strictly terms from statistics.

Finally, if the labels of the balls (or whatever) are not MUTUALLY EXCLUSIVE, the probability of drawing a specified one depends not only on how many there are of each but also on how many are in common. This is the idea behind what is called "independence." The simple product formula for the intersection is ONLY valid under independence.
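
As a small worked version of that last paragraph, suppose (my notation, not part of the original question) a bag holds $N$ toys, of which $n_A$ are balls, $n_B$ are blue, and $n_{AB}$ are both. Drawing one toy uniformly at random gives

$$\mathbb{P}(A \cap B) = \frac{n_{AB}}{N}, \qquad \mathbb{P}(A)\,\mathbb{P}(B) = \frac{n_A}{N} \cdot \frac{n_B}{N},$$

so the product formula holds exactly when $n_{AB} \cdot N = n_A \cdot n_B$. With $N = 100$ and $n_A = n_B = 10$ that requires $n_{AB} = 1$, which the question as posed does not guarantee.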