5

I want to find the probability that my student is a random guesser. On a 360-item multiple-choice test with four choices for each question, he answered 103 of the questions (about 28.6%) correctly.

Here is what I have so far. As everyone knows, the expected score is 25% or 90 items. Assuming that he is indeed a random guesser, I used the binomial distribution to get the variance np(1-p) = 360(.25)(.75) = 67.5; hence, a standard deviation of 8.22. Further assuming that the scores of random guessers are approximately normally distributed, his z-score is (103-90)/8.22 = 1.58, making him something of an outlier. This places him in the top 6% of random guessers. This suggests that either (1) he is a very good guesser, (2) he is a very lucky guesser, or (3) he is NOT a random guesser at all.
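
The arithmetic above can be checked in a few lines of Python. This is only a sketch of the normal approximation described in this paragraph; the variable names are illustrative and not part of the original question.

```python
from math import sqrt
from statistics import NormalDist

n, p, score = 360, 0.25, 103   # items, chance of a correct blind guess, observed score

mean = n * p                   # expected score of a pure guesser
sd = sqrt(n * p * (1 - p))     # binomial standard deviation
z = (score - mean) / sd        # standardized score

# Share of pure guessers expected to do at least this well (normal approximation).
upper_tail = 1 - NormalDist().cdf(z)

print(f"mean={mean:.1f}, sd={sd:.2f}, z={z:.2f}, upper tail={upper_tail:.3f}")
```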

Now I don't know what other concepts to use to find the probability that he is a random guesser. I don't even know if there is enough information; nor do I know whether all my computations and assumptions make any sense. I hope you can help. Cheers!

PS: I only had a 3-unit statistics course way back in college. "Dummifying" your explanations would surely be appreciated. Cheers! :-)


Edit: Thanks for all your help. So I guess it's really not that easy to get a good approximation of that probability.

Having said that, is there a relatively simple way to get even a very crude approximation of the answer? For instance, even before posting the question here, I actually considered the Bayesian probability mentioned above. To make things simple, I assumed that P(getting 103|guesser) is simply $\binom{360}{103}\cdot 0.25^{103}\cdot 0.75^{360-103}$. And just to have a starting point, let's just say that 1 out of 5 students are random guessers, so P(guesser) is 0.2. What would be a reasonable initial estimate, albeit inaccurate, for P(getting 103)?

Then maybe we can play around with the assumed values later to get a spectrum of possibilities.

  • 0
    You should run your students through the same test again and again. Questions they know will receive constant answers; questions they guess won't :) 2012-09-09
  • 3
    Which _alternative_ to being a "random guesser" are you considering? We can conclude definitely that the student is not someone who knows the correct answer to all questions. What, then, would you have him do with the rest of the questions _other_ than guess randomly? It does not make sense to assign a probability to "being a random guesser" unless you can suggest something _different_ that would also have positive probability. 2012-09-09
  • 0
    There might be a 4th possibility: out of 360 questions, he correctly solved around 10 to 15 problems and guessed the rest. One can calculate approximately how many he must have solved, given an expected score of 25% on the guesses. 2012-09-09
  • 0
    It's not that it's hard to get a good approximation for the desired probability; the problem is that you haven't actually defined any probability. As Henning said, it all depends on the alternatives, and the difference they make isn't just an approximation error, it's the entire answer. If you assume that the proportion of answers the student knows is uniformly distributed between $0$ and $1$, then the probability that it's $0$ is $0$, whereas if you assume that it's either $0$ or $1$, then the probability that it's $0$ is $1$. 2012-09-12
  • 0
    @Mark: I've merged your two accounts. Please consider [registering your account](http://math.stackexchange.com/users/login). This helps the software keep track of you so that you retain the ability to comment on this thread, edit your question, etc. 2012-09-12
  • 0
    A truly random sequence can fall outside the range you would expect: we often expect a random sequence to look uniformly distributed, but in reality it can look nothing like that. This very informal video explains what I mean: http://youtu.be/Lf4ZmWc_jmA?t=8s 2013-03-22

2 Answers

5

Once you know the distribution (in this case binomial, as you correctly stated), you can use Bayesian probability to ask what the chances are that the student is randomly guessing: $$P(\text{guesser}\mid\text{got }103) = {P(\text{getting }103\mid \text{guesser})P(\text{guesser}) \over P(\text{getting }103)}$$ Note that this depends on your prior belief as to how many students you think don't know anything.
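
To make the formula concrete, here is a minimal sketch in Python. It needs an explicit alternative to guessing, which the question does not specify, so everything in `strategies` is an assumption chosen purely for illustration: the prior of 0.2 is the figure floated in the question, and the only alternative considered is a hypothetical student who knows 20 answers and guesses the rest. Different alternatives will give very different posteriors.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def score_likelihood(score, n_items, known, p_guess=0.25):
    """P(score | student knows `known` answers and guesses all the others)."""
    if score < known:
        return 0.0
    return binom_pmf(score - known, n_items - known, p_guess)

n_items, score = 360, 103

# ASSUMPTION: only two kinds of student exist. The priors and the
# "knows 20 answers" alternative are illustrative, not taken from data.
strategies = {
    "guesser":  {"prior": 0.2, "known": 0},
    "knows_20": {"prior": 0.8, "known": 20},
}

# Numerator of Bayes' rule for each strategy: likelihood times prior.
joint = {name: s["prior"] * score_likelihood(score, n_items, s["known"])
         for name, s in strategies.items()}

p_score = sum(joint.values())   # P(getting 103), the normalizing denominator
posterior = {name: j / p_score for name, j in joint.items()}

print(posterior)
```

Changing the entries of `strategies` (more alternatives, other priors) is the way to "play around with the assumed values"; the denominator is recomputed from whatever alternatives are listed.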

  • 0
    Upvoting out of embarrassment that I forgot you need to consider the prior. 2012-09-09
  • 0
    But what to take as $P(103)$? The actual percentage of students getting 103? What if five students got 102, five got 104 and only one got 103? 2012-09-09
  • 0
    @HagenvonEitzen: using knowledge of the problem, make some modelling assumptions, come up with a distribution that seems plausible, and use that. 2012-09-09
  • 1
    You only need a prior if you are doing Bayesian inference. You can apply the Neyman-Pearson (frequentist) approach to test the null hypothesis of random guessing using only the assumed binomial with p=1/4. 2012-09-09
  • 0
    @Hagen: The marginal probability $P(103)$ in the denominator is just a normalizing factor that depends on what strategies _other_ than random guessing the students might have available, and how likely you find each of them _a priori_. Specifically, $P(103) = \sum_{\mathrm{strategy}} P(103 \mid \mathrm{strategy}) P(\mathrm{strategy})$. As for the prior strategy probabilities $P(\mathrm{strategy})$, those really are to some degree subjective and arbitrary. 2012-09-09
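
For the frequentist route mentioned in the comments, a one-sided exact binomial test of the null hypothesis "pure guessing with p = 1/4" might look like the sketch below. The significance level `alpha` is a conventional choice assumed here for illustration; nobody in the thread proposed one.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Probability that a pure guesser scores 103 or more out of 360.
p_value = binom_upper_tail(103, 360, 0.25)

alpha = 0.05   # ASSUMPTION: conventional significance level, not from the thread
print(f"one-sided p-value = {p_value:.4f}; reject guessing at {alpha}: {p_value < alpha}")
```

No prior is needed here, but the test only says how surprising the score would be for a guesser; it does not give the probability that the student is one.
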
3

Your computations make sense. However, one can never be absolutely sure. But you know that if 100 students relied totally on guessing, then about 6 would be expected to score 103 or better. It all depends on the confidence you want.

Also, it is possible that the student knows some answers correctly and guesses the rest. While a complete guesser scores $90\pm 8.22$, a student actually knowing 20 answers and guessing the rest scores $20+85\pm 7.98$; that is, the student with a score of 103 might be a slightly unlucky guesser with little knowledge. In fact, a score of $k$ out of $n$ suggests, by some kind of maximum-likelihood argument, that the student actually knew $\frac{4k-n}3$ answers and guessed the rest.
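
The numbers in this paragraph can be reproduced with a short Python sketch. The function names are made up for illustration, and the model assumed is exactly the one described above: `known` answers are certainly right and every other item is a blind guess with success probability 1/4.

```python
from math import sqrt

def score_stats(known, n_items=360, p_guess=0.25):
    """Mean and standard deviation of the total score when `known` answers
    are certain and the remaining items are blind guesses."""
    guessed = n_items - known
    mean = known + guessed * p_guess
    sd = sqrt(guessed * p_guess * (1 - p_guess))
    return mean, sd

def estimated_known(score, n_items=360):
    """Solve score = m + (n - m)/4 for m, which gives m = (4*score - n)/3.
    Valid only for four-option questions (guess probability 1/4)."""
    return (4 * score - n_items) / 3

print(score_stats(0))         # complete guesser
print(score_stats(20))        # knows 20 answers, guesses 340
print(estimated_known(103))   # knowledge level whose expected score is 103
```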

This does not even take into account "educated guessing". That is: a student with little knowledge might see that one of the 4 answers is obviously wrong but not know how to decide among the other three options. Each such question would add $\frac1{12}$ to the expected score (and also slightly raise the deviation, since $\frac13\cdot\frac23>\frac14\cdot\frac34$).
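
The effect of ruling out one option can be checked exactly with Python's `fractions` module. This sketch assumes each question is an independent 0/1 trial, so a single question answered correctly with probability $p$ has mean $p$ and variance $p(1-p)$; the helper name is illustrative.

```python
from fractions import Fraction

def per_question(p):
    """Mean and variance of one 0/1 item answered correctly with probability p."""
    return p, p * (1 - p)

blind = per_question(Fraction(1, 4))      # four options, none eliminated
educated = per_question(Fraction(1, 3))   # one option eliminated, guess among three

gain = educated[0] - blind[0]             # extra expected score per such question
print("gain in expected score:", gain)
print("variance, blind vs educated:", blind[1], educated[1])
```

Because $p(1-p)$ grows as $p$ moves from $\frac14$ toward $\frac12$, comparing the two printed variances shows how the spread changes for each educated guess.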

  • 0
    Thanks for your answer. It's very helpful. But I'm not sure I understood you here: "Each such question would add 1/6 to the expected score (but also lower the deviation)." Shouldn't it be $\frac{1}{3}-\frac{1}{4}=\frac{1}{12}$? But I might be wrong. :-) 2012-09-12
  • 0
    Yes, I'm a guesser for fraction additions. 2012-09-12