
I am working on a model which predicts the total final score (sum of the scores of both teams) of certain sports matches. I have checked the predictions of the model against the actual scores of matches for the last several years. The error of the predicted total score relative to the actual total score is approximately normally distributed with mean $=-0.3$ and standard deviation $= 13$.

Say I have a predicted total score $x$. How can I determine the probability that the actual score will be above or below a given score line $y$?

For example, say my model predicts that the total score will be $8$ and the score line is $5.5$, what is the probability that the actual score will be $\le 5$ and what is the probability that the actual score will be $>5$ (given that the errors in my prediction can be assumed to be normally distributed)?

Cheers,

Pete

  • 2
    One warning: Just because you can follow the calculation of a standard deviation doesn't mean the distribution is Gaussian. Most real-world distributions have heavier tails than the Gaussian would predict, so the likelihood of a value farther from the mean than 3 sigma (for example) is much larger than 1 in 370. Closer to the mean things aren't so bad. One of my favorite remarks goes something like "Physicists think the Gaussian distribution is a mathematical theorem, mathematicians think it is an experimental fact." 2010-12-29

1 Answer

OK, sorry for misunderstanding your point. So, your model predicts $PV=8$ with errors $\epsilon \sim N(-0.3,13)$, where $13$ is the standard deviation. Assuming the errors are defined by $PV=AV+\epsilon$, this is equivalent to saying that you expect the actual value to be $AV = PV - \epsilon \sim N(8.3,13)$.

Now you want the probability that $AV\leq 5.5$ (or $AV>5.5$) under the hypothesis that your model is correct. This is

$\mathbb{P}(AV\leq 5.5 \mid AV \sim N(8.3,13)) \; .$

I don't know what programming language or other computer program you use, but most interpreted math or statistics languages (Matlab, R, etc.) have the necessary commands built in. So in R, you would use the command

pnorm(5.5,8.3,13) 

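which evaluates the cumulative normal distribution function at the point $5.5$ for the given mean and standard deviation (the third argument of pnorm is the standard deviation, not the variance). For the probability of a value bigger than $5.5$, you take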

1-pnorm(5.5,8.3,13) 

If you want the explicit formula for the computation, I can provide that too, but unless you want to do a lot of extra work implementing the functions yourself, I don't think that is really useful.
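For reference, here is a sketch of that explicit formula. The symbols $\mu$, $\sigma$, $y$ and $z$ are introduced here only for illustration: $\mu=8.3$ and $\sigma=13$ are the mean and standard deviation used above, $y$ is the score line, and $\Phi$ is the standard normal cumulative distribution function.

$\mathbb{P}(AV\leq y) = \Phi\!\left(\frac{y-\mu}{\sigma}\right) = \frac{1}{2}\left[1+\operatorname{erf}\!\left(\frac{y-\mu}{\sigma\sqrt{2}}\right)\right]$

For $y=5.5$ the standardized value is $z=(5.5-8.3)/13\approx-0.215$, which gives $\mathbb{P}(AV\leq 5.5)\approx 0.415$ and $\mathbb{P}(AV>5.5)\approx 0.585$.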

  • 0
    Just one last thing to be certain: is the relationship between Predicted Value (PV), Actual Value (AV) and error ($\epsilon$) $PV=AV+\epsilon$, or is it $AV=PV+\epsilon$? This is important, because in the first case the mean should be $8-(-0.3)=8.3$, and in the second it should be $8+(-0.3)=7.7$. 2010-12-29
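To see how much the sign convention matters for this example, here is a minimal Python sketch that evaluates both cases. Python is used only as an illustration alongside the R command above. The function name `prob_at_or_below` and its parameters are hypothetical, and the sketch assumes the errors really are normally distributed with the stated mean and standard deviation.

```python
from statistics import NormalDist


def prob_at_or_below(line, predicted, err_mean, err_sd, error_is_pv_minus_av=True):
    """Return P(actual <= line), assuming normally distributed prediction errors.

    error_is_pv_minus_av=True  means PV = AV + eps, so AV = PV - eps.
    error_is_pv_minus_av=False means AV = PV + eps.
    """
    if error_is_pv_minus_av:
        actual_mean = predicted - err_mean
    else:
        actual_mean = predicted + err_mean
    # The spread of the actual value equals the spread of the error in both cases.
    return NormalDist(mu=actual_mean, sigma=err_sd).cdf(line)


# Example from the question: prediction 8, score line 5.5, errors N(-0.3, sd 13).
p_case_1 = prob_at_or_below(5.5, 8, -0.3, 13, error_is_pv_minus_av=True)
p_case_2 = prob_at_or_below(5.5, 8, -0.3, 13, error_is_pv_minus_av=False)

print(f"PV = AV + eps: P(AV <= 5.5) = {p_case_1:.3f}, P(AV > 5.5) = {1 - p_case_1:.3f}")
print(f"AV = PV + eps: P(AV <= 5.5) = {p_case_2:.3f}, P(AV > 5.5) = {1 - p_case_2:.3f}")
```

With a standard deviation as large as 13, shifting the mean by 0.6 changes the result by only about two percentage points, so the convention matters less here than it would for a tighter error distribution.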