
I'm trying to use odds data from bookmakers to estimate the expected number of goals in a game. We have these known facts:

  • P(o4.5) = 0.573
  • P(o5.5) = 0.458
  • P(o6.5) = 0.279

P(o4.5) is the probability that there will be more than 4.5 goals, P(o5.5) more than 5.5, and so on. The probability that there will be fewer than 4.5 goals is the complement: 1 - P(o4.5).

Can you estimate the expected number of goals in the game from these facts? Something like

P(oX) = 0.5

What is the value of X? Is the information given above enough to figure it out? Is it enough if you make some assumptions, for example about the shape of the goal distribution?

  • 0
    $P(\text{Goals} \ge 4.5) = 0.353 < 0.458 = P(\text{Goals} \ge 5.5)$ doesn't make sense IMHO: if there are at least 5.5 goals, there are also at least 4.5. 2012-03-22
  • 0
    How can the probability that there will be more than 4.5 be lower than the probability that there will be more than 5.5? And what is a half goal anyway? What kind of game is this? 2012-03-22
  • 0
    Sorry, I suck. I have updated the numbers. The half goal is a convenient notation used by bookmakers. P(oX.5) means X+1 or more goals. P(uX.5) means X or fewer goals. 2012-03-22
  • 0
    @Raskolnikov the half goal is there to avoid confusion. If I say "more than 4.5", it means 5, 6, 7, ... goals. If I say "more than 4", there may be doubt about whether 4 is included. 2012-03-22
  • 0
    @Kolmo: I see, but wouldn't it be easier to use $>$ instead of $\geq$ to avoid such confusion? Anyway, I guess every field has its own conventions. 2012-03-22

1 Answer

There's no natural solution to this problem. In a sense, this is because you've specified both too much and too little. If you'd specified the probability that there will be more than $n$ goals for all non-negative integers $n$, then it would be straightforward to calculate the probability that there will be exactly $n$ goals for all $n$, and hence the expected number of goals. On the other hand, if you'd specified only one or two probabilities, you could choose a model and use the specified value(s) to determine its parameter(s). For instance, a natural model for the number of goals would lead to the Poisson distribution. This has a single parameter (usually denoted by $\lambda$), which you could fix by specifying a single probability. One might consider other distributions, such as a binomial distribution or a normal distribution truncated at $0$. However, all the obvious candidates have at most $2$ parameters, so they would be overdetermined by the three values you specify. You'd have to make some specific and probably somewhat arbitrary assumptions to come up with a model that has $3$ parameters to fit.
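
To illustrate the "straightforward" case, here is a minimal Python sketch of how a complete list of over-probabilities would give the exact-score probabilities and the expectation. The list `toy_over` is made up for illustration and is not bookmaker data; the function names are likewise hypothetical.

```python
def pmf_from_tails(over):
    """over[n] = P(goals > n) for n = 0, 1, 2, ...; returns [P(goals = n)]."""
    # P(goals = 0) = 1 - P(goals > 0), and for n >= 1
    # P(goals = n) = P(goals > n - 1) - P(goals > n).
    previous = [1.0] + list(over[:-1])
    return [p_prev - p_cur for p_prev, p_cur in zip(previous, over)]


def expected_goals(over):
    """E[goals] = sum over n >= 0 of P(goals > n) for a non-negative integer count."""
    return sum(over)


# Made-up complete tail list (NOT bookmaker data): P(goals > 0), P(goals > 1), ...
toy_over = [0.9, 0.6, 0.3, 0.1, 0.0]
toy_pmf = pmf_from_tails(toy_over)
toy_mean = expected_goals(toy_over)

# With only the three probabilities from the question, just two exact-score
# probabilities can be recovered:
p_exactly_5 = 0.573 - 0.458  # P(o4.5) - P(o5.5)
p_exactly_6 = 0.458 - 0.279  # P(o5.5) - P(o6.5)

print(toy_pmf, toy_mean, round(p_exactly_5, 3), round(p_exactly_6, 3))
```

The three quoted values pin down only the probabilities of exactly 5 and exactly 6 goals, which is why a model is needed to say anything about the rest of the distribution.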

You could consider performing a non-linear least squares fit of a Poisson distribution to the probabilities you've specified. The data fit that model reasonably well: for $\lambda=5.25$ the Poisson probabilities of at least $5$, at least $6$ and at least $7$ goals are about $0.602$, $0.428$ and $0.275$, each within about $0.03$ of your values; this corresponds to an expected number of $5.25$ goals. (Here $\Gamma(n,\lambda)/\Gamma(n)$ is the probability that there will be fewer than $n$ goals, so the probability of $n$ or more goals is $1-\Gamma(n,\lambda)/\Gamma(n)$.)
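
A rough sketch of such a fit in plain Python follows. It swaps a proper non-linear least squares routine for a brute-force grid search over $\lambda$, and the search range and step size are assumptions chosen for illustration; `poisson_over` and `squared_error` are hypothetical helper names.

```python
import math


def poisson_over(line, lam):
    """P(goals > line) under Poisson(lam); line=4.5 means 5 or more goals."""
    k_max = math.floor(line)  # largest goal count that stays under the line
    cdf = sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(k_max + 1))
    return 1.0 - cdf


def squared_error(lam, quotes):
    """Sum of squared differences between model and quoted probabilities."""
    return sum((poisson_over(line, lam) - p) ** 2 for line, p in quotes.items())


# Probabilities from the question.
quotes = {4.5: 0.573, 5.5: 0.458, 6.5: 0.279}

# Assumed search range: lambda from 1.000 to 15.000 in steps of 0.001.
grid = [i / 1000 for i in range(1000, 15001)]
best_lam = min(grid, key=lambda lam: squared_error(lam, quotes))

print(f"least-squares lambda: {best_lam:.3f}")
for line, p in quotes.items():
    print(f"P(o{line}): quoted {p:.3f}, Poisson {poisson_over(line, best_lam):.3f}")
```

The grid search is crude but needs no external libraries; the minimizer it finds should land close to the trial value $\lambda=5.25$, and since the mean of a Poisson distribution equals $\lambda$, that value is also the estimated expected number of goals.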

  • 0
    How did you come up with 5.25? Is there a formula that produces that number given the above input? For my purposes two parameters is definitely enough and I only included three because I thought that would make the problem easier. :) In reality, there is usually only one, often two, and more rarely three or more. 2012-03-22
  • 0
    @Björn: I got it by trial and error. I doubt that there's a closed-form formula; as I wrote, you'd have to do a non-linear least squares fit to get the optimal value. If you have only one probability of the type you specified and you want to find the parameter $\lambda$ of a Poisson distribution that yields that probability, that amounts to inverting $\Gamma(n,\lambda)/\Gamma(n)$ considered as a function of $\lambda$; an algorithm for that is given [here](http://www.deepdyve.com/lp/association-for-computing-machinery/computation-of-the-incomplete-gamma-function-ratios-and-their-inverse-6RUMpwMNcR). 2012-03-22
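
For the single-probability case, a simple alternative to the linked algorithm is plain bisection, since the probability of going over a line increases monotonically with $\lambda$. The sketch below is not the method from the linked paper; the bracket `[lo, hi]`, the tolerance, and the function names are assumptions made for illustration.

```python
import math


def poisson_over(line, lam):
    """P(goals > line) under Poisson(lam); line=5.5 means 6 or more goals."""
    k_max = math.floor(line)  # largest goal count that stays under the line
    cdf = sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(k_max + 1))
    return 1.0 - cdf


def lambda_from_quote(line, p_over, lo=1e-9, hi=50.0, tol=1e-10):
    """Find lam with poisson_over(line, lam) close to p_over by bisection."""
    # poisson_over is increasing in lam, so the root is bracketed whenever
    # poisson_over(line, lo) < p_over < poisson_over(line, hi).
    if not poisson_over(line, lo) < p_over < poisson_over(line, hi):
        raise ValueError("p_over is not reachable inside the bracket [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if poisson_over(line, mid) < p_over:
            lo = mid  # model probability too small: lambda must be larger
        else:
            hi = mid  # model probability too large: lambda must be smaller
    return (lo + hi) / 2


lam = lambda_from_quote(5.5, 0.458)  # uses P(o5.5) from the question
print(f"lambda matching P(o5.5) = 0.458: {lam:.4f}")
```

Each quoted line generally yields a slightly different $\lambda$ when inverted on its own, which is another way of seeing that three probabilities overdetermine a one-parameter model.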