
I have a form where users can rate presentations and also say how knowledgeable they are on the subject. Both scales run from 1 to 5 (1 being lousy and 5 being great). So, for example, a user can give a presentation a score of 1 and their own knowledge a 5, which means they are very sure that the presentation was bad.

A presentation can be rated by two distinct people who don't know what the other person rated. If these scores are far apart, a third rater should come into play who acts as a tiebreaker.

What I need is a way to calculate the difference between the two distinct ratings on which I can decide whether or not I should ask the tiebreaker to rate. Obviously it should be some sort of weighted difference. If we go down this path, it could be implemented as follows:

(score person A)(knowledge person A) - (score person B)(knowledge person B)

However, this doesn't give the desired result. For example, 3*2 - 1*5 = 1 is a very small difference, yet person B is really sure about their rating, so a tiebreaker should probably come into play here. On the other hand, 5*5 - 4*5 = 5 is a big difference, but both raters are very confident that they know what they are talking about, so a tiebreaker should NOT come into play.
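To make the two cases concrete, here is a minimal Python sketch of the naive formula. The function name `naive_difference` is made up for illustration only.

```python
def naive_difference(score_a, knowledge_a, score_b, knowledge_b):
    """Naive weighted difference: score times knowledge for A, minus the same for B."""
    return score_a * knowledge_a - score_b * knowledge_b


# A is unsure (knowledge 2) and gives a 3; B is very sure (knowledge 5) and gives a 1.
print(naive_difference(3, 2, 1, 5))  # 1

# Both raters are very sure (knowledge 5) and give a 5 and a 4.
print(naive_difference(5, 5, 4, 5))  # 5
```

The first pair disagrees by two points and gets the smaller number, while the second pair disagrees by only one point and gets the larger one, which is the opposite of the ordering wanted.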

What I think would help is if the knowledge factor were somehow not linear but progressed along a sort of bell curve. Any ideas on how to come up with a better algorithm would be appreciated.

2 Answers

1

Elaborating on jpalecek's suggestion, I would fit a best average rating $x$ in the least-squares sense, weighted by experience (the self-reported knowledge score). If $w_i$ are the experiences and $x_i$ are the ratings, this amounts to minimizing the energy $\frac{w_1 (x-x_1)^2 + w_2(x-x_2)^2}{w_1+w_2},$ which has the solution $x=\frac{w_1 x_1 + w_2 x_2}{w_1+w_2}.$ Plugging this minimizer back into the energy gives $\frac{w_1 w_2 (x_1-x_2)^2}{(w_1+w_2)^2},$ which on a 1-5 scale is at most $4$ (attained when $w_1=w_2$ and $|x_1-x_2|=4$). Dividing by $4$ to get a number between 0 and 1 gives you a measure of how bad the fit is, $E=\frac{w_1 w_2 (x_1-x_2)^2}{4(w_1+w_2)^2}.$ $E$ has some of the common-sense properties you want: for a fixed disagreement, $E$ is higher if two people of equal experience disagree than if two people of disparate experience disagree, and for fixed values of experience, it increases as the disagreement in ratings increases.
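A minimal Python sketch of this measure, applied to the two cases from the question. The names `disagreement`, `needs_tiebreaker`, and the cutoff `TIEBREAK_THRESHOLD` are assumptions for illustration; in particular the value 0.1 is arbitrary and would need tuning on real ratings.

```python
def disagreement(x1, w1, x2, w2):
    """E = w1*w2*(x1-x2)^2 / (4*(w1+w2)^2) for ratings x1, x2 and positive weights w1, w2."""
    if w1 <= 0 or w2 <= 0:
        raise ValueError("weights must be positive")
    return w1 * w2 * (x1 - x2) ** 2 / (4 * (w1 + w2) ** 2)


# Hypothetical cutoff, not part of the formula above; chosen only for this example.
TIEBREAK_THRESHOLD = 0.1


def needs_tiebreaker(x1, w1, x2, w2, threshold=TIEBREAK_THRESHOLD):
    """True when the normalized misfit E exceeds the chosen cutoff."""
    return disagreement(x1, w1, x2, w2) > threshold


print(disagreement(3, 2, 1, 5))  # about 0.204
print(disagreement(5, 5, 4, 5))  # 0.0625
print(disagreement(1, 5, 5, 5))  # 1.0, the worst case on a 1-5 scale
```

With this particular cutoff, the pair (3, knowledge 2) versus (1, knowledge 5) triggers a tiebreaker and the pair (5, knowledge 5) versus (4, knowledge 5) does not, which matches the ordering asked for in the question.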

  • My bad :) You are right. (2011-11-09)
1

I'd go for a weighted mean square error or maybe weighted mean absolute error. YMMV.
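A rough sketch of what these two measures could look like in Python. The answer does not say what the errors are measured against, so this assumes deviations from the knowledge-weighted mean rating; the function names are hypothetical.

```python
def weighted_mean(ratings, weights):
    """Knowledge-weighted average of the ratings."""
    return sum(w * x for x, w in zip(ratings, weights)) / sum(weights)


def weighted_mse(ratings, weights):
    """Weighted mean squared deviation from the weighted mean (an assumed reference point)."""
    center = weighted_mean(ratings, weights)
    return sum(w * (x - center) ** 2 for x, w in zip(ratings, weights)) / sum(weights)


def weighted_mae(ratings, weights):
    """Weighted mean absolute deviation from the weighted mean (an assumed reference point)."""
    center = weighted_mean(ratings, weights)
    return sum(w * abs(x - center) for x, w in zip(ratings, weights)) / sum(weights)


# The two cases from the question: ratings first, knowledge scores as weights.
print(weighted_mse([3, 1], [2, 5]), weighted_mae([3, 1], [2, 5]))
print(weighted_mse([5, 4], [5, 5]), weighted_mae([5, 4], [5, 5]))  # 0.25 0.5
```

For exactly two ratings, the weighted MSE defined this way reduces to $\frac{w_1 w_2 (x_1-x_2)^2}{(w_1+w_2)^2}$, so it differs from the normalized measure $E$ in the other answer only by the constant factor of 4.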