2

Average survey results

I'm curious which way is correct when averaging whole numbers. My issue is that the customer wants decimal places included in the average, even though the user will only ever select whole numbers in the range 1-5. In my head it makes sense to return a whole number instead of a rational or irrational number.

Results out of 5 questions:

Question1 = 4

Question2 = 3

Question3 = 5

Question4 = 3

Question5 = 2

SUM = 17

Which is correct for the average: 3 or 3.4?

  • 0
    This really belongs on the stats & data-visualisation site. See the Wikipedia article on the Likert scale: http://en.wikipedia.org/wiki/Likert_scale 2011-08-03

6 Answers

6

Depends on whether an arithmetic mean or a median is appropriate. Both have advantages and disadvantages. An arithmetic mean is more expressive, but subject to gross distortion by outliers. A median is much more robust and guaranteed to give a representative number, but says almost nothing about the overall distribution.

Without knowing what the numbers describe (satisfaction? grades? household size?) there is no way of telling you which one is "right".
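As a minimal sketch of the difference, here are both measures computed for the five scores in the question, using Python's standard `statistics` module. The second list is a made-up example (not from the question) that shows how a single low answer pulls the mean down while the median stays put.

```python
from statistics import mean, median

# Scores from the question: five questions, answers on a 1-5 scale
scores = [4, 3, 5, 3, 2]

mean_score = mean(scores)      # sum (17) divided by count (5)
median_score = median(scores)  # middle value of the sorted scores [2, 3, 3, 4, 5]
print(mean_score, median_score)

# Hypothetical data: four top scores and one outlier
with_outlier = [5, 5, 5, 5, 1]
outlier_mean = mean(with_outlier)      # dragged down by the single 1
outlier_median = median(with_outlier)  # unaffected by the single 1
print(outlier_mean, outlier_median)
```

For the question's data the two measures happen to be close (3.4 versus 3); in the hypothetical list they diverge more (4.2 versus 5).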

  • 0
    3.4 represents a point on a scale from 1 to 5. If the customer wants to measure changes in satisfaction over time, it creates a baseline against which changes can be measured in a way that reporting an ordinal number doesn't. It isn't an answer that anyone can give, but it is a real measure of the balance of those answers. My guess is that the customer might also want to know the reasons for votes of 1 or 5, with a view to getting the numbers up. The real issue is to understand what your customer wants, and to provide information which will enable them to monitor and improve performance. 2011-08-03
3

Simple but comprehensive guide to rounding. And that's just the techniques. There is no "correct" way. For example, we often refer to "2.4 children" as the average per family, even though obviously no family actually has this number of children. On the other hand, perfectly sensible averages can be nonsensical if the values do not correspond to some linear progression. For example:

  • If "Yes" and "No" are represented by 1 and 0 in the database, the output should state what percentage answered one or the other rather than the mean (although the two are mathematically identical).
  • If the input is logarithmically distributed (options are e.g. 1, 10, 100, etc.), it might make sense to return the logarithmic mean.

By the way, you could never end up with an irrational number for the average unless your set of numbers is literally infinite.
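The two bullet points above can be sketched in Python. All the data here is invented for illustration. For the second bullet I am assuming that averaging on a logarithmic scale is what is meant, which is what `statistics.geometric_mean` (Python 3.8+) computes; that is my reading, not something the answer states.

```python
from statistics import mean, geometric_mean

# Hypothetical yes/no answers stored as 1 and 0
answers = [1, 0, 1, 1, 0]
share_yes = mean(answers)                # same number as the mean, read as a proportion
yes_label = f"{share_yes:.0%} answered yes"
print(yes_label)

# Hypothetical answers on a logarithmic scale (powers of ten)
log_scale = [1, 10, 100]
arithmetic = mean(log_scale)             # dominated by the largest value
geometric = geometric_mean(log_scale)    # averages the exponents instead
print(arithmetic, geometric)
```

The arithmetic mean of 1, 10 and 100 is 37, while the geometric mean is about 10, the middle option on that scale.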

  • 0
    Thank you for the link, this will prove very useful. 2011-08-03
2

3.4 would be mathematically correct. If fractions do not make sense for the result, you may want to round; but if rounding doesn't make sense either, statistical measures other than the average (e.g., the median or the mode) are probably more interesting.

An irrational number never makes sense, because the average is the ratio of the sum of scores over the number of questions, both of which are whole numbers (and thus also rational); the ratio of two rational numbers is also rational (this follows directly from the definitions of ratios and rational numbers).
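A small illustration of that argument, using Python's `fractions.Fraction` to keep the average as an exact ratio of two integers (the scores are the ones from the question):

```python
from fractions import Fraction

scores = [4, 3, 5, 3, 2]

# Sum and count are both integers, so the average is an exact rational number
average = Fraction(sum(scores), len(scores))
print(average)         # the exact ratio
print(float(average))  # the same value as a decimal
```

Here the exact ratio is 17/5, which is 3.4 as a decimal.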

  • 0
    It depends whether you actually regard them as categories, or as a continuous scale (with respondents rounding their response to the nearest whole number). That's what I meant by 'rounding makes sense' - if they're points on a scale, rounding makes sense, if they're categories, it doesn't. 2011-08-03
1

The correct answer for your job is that you should do what the customer wants.

However, from a mathematical/statistical perspective there is a clear answer: you should return the median of this data, not the mean (i.e. you should return '3', not '3.4').

If you are reporting the mean, you are presupposing that it makes sense to add the data together.

To see that it makes no sense to add the data in your case, consider that the numbers 1-5 are just labels for how the customer feels: they don't represent quantities. You could equally have used the five categories 'very bad', 'bad', 'neutral', 'good', 'very good' instead of the numbers 1-5, in which case it would be obvious that adding these categories together makes no sense. It does make sense to order the categories, however, so you can always return the median (in this case the median is 'neutral').

To return the mean instead of the median is deceitful: it creates a sense of quantifiability which is not warranted from the data.
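A rough sketch of reporting the median as a category, assuming the numbers 1-5 map onto the five labels in the order given above. `statistics.median_low` is my choice here because it always returns one of the actual answers, so the result can be looked up as a label even when the number of answers is even.

```python
from statistics import median_low

# Assumed mapping: score 1 -> 'very bad', ..., score 5 -> 'very good'
LABELS = ["very bad", "bad", "neutral", "good", "very good"]

scores = [4, 3, 5, 3, 2]

# Only the ordering of the scores is used; nothing is added together
middle = median_low(scores)
median_label = LABELS[middle - 1]
print(median_label)
```

For the scores in the question this gives 'neutral'.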

  • 0
    Great answer! Yes, I always do what the customer wants. In this case the numbers do represent categories, so, according to your statement about the categories 'very bad', 'good', etc., it makes sense not to round the numbers, since the numbers are just a representation of categories. 2011-08-03
0

I would use at least one decimal place (3.4). In an average you are looking for a good, statistically meaningful level of resolution, and five whole-number levels are too coarse for that.

0

I think 3.4 is the right answer, especially if the customer wants it...

Averages usually have decimals, even if the underlying numbers don't. One common example is the average number of kids, which in Europe is somewhere around 1.5.

  • 0
    I agree you should almost always give the customer what they want, since they are the ones paying for it, but not always. 2011-08-03