I'm having some trouble describing the skewness of some really rough histograms. For the following two histograms, is it correct to say that the first one is left-skewed, and the second one is right-skewed?
[Image: Skewness of very rough histogram]
-
I think it's a bit hard to tell at this level of roughness. If you had twice as many bins, it would be much easier to see. – 2017-02-26
-
That said, if you just assume all the data is at the midpoint of each bar (which is a bad assumption, but technically the best you can do), then you can just calculate the skewness, preferably in software. In this case you find that the first one is left-skewed and the second is right-skewed. – 2017-02-26
-
@Ian I guess if the only source of information is the histogram, that would be the best way? – 2017-02-26
-
I mean it's probably not the *best* thing you can do; for instance, you could assume the data is uniformly distributed within each bar or something like that. (I tried this, and the outcome is the same.) Ideally you would know a little bit about the source of the data to come up with a "refined" histogram (or you would just have the data so that you could properly refine the histogram as needed). – 2017-02-26
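The calculation discussed in the comments above can be sketched roughly as follows. This is only an illustration: the bin midpoints and counts are made-up stand-ins (the actual bar heights from the question are not reproduced here), and `binned_skewness` is a hypothetical helper name. It computes the moment coefficient of skewness $m_3/m_2^{3/2}$ from bar counts, either with every observation placed at its bin midpoint or, if a common bin width is passed, with observations spread uniformly within each bar.

```python
def binned_skewness(midpoints, counts, width=None):
    """Moment coefficient of skewness m3 / m2**1.5 computed from a histogram.

    With width=None, every observation is assumed to sit at its bin midpoint.
    With a common bin width given, observations are assumed to be uniform
    within each bar: this adds width**2 / 12 to the variance and, for
    equal-width bins, leaves the third central moment unchanged.
    """
    n = sum(counts)
    mean = sum(c * x for x, c in zip(midpoints, counts)) / n
    m2 = sum(c * (x - mean) ** 2 for x, c in zip(midpoints, counts)) / n
    m3 = sum(c * (x - mean) ** 3 for x, c in zip(midpoints, counts)) / n
    if width is not None:
        m2 += width ** 2 / 12  # variance of a uniform spread inside one bar
    return m3 / m2 ** 1.5


# Hypothetical bar heights, NOT the data from the question.
mids = [1, 2, 3, 4, 5]
long_left_tail = [1, 2, 4, 8, 5]
long_right_tail = [5, 8, 4, 2, 1]

print(binned_skewness(mids, long_left_tail))              # negative: left-skewed
print(binned_skewness(mids, long_right_tail))             # positive: right-skewed
print(binned_skewness(mids, long_left_tail, width=1.0))   # same sign, smaller size
```

With equal-width bins the uniform-within-bar assumption can only shrink the magnitude of the coefficient, never flip its sign, which is consistent with the remark that both assumptions give the same outcome.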
1 Answer
Symmetrical samples and distributions are not skewed. If a sample or distribution has a tail of small frequency or probability extending in one direction from the mean, then it is called 'skewed' in that direction.
The sample sizes illustrated in your histograms are very small. Skewness judged from a small sample can be a misleading indicator of whether the sampled population is skewed. Another sample of the same size or greater may well show no skewness, or even skewness in the opposite direction.
However, if you are given these two histograms, told that they represent skewed samples, and are asked the direction of the skew, I would say that the first one is to the left and the second one is to the right.
Note: Below are histograms of four samples, each of size $n=40.$ Reading from left to right, the first three are from a symmetrical normal population; it happens that histograms of the first and third might be called slightly 'right-skewed', and I see no obvious skewness in the second. The histogram at bottom right shows a sample from a severely right-skewed exponential population, and that skewness is accurately reflected in the histogram.
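To try a similar experiment yourself, here is a minimal sketch using only the Python standard library. The seed, the standard normal parameters, and the exponential rate of 1 are my assumptions (the moment coefficient of skewness does not depend on location or scale, so these choices should not matter much), and `sample_skewness` and `simulate` are hypothetical helper names. It draws three normal samples and one exponential sample of size $n=40$ and reports the sample skewness of each.

```python
import random


def sample_skewness(xs):
    """Moment coefficient of skewness m3 / m2**1.5 of a list of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def simulate(n=40, seed=2017):
    """Return the sample skewness of three normal samples and one exponential.

    The seed and distribution parameters are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    samples = {
        "normal 1": [rng.gauss(0, 1) for _ in range(n)],
        "normal 2": [rng.gauss(0, 1) for _ in range(n)],
        "normal 3": [rng.gauss(0, 1) for _ in range(n)],
        "exponential": [rng.expovariate(1.0) for _ in range(n)],
    }
    return {name: sample_skewness(xs) for name, xs in samples.items()}


if __name__ == "__main__":
    for name, g1 in simulate().items():
        print(f"{name:12s} skewness = {g1:+.3f}")
```

Rerunning with different seeds shows how much the skewness of a symmetric population's sample can wander at $n=40$, in either direction.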


