3

When we plot a frequency histogram, is the frequency equal to the area of the bar or to the height of the bar? Is $\text{height} = \frac{\text{frequency}}{\text{size of the class}}$? If so, how should the vertical axis be labeled?

For example, based on the following data,

    Class      Frequency
    [0,9)      10
    [10,19)    20
    [20,39)    40

What is the label for the vertical axis: "frequency" or "frequency density"? Is the height of the first class 10 or 1.0? For the third class, is it 20, 2.0, or 40?

Possible solution (or answer)
Based on what I have read in several links, when the class widths differ, one needs to use "Frequency density" for the vertical axis. Then $\text{frequency} = \text{height} \times \text{width of the class}$.

If the classes all have the same width, one can use "Frequency" for the vertical axis, and the height of each bar then gives the frequency directly.
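To make the frequency-density rule concrete, here is a minimal Python sketch applied to the table above. It assumes the class boundaries are read literally (so the widths are 9, 9 and 19, not 10, 10 and 20); the function name `frequency_density` is just an illustrative choice.

```python
# Classes taken literally from the table: (lower, upper, frequency).
# Assumption: width = upper - lower, so the widths are 9, 9 and 19.
classes = [(0, 9, 10), (10, 19, 20), (20, 39, 40)]


def frequency_density(lower, upper, frequency):
    """Bar height chosen so that height * class width == frequency."""
    return frequency / (upper - lower)


for lower, upper, freq in classes:
    width = upper - lower
    height = frequency_density(lower, upper, freq)
    # The area of each bar recovers the original frequency.
    print(f"[{lower},{upper}): width={width}, "
          f"height={height:.3f}, area={height * width:.1f}")
```

Under that reading the heights come out to roughly 1.11, 2.22 and 2.11, so none of 10, 1.0, 20, 2.0 or 40 is the frequency density; if the classes were meant to have widths 10, 10 and 20, the heights would be 1.0, 2.0 and 2.0.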

  • 0
    But you can use either frequency or relative frequency, as you said, as long as you respect the proportionality principle. If you use net frequency, the units on the y-axis would be just the units used for the variable y, e.g., days of rain or lbs (e.g., if doing a histogram for the weights of a collection of students). If you are doing a relative frequency histogram, then you just use %. 2011-09-07

2 Answers

4

Do it in such a way that the area of each box is proportional to the number of data points, or to the probability (depending on what kind of histogram it is).

The units on the vertical axis should be the reciprocal of the units on the horizontal axis, since the product of height and width gives a probability, which needs to be dimensionless. For example, if $x$ is in inches, $y$ is in units per inch or percent per inch.
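As a rough illustration of the units argument, the sketch below computes heights in "fraction of the data per unit of $x$", so that each bar's area is a dimensionless proportion and the areas sum to 1. The function name `density_heights` and the sample classes (borrowed from the question, with boundaries read literally) are assumptions made only for this example.

```python
def density_heights(edges, counts):
    """Heights in 'fraction of data per x-unit'; bar areas then sum to 1."""
    total = sum(counts)
    return [c / total / (hi - lo) for (lo, hi), c in zip(edges, counts)]


# Sample classes from the question, boundaries taken literally (assumption).
edges = [(0, 9), (10, 19), (20, 39)]
counts = [10, 20, 40]

heights = density_heights(edges, counts)
# height (per x-unit) * width (x-units) = dimensionless proportion
areas = [h * (hi - lo) for h, (lo, hi) in zip(heights, edges)]

print(heights)
print(sum(areas))
```

If $x$ were measured in inches, each entry of `heights` would be "proportion per inch", and multiplying by a width in inches cancels the unit.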

0

The vertical axis can be used to describe either the net frequency or the relative frequency, as you said. In your example, the vertical axis would describe the frequency, and you can then label the y-axis as frequency; otherwise, you can label it relative frequency.

EDIT: Regarding the issue of the height: as long as the bases of the rectangles have equal width (which I initially assumed in my answer), the height you use does not really matter, as long as the heights of the bars are proportional to the frequencies. If you choose a height of 10 for the interval [0,9), then you should use a height of 20 for the interval [10,19) (2 times 10, since the frequency of values in [10,19) is twice that of the values in [0,9)), and a height of 40 for the third interval [20,39). Similarly, you can choose a height of 1.0 for [0,9), but then the height above [10,19) should be 2.0, and the height above [20,39) should be 4.0.

This last is an incomplete version of the Area Principle. A complete version of the Area Principle allows the rectangles (i.e., the partition along the horizontal axis) to have different widths. All that really matters then is that the ratios between the areas are the same as the ratios between the frequencies or between the relative frequencies.
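The Area Principle can be written out as a short derivation. The symbols here are notation introduced only for this sketch: $f_i$ is the frequency of class $i$, $w_i$ its width, $h_i$ its height, $A_i = h_i w_i$ its area, and $c > 0$ is an arbitrary scale factor.

$$h_i = c\,\frac{f_i}{w_i} \quad\Longrightarrow\quad \frac{A_i}{A_j} = \frac{h_i w_i}{h_j w_j} = \frac{c\,f_i}{c\,f_j} = \frac{f_i}{f_j}.$$

For instance, if the widths are read literally from the table as 9, 9 and 19, then fixing the first height at 10 gives $c = 9$, so the heights are $10$, $20$ and $9 \cdot 40/19 \approx 18.95$. The areas are then $90$, $180$ and $360$, in the ratio $1:2:4$ of the frequencies. With equal widths the $w_i$ cancel and the heights alone carry the proportion, which is the equal-width case described above.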

  • 0
    Gary, thanks. But the relative frequency only rescales the total to 1.00. It seems to have nothing to do with the class width. 2011-09-08