So my observations are as follows, with their individually corresponding probabilities:

18  => 10%
18  => 20%
40  => 20%
90  => 20%
90  => 10%
110 => 20%

Should equal observations be summed as follows, since I have two observations of 18 and two of 90?

18  => 30%
40  => 20%
90  => 30%
110 => 20%

Does it make sense to calculate the 1st percentile (P1) for these observations? I can't figure out whether it should be 18 or 0. I need to calculate all percentiles from 1 to 100 in order to plot the cumulative distribution.

What is P10?

What is P30?

What is P100?

  • Unless I misunderstood the question, if you make a test case where you know the frequencies of each observation, I think you could convince yourself that it does make sense to just add those percentages. (2011-05-11)

1 Answer

You are right, it makes sense to bucket equal observations together.
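As a rough illustration of that bucketing step, here is a minimal Python sketch. The variable name `observations` and the helper `merge_equal_observations` are made up for this example, and the probabilities are kept as whole percentages so the sums stay exact.

```python
from collections import defaultdict

# (value, probability in percent) pairs exactly as listed in the question.
observations = [(18, 10), (18, 20), (40, 20), (90, 20), (90, 10), (110, 20)]


def merge_equal_observations(pairs):
    """Sum the probabilities of equal values; return (value, percent) pairs sorted by value."""
    totals = defaultdict(int)
    for value, percent in pairs:
        totals[value] += percent
    return sorted(totals.items())


merged = merge_equal_observations(observations)
print(merged)  # [(18, 30), (40, 20), (90, 30), (110, 20)]
```

The merged list matches the summed table in the question, and the probabilities still add up to 100%.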

Recall that the cumulative distribution function $F(x)$ for a random variable $X$ is defined to be $F(x) = P(X \leq x)$. Therefore you have

$F(x) = 0$ for $x<18$

$F(x) = 0.3$ for $18 \leq x < 40$

$F(x) = 0.5$ for $40 \leq x < 90$

$F(x) = 0.8$ for $90 \leq x < 110$

$F(x) = 1$ for $110 \leq x$
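The step function above could be evaluated along these lines. This is only a sketch: it assumes the usual convention $F(x) = P(X \leq x)$, so each jump point already includes its own probability. The names `values`, `percents` and `cdf_percent` are invented for the example, and the result is returned in percent to avoid floating-point rounding.

```python
from bisect import bisect_right
from itertools import accumulate

# Merged observations and their probabilities in percent.
values = [18, 40, 90, 110]
percents = [30, 20, 30, 20]

# Running totals of the probabilities: [30, 50, 80, 100].
cumulative = list(accumulate(percents))


def cdf_percent(x):
    """Return 100 * P(X <= x) for the discrete distribution above."""
    k = bisect_right(values, x)  # how many observed values are <= x
    return cumulative[k - 1] if k > 0 else 0


for x in (17, 18, 40, 89, 90, 110):
    print(x, cdf_percent(x))
```

Plotting `cdf_percent` over a grid of `x` values gives the staircase-shaped cumulative distribution, with jumps at 18, 40, 90 and 110.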

The quantile function (sometimes called the percentile function) is the inverse of the cumulative distribution function. In the case of a discrete distribution (as you have) the cdf is piecewise constant, and the quantile function of $p$ can be defined as the infimum over all $x$ which satisfy $F(x)\geq p$, that is

$Q(p) = \inf \{ x : F(x)\geq p \}$

This means, for example, that P10 $:= Q(0.1) = \inf \{ x : F(x) \geq 0.1 \} = 18$.
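To get every percentile from 1 to 100, the infimum definition can be applied directly. The following is a small sketch under that definition only; other percentile conventions (for example, ones that interpolate between values) would give different numbers. The function name `percentile` and the dictionary `table` are chosen just for this illustration.

```python
from itertools import accumulate

# Merged observations and their probabilities in percent.
values = [18, 40, 90, 110]
percents = [30, 20, 30, 20]
cumulative = list(accumulate(percents))  # [30, 50, 80, 100]


def percentile(p):
    """Return Q(p/100) = inf{x : F(x) >= p/100} for an integer p between 1 and 100."""
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    # Scan the values in increasing order and stop at the first one
    # whose cumulative probability reaches p percent.
    for value, cum in zip(values, cumulative):
        if cum >= p:
            return value


table = {p: percentile(p) for p in range(1, 101)}
print(table[1], table[10], table[30], table[31], table[100])  # 18 18 18 40 110
```

Under this definition P1 is 18 rather than 0, P30 is still 18 because $F(18) = 0.3 \geq 0.3$, and P100 is 110.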

Further reading: http://en.wikipedia.org/wiki/Quantile_function