
Consider $X\sim Bin(40, 0.25), Y\sim Bin(80, 0.25)$. It is easy to calculate that:

$$P(X\gt 15) \simeq 0.0262\\ P(Y > 30)\simeq 0.004$$
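For readers who want to check these figures, here is a minimal sketch that sums the exact binomial pmf using only the Python standard library. The helper name `binom_upper_tail` is made up for this illustration; the printed values should land close to the ones quoted above.

```python
from math import comb

def binom_upper_tail(n, p, k):
    """Return P(X > k) for X ~ Bin(n, p) by summing the exact pmf."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1, n + 1))

p_x = binom_upper_tail(40, 0.25, 15)  # P(X > 15), X ~ Bin(40, 0.25)
p_y = binom_upper_tail(80, 0.25, 30)  # P(Y > 30), Y ~ Bin(80, 0.25)

print(f"P(X > 15) = {p_x:.4f}")
print(f"P(Y > 30) = {p_y:.4f}")
```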

My question is: why do we get these results? At first glance, one might expect the two probabilities to be equal, since $15/40 = 30/80$.

How can these results be explained? Does it have something to do with the CLT?

  • 1
    Have you looked at, say, the variances? This may already give you a hint. (To see how "peaked" the two distributions are, i.e. concentrated around the mean.) 2017-01-11
  • 1
    Here is an interesting, related question: http://math.stackexchange.com/q/2036557/371419 2017-01-12

1 Answer

As @ClementC has commented, you need to take variation into account.

For $X \sim Binom(40, 1/4),$ we have mean $E(X) = \mu = np = 10,$ and $SD(X) = \sigma = \sqrt{np(1-p)} \approx 2.74.$ Thus a random observation $X = 15$ lies only $Z = (15 - \mu)/\sigma \approx 1.83$ standard deviations above the mean.

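For $Y \sim Binom(80, 1/4),$ we have mean $E(Y) = \mu = np = 20,$ and $SD(Y) = \sigma = \sqrt{np(1-p)} \approx 3.87.$ Thus a random observation $Y = 30$ lies $Z = (30 - \mu)/\sigma \approx 2.58$ standard deviations above the mean. Doubling $n$ doubles the mean but multiplies the SD only by $\sqrt{2},$ so the same proportion $30/80 = 15/40$ sits farther out in the tail for $Y$ than for $X.$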
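More generally, for a cutoff at a fixed proportion $c$ of $n$, the z-score is $(c - p)\sqrt{n}/\sqrt{p(1-p)}$, which grows like $\sqrt{n}$. The following is a small sketch of that calculation; the helper `z_score` is a name chosen for this example only.

```python
from math import sqrt

def z_score(n, p, k):
    """Number of standard deviations by which k exceeds the mean of Bin(n, p)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return (k - mu) / sigma

z_x = z_score(40, 0.25, 15)  # cutoff 15 for X ~ Bin(40, 1/4)
z_y = z_score(80, 0.25, 30)  # cutoff 30 for Y ~ Bin(80, 1/4)

print(f"z for X = 15: {z_x:.2f}")
print(f"z for Y = 30: {z_y:.2f}")
print(f"ratio: {z_y / z_x:.4f}")  # equals sqrt(2), since n was doubled

# Same proportion 3/8 for larger n: the cutoff drifts farther into the tail.
for n in (40, 80, 160, 320):
    print(n, round(z_score(n, 0.25, 3 * n / 8), 2))
```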

These 'z-scores' or 'standard scores' are used in approximating binomial probabilities with the standard normal distribution, an approximation suggested by the CLT. For example, $$P(X \le 15) = P(X < 15.5) = P\left(\frac{X - \mu_X}{\sigma_X} < \frac{15.5 - 10}{2.74}\right) \approx P(Z < 2.01) = 0.9778,$$ where $Z$ has a standard normal distribution. Thus, $P(X > 15) \approx 1 - 0.9778 = 0.0222.$ As you say, the exact binomial value is $P(X > 15) \approx 0.0262.$ [Generally, don't expect more than 2-place accuracy from a normal approximation.]
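The comparison between the normal approximation and the exact binomial tail can be sketched as follows, assuming Python 3.8+ for `statistics.NormalDist`. The function names `exact_upper_tail` and `normal_upper_tail` are hypothetical helpers for this illustration. Because the code uses the unrounded z-score, the last digit may differ slightly from the table-based value above.

```python
from math import comb, sqrt
from statistics import NormalDist

def exact_upper_tail(n, p, k):
    """Exact P(X > k) for X ~ Bin(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1, n + 1))

def normal_upper_tail(n, p, k):
    """Normal approximation to P(X > k), cutting at k + 0.5 (continuity correction)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return 1 - NormalDist(mu, sigma).cdf(k + 0.5)

results = {}
for label, n, k in (("X", 40, 15), ("Y", 80, 30)):
    exact = exact_upper_tail(n, 0.25, k)
    approx = normal_upper_tail(n, 0.25, k)
    results[label] = (exact, approx)
    print(f"P({label} > {k}): exact = {exact:.4f}, normal approx = {approx:.4f}")
```

In both cases the approximation should agree with the exact value to roughly two decimal places, in line with the caveat above.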

The following figure illustrates the desired probabilities (above the vertical red line in each case) in comparison with the two distributions.

[Figure: the two binomial distributions, each with a vertical red line marking the cutoff.]

  • 0
    Gosh, thanks for the detailed answer - I really appreciate it! 2017-01-12
  • 0
    By the way, why did you calculate it for $X>15.5$ and not just $X > 15$? 2017-01-12
  • 0
    'Continuity correction'. Binomial event $\{X=15\}$ is approximated by normal event $\{14.5 < X \le 15.5\}.$ 2017-01-12
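To illustrate the continuity correction mentioned in the last comment, here is a rough sketch comparing the exact binomial probability of $\{X = 15\}$ with the normal probability of the interval $(14.5, 15.5]$. It assumes Python 3.8+ for `statistics.NormalDist`; the variable names are chosen for this example only.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 40, 0.25, 15
mu = n * p
sigma = sqrt(n * p * (1 - p))
normal = NormalDist(mu, sigma)

# Exact binomial probability of the single point k.
exact_point = comb(n, k) * p**k * (1 - p)**(n - k)

# Normal probability of the unit-width interval centred on k.
approx_point = normal.cdf(k + 0.5) - normal.cdf(k - 0.5)

print(f"exact  P(X = 15)          = {exact_point:.4f}")
print(f"normal P(14.5 < N <= 15.5) = {approx_point:.4f}")
```

Each integer value gets a unit-width interval, so the discrete event $\{X > 15\} = \{X \ge 16\}$ corresponds to the normal event above $15.5$, not above $15$.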