
For a normal approximation of the binomial to be valid,

$np > 5$

$n(1 - p) > 5$

What is the algebraic proof for this and where does the $5$ come from?

  • 0
    There is no proof. I imagine that beyond this threshold the overall error is quite acceptable, say less than 1%. 2017-01-19
  • 0
    A very interesting article on this subject is here: http://stochastik-in-der-schule.de/sisonline/struktur/jahrgang21-2001/heft1/2001-1_Eich.pdf (beware, it's written in German). 2017-01-19
  • 0
    Interestingly enough, that article seems to refer to a different rule of thumb, namely that the approximation is *sufficiently accurate* when $np (1 - p) \gt 9$. 2017-01-19

1 Answer

As @JeanMarie comments, there is no proof. This is a very rough rule of thumb. I suppose it survives because it is considered 'easier to remember' than better criteria. (And it is a little better if you use 10 instead of 5.) You can see a discussion of several better 'rules' in the Wikipedia article on 'binomial distribution' under normal approximation.

Especially for small $n,$ a major consideration is that the normal approximation to the binomial works best for $p$ near 1/2, so that the binomial is nearly symmetrical, and thus easier to approximate with the symmetrical normal.
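The effect of skewness can be checked numerically. The following is a minimal sketch in R; the helper `max_cdf_error` is a hypothetical name introduced here for illustration, not something from a package. It returns the largest absolute gap between the exact binomial CDF and the normal approximation with continuity correction.

```r
# Hypothetical helper (not from any package): largest absolute difference
# between the exact binomial CDF and the normal approximation with
# continuity correction, taken over k = 0, 1, ..., n.
max_cdf_error <- function(n, p) {
  k <- 0:n
  exact  <- pbinom(k, n, p)
  approx <- pnorm(k + 0.5, mean = n * p, sd = sqrt(n * p * (1 - p)))
  max(abs(exact - approx))
}

err_symmetric <- max_cdf_error(20, 0.5)  # p = 1/2: symmetric binomial
err_skewed    <- max_cdf_error(20, 0.1)  # p = 0.1: right-skewed binomial
c(symmetric = err_symmetric, skewed = err_skewed)
```

With $n = 20,$ the worst-case error for $p = 1/2$ should come out clearly smaller than for $p = 0.1.$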

One of the easiest to remember of the better rules is to have both $np/q > 9$ and $nq/p > 9,$ where $q = 1-p.$ This ensures that $\mu \pm 3\sigma,$ with $\mu = np$ and $\sigma = \sqrt{npq},$ lies within $[0,n],$ so that standard normal values in $(-3,3)$ correspond to attainable binomial values.
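The origin of the $9$ can be sketched as follows, assuming (as the paragraph above suggests) that the rule asks for three standard deviations on either side of the mean to fall inside $[0,n],$ with $\mu = np,$ $\sigma = \sqrt{npq}$ and $0 < p < 1$:

$$\begin{aligned}
np - 3\sqrt{npq} > 0 &\iff \sqrt{np} > 3\sqrt{q} \iff np/q > 9, \\
np + 3\sqrt{npq} < n &\iff nq > 3\sqrt{npq} \iff \sqrt{nq} > 3\sqrt{p} \iff nq/p > 9.
\end{aligned}$$

Asking for only $2$ standard deviations instead of $3$ gives, by the same steps, the weaker conditions $np/q > 4$ and $nq/p > 4.$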

With the current wide availability of statistical software, there is seldom a reason in practice to use a normal approximation. For example, if $X \sim Binom(100, 0.1),$ then the exact value of $P(X \le 5) = 0.0576$ can be obtained in R statistical software as follows:

pbinom(5, 100, .1)
## 0.05757689

By contrast, about the most accurate value the usual normal approximation can give (including continuity correction) is 0.0668, obtained by:

n = 100;  p = .1;  q = 1-p
mu = n*p;  sg = sqrt(n*p*q)   # mean and SD of the approximating normal
pnorm(5.5, mu, sg)            # continuity correction: use 5.5 instead of 5
## 0.0668072

Generally, it is not prudent to expect more than two-place accuracy for a normal approximation, although there are even cases that violate rules of thumb and still give surprisingly good results. Consider $P(X = 2) = P(1.5 < X \le 2.5) = 3/8 = 0.3750,$ with $X \sim Binom(3, 1/2)$. The normal approximation with continuity correction gives 0.3759.

n = 3;  p = .5;  q = 1-p
mu = n*p;  sg = sqrt(n*p*q)       # mean and SD of the approximating normal
diff(pnorm(c(1.5,2.5), mu, sg))   # normal probability of the interval (1.5, 2.5)
## 0.3758935
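To see how the two examples relate to the rules of thumb, here is a minimal sketch in R. The helper `check_rules` is a hypothetical name introduced here for illustration; it simply evaluates the simple rule ($np > 5$ and $n(1-p) > 5$) and the rule with $np/q > 9$ and $nq/p > 9.$

```r
# Hypothetical helper (not from any package): evaluates the two rules of
# thumb discussed above for X ~ Binom(n, p), assuming 0 < p < 1.
check_rules <- function(n, p) {
  q <- 1 - p
  c(simple   = (n * p > 5) && (n * q > 5),          # np > 5 and n(1-p) > 5
    three_sd = (n * p / q > 9) && (n * q / p > 9))  # np/q > 9 and nq/p > 9
}

rules_100 <- check_rules(100, 0.1)  # the Binom(100, 0.1) example
rules_3   <- check_rules(3, 0.5)    # the Binom(3, 1/2) example
rules_100
rules_3
```

$Binom(100, 0.1)$ satisfies both rules, yet its approximation is only moderately accurate, while $Binom(3, 1/2)$ violates both and still gives a close result for $P(X = 2).$ This suggests treating such rules as rough guides rather than guarantees.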

Reference: J. Pitman, *Probability*, Springer, 1993, gives a more sophisticated and accurate method of normal approximation than the one in general use.

  • 1
    I got it now: we need $np > 9q$, and because $q \leq 1$, it is enough to have $np > 9$, or $np \geq 10$. If we dumb it down to $2$ standard deviations, we need $np > 4q$, for which $np \geq 5$ is enough. Thanks for the insight! 2017-01-23
  • 1
    Great! Wish I'd said that. 2017-01-23