
Let $Y$ be an $N\left(p, \frac{p q}{n}\right)$-distributed random variable. We want to find the smallest $\delta = \delta(n)$ such that the probability that the interval $J = [Y-\delta, Y+\delta]$ covers $p$ is at least $0.95$ ($95\%$) for every $p$.

What does $\frac{p q}{n}$ mean in the normal distribution $N\left(p, \frac{p q}{n} \right)$?

Does anyone have any advice on how I could solve this?

  • You just asked essentially the same question for the average of $n$ Bernoulli$(p)$ random variables (and thus mean $p$ and variance $p(1-p)/n$) [here](http://math.stackexchange.com/q/88665/15941). Couldn't you have at least linked the questions together? I just spent time answering the other one. (2011-12-06)

1 Answer

If $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$, then with probability $0.95$, $\mu -1.96\sigma \le X \le \mu +1.96\sigma. \qquad\qquad(\ast)$ The $1.96$ is (approximately) the value $z$ such that with probability $0.025$, $Z > z$, where $Z$ has the standard normal distribution. Similarly, if we want a probability of only $0.005$ in each tail, we would use $2.58$ instead of $1.96$. In the old days this kind of information was found in tables. Now the tables are available online, and the information is also a built-in feature of spreadsheets and other programs.
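The constants above can also be reproduced directly rather than looked up. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `two_sided_z` is an illustrative choice, not something from the question.

```python
from statistics import NormalDist


def two_sided_z(coverage: float) -> float:
    """Return z such that P(-z <= Z <= z) = coverage for standard normal Z."""
    tail = (1.0 - coverage) / 2.0          # probability left in each tail
    return NormalDist().inv_cdf(1.0 - tail)


z95 = two_sided_z(0.95)   # 0.025 in each tail
z99 = two_sided_z(0.99)   # 0.005 in each tail
print(round(z95, 3), round(z99, 3))  # prints: 1.96 2.576
```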

In our case, we are assuming that $Y$ is normal with mean $p$ and variance $\frac{pq}{n}$. Because of the standard notation that is used in the question, I expect that $p$ is a probability, that is, $0 \le p \le 1$, and $q=1-p$.

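We want a $\delta$ that, as the post says, works for every $p$. Note that $p(1-p)$ takes on its maximum value at $p=\frac{1}{2}$, so our variance is $\le \frac{1}{4n}$ and our standard deviation is $\le \frac{1}{2\sqrt{n}}$. Widening the interval in $(\ast)$ can only increase its probability, so for any $p$, with probability at least $0.95$, $p-1.96\frac{1}{2\sqrt{n}} \le Y \le p+1.96\frac{1}{2\sqrt{n}}.$ The above inequalities can be rewritten as $Y-1.96\frac{1}{2\sqrt{n}}\le p \le Y+1.96\frac{1}{2\sqrt{n}}.$ That means that in the notation of the problem we can take $\delta=1.96\frac{1}{2\sqrt{n}} = \frac{0.98}{\sqrt{n}}.$ No smaller $\delta$ works for every $p$, because at $p=\frac{1}{2}$ the variance is exactly $\frac{1}{4n}$ and the probability is exactly $0.95$.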
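As a numerical check of this choice of $\delta$, here is a small sketch that computes the half-width for a given $n$ and evaluates the coverage probability under the exact normal model $N\left(p, \frac{pq}{n}\right)$. The function names `half_width` and `coverage_at` and the value $n=100$ are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def half_width(n: int, coverage: float = 0.95) -> float:
    """delta = z / (2 sqrt(n)), using the worst-case variance 1/(4n)."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return z / (2.0 * sqrt(n))


def coverage_at(p: float, n: int, delta: float) -> float:
    """P(|Y - p| <= delta) when Y ~ N(p, p(1-p)/n), for 0 < p < 1."""
    sigma = sqrt(p * (1.0 - p) / n)
    y = NormalDist(mu=p, sigma=sigma)
    return y.cdf(p + delta) - y.cdf(p - delta)


n = 100                      # illustrative sample size
delta = half_width(n)        # about 0.098
for p in (0.5, 0.4, 0.2):
    # Coverage equals 0.95 at p = 0.5 and exceeds it elsewhere.
    print(p, round(coverage_at(p, n, delta), 4))
```

The coverage is smallest at $p=\frac{1}{2}$, which is why the worst-case variance determines $\delta$.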

Comment: The graph of $y=x(1-x)$ is flat at $x=1/2$. What this means is that our variance, which is exactly $\frac{1}{4n}$ when $p=1/2$, does not shrink very much when $p$ is not far from $1/2$, like $p=0.4$. Thus our estimate for the variance, which is exact when $p=1/2$, is not too far off the truth even when $p$ is some distance from $1/2$.
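To put numbers on how flat the curve is, the sketch below prints the true standard deviation as a fraction of the worst-case value $\frac{1}{2\sqrt{n}}$; the ratio $2\sqrt{p(1-p)}$ does not depend on $n$. The helper name `sd_ratio` and the sample values of $p$ are illustrative assumptions.

```python
from math import sqrt


def sd_ratio(p: float) -> float:
    """sqrt(p(1-p)/n) divided by the worst case 1/(2 sqrt(n)); n cancels."""
    return 2.0 * sqrt(p * (1.0 - p))


for p in (0.5, 0.4, 0.3, 0.1):
    print(p, round(sd_ratio(p), 3))
```

At $p=0.4$ the ratio is about $0.98$, so the worst-case bound overstates the standard deviation by only about $2\%$; at $p=0.1$ the ratio drops to $0.6$, and the bound becomes noticeably conservative.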

In the context in which the question arose, $Y$ is almost certainly a sample proportion, so it does not have an exactly normal distribution. But if $n$ is large, and $p$ is not too close to $0$ or $1$, the distribution of $Y$ is well-approximated by the normal.
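One way to see how well the approximation holds is a quick simulation in which $Y$ really is a sample proportion of $n$ Bernoulli$(p)$ trials. This is only a sketch: the values $p=0.3$, $n=400$, the number of trials, and the seed are arbitrary choices, and the function name `simulated_coverage` is hypothetical.

```python
import random
from math import sqrt


def simulated_coverage(p: float, n: int, trials: int = 2000,
                       z: float = 1.96, seed: int = 0) -> float:
    """Fraction of simulated sample proportions Y with |Y - p| <= delta."""
    rng = random.Random(seed)              # fixed seed for reproducibility
    delta = z / (2.0 * sqrt(n))            # worst-case half-width
    hits = 0
    for _ in range(trials):
        successes = sum(rng.random() < p for _ in range(n))
        y = successes / n                  # sample proportion
        if abs(y - p) <= delta:
            hits += 1
    return hits / trials


cov = simulated_coverage(p=0.3, n=400)
print(cov)
```

Because $\delta$ uses the worst-case variance, the simulated coverage for $p=0.3$ should come out somewhat above $0.95$ rather than exactly at it.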