Suppose we have data from a normally distributed population with standard deviation $2.6$. What sample size is needed to ensure that, with $99\%$ probability, the sample mean is in error by at most $0.25$?
Are you assuming the population mean of the normally distributed population is 0? "the mean of the sample will be in error by at most 0.25?" Are you asking if $P\left(\left\vert \bar{x} - \mu \right\vert < 0.25 \mid \text{sample size } n\right) = 0.99$? – 2012-12-16
1 Answer
It seems you are talking about the margin of error of a confidence interval for the mean. The general formula for the interval is
$\bar{X} \pm z^* \cdot \frac{\sigma}{\sqrt{n}} $
where $\bar{X}$ is the sample mean, $\sigma$ is the population standard deviation, $z^*$ is the critical value from the standard normal distribution (it depends on the confidence level), and $n$ is the sample size.
We want the term on the right, the margin of error (denoted $ME$ below), to be less than or equal to a fixed size. Thus we need
$z^* \cdot \frac{\sigma}{\sqrt{n}} \leq ME$
Rearranging to solve for $n$ gives
$n \geq \left( \frac{z^* \sigma}{ME} \right)^2$
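The only intermediate step is isolating $\sqrt{n}$. Because $z^*$, $\sigma$, $ME$ and $\sqrt{n}$ are all positive, multiplying through by $\sqrt{n}/ME$ and then squaring both preserve the direction of the inequality:
$z^* \cdot \frac{\sigma}{\sqrt{n}} \leq ME \iff \sqrt{n} \geq \frac{z^* \sigma}{ME} \iff n \geq \left( \frac{z^* \sigma}{ME} \right)^2$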
The critical value for a 99% confidence interval is $z^* = 2.5758$ (the 99.5th percentile of a Standard Normal distribution to 4 decimal places). Then using your values of $\sigma = 2.6$ and $ME = 0.25$ gives
$n \geq \left( \frac{(2.5758) (2.6)}{0.25} \right)^2 \approx 717.6$
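As a quick numerical check of this arithmetic, here is a minimal Python sketch that uses only the standard library. The function name `z_sample_size` and its parameter names are made up for illustration; the critical value is taken as the $(1 + \text{confidence})/2$ quantile of the standard normal, which for $99\%$ is the 99.5th percentile mentioned above.

```python
import math
from statistics import NormalDist


def z_sample_size(sigma, margin, confidence):
    """Smallest n with z* * sigma / sqrt(n) <= margin (known population sigma).

    Hypothetical helper for illustration; names are not from the original post.
    """
    # Two-sided critical value: the (1 + confidence) / 2 quantile of N(0, 1).
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    # n >= (z* * sigma / ME)^2, rounded up to the next whole observation.
    return math.ceil((z_star * sigma / margin) ** 2)


print(z_sample_size(sigma=2.6, margin=0.25, confidence=0.99))
```

With the values from the question this should print the same sample size as the hand calculation, since the bound is simply rounded up to the next integer.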
Thus $n = 718$ will do as we wish.
The way you stated the problem seemed to suggest that $2.6$ was the population standard deviation. If it was actually the sample standard deviation then the formula is a bit different. In that case the confidence interval is given by
$\bar{X} \pm t_{df}^* \cdot \frac{s}{\sqrt{n}}$,
where the degrees of freedom are $df = n - 1$. The main issue here is that the critical $t$ value depends not only on the confidence level but also on the sample size. In any case, for a given confidence level, $t_{df}^* > z^*$, so using $z^* = 2.5758$ will underestimate the required sample size, but not by much. If we use $n = 718$ as an initial estimate of the sample size, then $df = 718 - 1 = 717$, and for a 99% confidence interval $t_{717}^* = 2.5827$. This gives $n \geq \left( \frac{(2.5827)(2.6)}{0.25} \right)^2 \approx 721.5$, so $n = 722$, and another iteration has no effect on the result.
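The iteration just described can be sketched in Python as follows. To stay within the standard library, this sketch does not compute exact $t$ quantiles. It uses a standard large-$df$ series expansion of the $t$ quantile around the normal quantile, which is an assumption of this example and is only reliable when $df$ is large, as it is here. The names `t_critical_approx` and `t_sample_size` are hypothetical. A statistics library with an exact $t$ quantile function could be substituted for the approximation.

```python
import math
from statistics import NormalDist


def t_critical_approx(p, df):
    """Approximate p-quantile of Student's t with df degrees of freedom.

    Assumption of this sketch: a two-term large-df expansion around the
    normal quantile z. It is not exact and degrades for small df.
    """
    z = NormalDist().inv_cdf(p)
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    return z + g1 / df + g2 / df**2


def t_sample_size(s, margin, confidence, max_iter=50):
    """Iterate n -> ceil((t*_{n-1} * s / margin)^2) until n stops changing."""
    p = (1 + confidence) / 2
    # Start from the z-based estimate, as in the text (at least 2 so df >= 1).
    z_star = NormalDist().inv_cdf(p)
    n = max(2, math.ceil((z_star * s / margin) ** 2))
    for _ in range(max_iter):
        t_star = t_critical_approx(p, n - 1)
        n_next = max(2, math.ceil((t_star * s / margin) ** 2))
        if n_next == n:
            break  # fixed point: another iteration has no effect
        n = n_next
    return n


print(t_sample_size(s=2.6, margin=0.25, confidence=0.99))
```

Starting from the $z$-based estimate keeps the number of iterations small, because the $t$ critical value changes very slowly with $df$ once $df$ is in the hundreds.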
The other issue is that $s$ also depends on the sample: it is itself only an estimate of $\sigma$, the population standard deviation. A larger sample should just give a better estimate of $\sigma$, but a change in $s$ can still change the sample size needed to achieve the desired $ME$. In general, except in the case of very small samples, neither of these sources of inaccuracy will be significant. If you use the size of the sample from which $s$ was computed to determine the critical $t$ value, your estimate of the required sample size should be at least large enough as far as the critical value is concerned, because $t_{df}^*$ decreases as $n$, and thus $df$, increases.