
We have data that come from a normally distributed population with standard deviation $2.6$. What sample size is needed to ensure that, with $99\%$ probability, the mean of the sample will be in error by at most $0.25$?

  • Are you assuming the population mean of the normally distributed population is 0? By "the mean of the sample will be in error by at most 0.25", are you asking whether $P\left(\left\vert \bar{x} - \mu \right\vert < 0.25 \mid \text{sample size } n\right) = 0.99$? (2012-12-16)

1 Answer

It seems you are asking about the margin of error of a confidence interval for the mean. The general formula for the interval is

$\bar{X} \pm z^* \cdot \frac{\sigma}{\sqrt{n}} $

where $\bar{X}$ is the sample mean, $\sigma$ is the population standard deviation, $z^*$ is the critical value from the standard normal distribution (it depends on the confidence level), and $n$ is the sample size.

We want the term on the right, the margin of error (denoted $ME$ below), to be less than or equal to a fixed size. Thus we need

$z^* \cdot \frac{\sigma}{\sqrt{n}} \leq ME$

Rearranging to solve for $n$ gives

$n \geq \left( \frac{z^* \sigma}{ME} \right)^2$

The critical value for a 99% confidence interval is $z^* = 2.5758$ (the 99.5th percentile of a Standard Normal distribution to 4 decimal places). Then using your values of $\sigma = 2.6$ and $ME = 0.25$ gives

$n \geq \left( \frac{(2.5758) (2.6)}{0.25} \right)^2 \approx 717.6$
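As a quick numerical check of this arithmetic, here is a minimal sketch using only the Python standard library. The function name `sample_size_known_sigma` and its parameters are made up for illustration; it assumes a two-sided interval and a known population standard deviation, as in the formula above.

```python
import math
from statistics import NormalDist


def sample_size_known_sigma(sigma, margin, confidence):
    """Smallest integer n with z* * sigma / sqrt(n) <= margin (hypothetical helper)."""
    # Two-sided interval: half of the leftover probability goes in each tail,
    # so a 99% level uses the 99.5th percentile of the standard normal.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n >= (z* * sigma / ME)^2, rounded up to the next whole observation.
    return math.ceil((z_star * sigma / margin) ** 2)


n = sample_size_known_sigma(sigma=2.6, margin=0.25, confidence=0.99)
print(n)
```

Rounding up with `math.ceil` matters here: rounding $717.6$ down would leave the margin of error slightly above $0.25$.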

Thus $n = 718$ is the smallest sample size that meets the requirement. The way you stated the problem seemed to suggest that $2.6$ was the population standard deviation. If it was actually the sample standard deviation, then the formula is a bit different. In that case the confidence interval is given by

$\bar{X} \pm t_{df}^* \cdot \frac{s}{\sqrt{n}} $

where the degrees of freedom are $df = n - 1$. The main issue here is that the critical $t$ value depends not only on the confidence level but also on the sample size. In any case, for a given confidence level, $t_{df}^* > z^*$, so using $z^* = 2.5758$ will underestimate the required sample size, but not by much. If we use $n = 718$ as an initial estimate, then $df = 718 - 1 = 717$ and, for a 99% confidence interval, $t_{717}^* = 2.5827$, which gives $n \geq 722$. Another iteration does not change the result.
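The iteration just described can be sketched in a few lines of Python. To stay within the standard library, this sketch approximates the $t$ critical value with the first two correction terms of the usual large-$df$ series expansion around the normal quantile. That approximation is an assumption made here for illustration, not part of the argument above, and it is only suitable when $df$ is large. An exact quantile function such as `scipy.stats.t.ppf` could be substituted. Both function names are hypothetical.

```python
import math
from statistics import NormalDist


def t_critical_approx(confidence, df):
    """Approximate two-sided t critical value for large df (illustrative only)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Series expansion of the t quantile in powers of 1/df around z.
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    return z + g1 / df + g2 / df**2


def sample_size_unknown_sigma(s, margin, confidence, max_iter=20):
    """Iterate n -> ceil((t*_{n-1} * s / margin)^2) until n stops changing."""
    # Start from the z-based estimate, as in the text.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = math.ceil((z * s / margin) ** 2)
    for _ in range(max_iter):
        t_star = t_critical_approx(confidence, n - 1)
        n_next = math.ceil((t_star * s / margin) ** 2)
        if n_next == n:  # converged: another pass has no effect
            break
        n = n_next
    return n


print(round(t_critical_approx(0.99, 717), 4))
print(sample_size_unknown_sigma(s=2.6, margin=0.25, confidence=0.99))
```

Note that this treats $s = 2.6$ as fixed across iterations; only the critical value changes with $n$.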

The other issue is that $s$ also depends on the sample size. A larger sample should simply give a better estimate of $\sigma$, the population standard deviation, but a change in $s$ can still change the sample size needed to achieve the desired $ME$. In general, except in the case of very small samples, neither of these sources of inaccuracy will be significant. If you use the size of the sample from which $s$ was computed to determine the critical $t$ value, your estimate of the required sample size should be at least large enough, because $t_{df}^*$ decreases as $n$, and thus $df$, increases.