I have been told that an event occurs about once in 50 times. In my experience, it is more like once in 40 or 45, which would not be insignificant (if correct). However, because this event does not happen very often, I cannot be sure. What sample size would be sufficient for me to state that it's once in 40, for example?
What sample size do I need to justify my suspicions?
Yes. Let's say I need 95% confidence, if that helps. – 2012-11-27
1 Answer
Are the events (assumed to be) independent? Do you expect the event to occur with the same probability in every trial? If so, let $n$ be the number of trials, $X$ the number of events in those $n$ trials, and $p$ the probability of the event occurring in a single trial. Under these assumptions, $X\sim \text{Binomial}(n,p)$, and if both $np>5$ and $n(1-p)>5$ we can apply the normal approximation to the binomial distribution (Hogg et al. (2005), page 222). Writing the second parameter as the standard deviation, the distribution of $X$ is approximately
$ X\sim N(np,\sqrt{np(1-p)}) $
or equivalently for the sample proportion $\hat{p}=X/n$
$ \hat{p}\sim N(p, \sqrt{p(1-p)/n}) $
We can estimate the standard deviation of $\hat{p}$ as $s.e.\{\hat{p}\}=\sqrt{\hat{p}(1-\hat{p})/n}$ and then an approximate 95% confidence interval for $p$ would be
$ \hat{p}\pm z_{0.975}\cdot \sqrt{\hat{p}(1-\hat{p})/n} $
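As a rough illustration of the interval above, here is a minimal Python sketch using only the standard library. The function name `wald_interval` and the data (50 events in 2000 trials) are made up for the example and are not from the question.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(events, trials, confidence=0.95):
    """Approximate normal-theory interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = events / trials
    # Two-sided interval: for 95% confidence this is the 97.5% quantile.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat - half_width, p_hat + half_width


# Hypothetical data: 50 events in 2000 trials, so p_hat = 1/40.
low, high = wald_interval(50, 2000)
print(f"95% CI: {low:.4f} to {high:.4f}")
```

With these hypothetical numbers the interval still contains $\frac{1}{50}$, so 2000 trials would not be enough to separate the two rates.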
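where $z_{0.975}\approx 1.96$ is the 97.5% quantile of the standard normal distribution. Since $\hat{p}(1-\hat{p})\le \frac{1}{4}$ for every $\hat{p}\in[0,1]$ (the maximum is attained at $\hat{p}=\frac{1}{2}$), we can bound $s.e.\{\hat{p}\}$ by $\frac{1}{\sqrt{4n}}$. Then, for any desired margin of error $M$, it suffices that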
$ z_{0.975}\cdot \frac{1}{\sqrt{4n}} < M $
or
$ n>(\frac{z_{0.975}}{2M})^2 $
If, in your case, you set $M=p_1-p_0=\frac{1}{40}-\frac{1}{50}=\frac{1}{200}$, then a 95% CI centered at $\hat{p}=p_1=\frac{1}{40}$ would not contain $p_0=\frac{1}{50}$. This corresponds to a two-sided hypothesis test of $H_0: p=\frac{1}{50}$ against $H_1: p\neq \frac{1}{50}$; if you only care about detecting a larger probability, the relevant alternative is the one-sided $H_1: p> \frac{1}{50}$. Note that while a large enough $n$ ensures that the margin of error is small, whether you can reject $H_0$ depends on the true probability $p$, which is unknown, and on the $\hat{p}$ you happen to observe.
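To put a number on the conservative bound $n>\left(\frac{z_{0.975}}{2M}\right)^2$, here is a small Python sketch using the margin $M=\frac{1}{40}-\frac{1}{50}$ from the question. The function name `conservative_sample_size` is an illustrative choice, not an established API.

```python
from math import floor
from statistics import NormalDist


def conservative_sample_size(margin, confidence=0.95):
    """Smallest integer n with n > (z / (2 * margin))**2, using p(1-p) <= 1/4."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # floor(...) + 1 gives the smallest integer strictly above the bound.
    return floor((z / (2 * margin)) ** 2) + 1


n_needed = conservative_sample_size(1 / 40 - 1 / 50)
print(n_needed)
```

Under this worst-case bound the requirement comes out at roughly 38,000 trials, which is why a tighter bound on $p(1-p)$ is worth considering when $p$ is known to be small.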
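The bound $\frac{1}{\sqrt{4n}}$ is conservative. If you are willing to assume that $p$ is no larger than some $a\le\frac{1}{2}$, then $p(1-p)\le a(1-a)$ and the requirement can be tightened. Taking $a=\frac{1}{25}$, so that $a(1-a)=\frac{24}{625}$, it becomes
$ n > \left(\frac{z_{0.975} \sqrt{\frac{24}{625}}}{M}\right)^2 = \frac{0.147512}{M^2} $
which is much smaller than the roughly $\frac{0.960365}{M^2}$ required by the general bound. As long as the real probability is less than $a$, this will hold. Keep in mind that you still need $np>5$ and $n(1-p)>5$ for the normal approximation to be reasonable.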
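The tighter requirement can be evaluated the same way. The sketch below assumes an upper bound of $\frac{1}{25}$ on the true probability (the value for which $a(1-a)=\frac{24}{625}$) and the margin $M=\frac{1}{40}-\frac{1}{50}$. The names `bounded_sample_size` and `p_upper` are hypothetical.

```python
from math import floor
from statistics import NormalDist


def bounded_sample_size(margin, p_upper, confidence=0.95):
    """Smallest integer n with z * sqrt(p_upper * (1 - p_upper) / n) < margin.

    Assumes the true probability satisfies p <= p_upper <= 1/2.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return floor(z ** 2 * p_upper * (1 - p_upper) / margin ** 2) + 1


n_tight = bounded_sample_size(1 / 40 - 1 / 50, p_upper=1 / 25)
print(n_tight)

# Rule-of-thumb check for the normal approximation at p = 1/50.
p0 = 1 / 50
print(n_tight * p0 > 5 and n_tight * (1 - p0) > 5)
```

This brings the requirement down to roughly 5,900 trials, and at that size the $np>5$ and $n(1-p)>5$ conditions are comfortably met for $p=\frac{1}{50}$.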
Yes, that would be the alternative hypothesis in your case. – 2012-11-28