My statistics test review question asks the following:

"An ad claims that more people prefer S coffee to P coffee. A random sample of 90 coffee drinkers is conducted and out of these 90, 48 of them like S better. However, the survey person concludes that the ad's claim is 'probably false'. How did the surveyor justify this? "

The choices all have to do with whether 47% is or is not in the confidence interval, or whether 50% is or is not in the confidence interval.

My first challenge with this question is how even to determine the null hypothesis. I thought that the null would be the "status quo", which in this case would seem to be that people like S about the same as P, so that $H_0$ is $S=P$ and the alternative $H_a$ is $S>P$. In other words, I would think the null would be that about the same number of people prefer S and P, and the alternative would be $S>P$.

However, if I'm trying to disprove a claim, don't I need to have the Null BE the claim? Does that work even in proportions?

Finally, I have read in my class text that I can't use a CI unless it is a two-sided test, and this is clearly a right-tailed test.

I would find the z value by using $$z=\frac{\hat p - p_0}{\sqrt{p_0(1-p_0)/90}}$$ and then evaluate whether it is statistically significant (outside the CI).

Accordingly, NONE of the answers seem to make sense since I would be evaluating the z value NOT some ratio.

What am I missing about this question? Am I reading in too much?

1 Answer

The question is not clear, but I will show a one-sided hypothesis test and a couple of two-sided confidence intervals. None of these procedures supports the claim that more people prefer S than prefer P.

You have $n = 90$ subjects, presumably chosen at random from among coffee customers. Of these $X = 48$ prefer S. So the estimated proportion preferring S is $\hat p = 48/90 = 0.5333.$

One-sided test. To test the null hypothesis $H_0: p = .5$ against the alternative $H_a: p > .5,$ one can use the test statistic $$Z = \frac{\hat p - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{.5333 - .5}{\sqrt{.5(.5)/90}} = 0.6325.$$ One would reject $H_0$ at the 5% level if $Z > 1.645,$ which is not the case here. So the data are consistent with $H_0$ and the claim that $p > .5$ is not supported by the data.
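As a check on the arithmetic, here is a minimal Python sketch of the one-sided test just described. The function name `one_sided_z_test` and the default `alpha=0.05` are my own choices for illustration; only the standard library is used, and the P-value is computed from the normal approximation.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_z_test(x, n, p0=0.5, alpha=0.05):
    """Right-tailed z test of H0: p = p0 against Ha: p > p0 (normal approximation)."""
    p_hat = x / n
    se0 = sqrt(p0 * (1 - p0) / n)             # standard error under H0, so it uses p0
    z = (p_hat - p0) / se0
    z_crit = NormalDist().inv_cdf(1 - alpha)  # about 1.645 for alpha = 0.05
    p_value = 1 - NormalDist().cdf(z)         # right-tail probability
    return z, z_crit, p_value, z > z_crit

z, z_crit, p_value, reject = one_sided_z_test(48, 90)
print(f"Z = {z:.4f}, critical value = {z_crit:.3f}, "
      f"P-value = {p_value:.3f}, reject H0: {reject}")
```

Because $Z$ falls well short of the critical value, the function reports that $H_0$ is not rejected.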

Wald confidence interval. One kind of 95% confidence interval for $p$ is of the form $$\hat p \pm 1.96\sqrt{\frac{\hat p(1-\hat p)}{n}},$$ which computes to $0.5333 \pm 0.1031$ or $(0.4303, 0.6364).$ It contains $0.5,$ and so, again here, data are consistent with S and P being equally popular.
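The Wald interval can be reproduced with a few lines of Python. This is only a sketch: the name `wald_ci` is hypothetical, and the multiplier is taken from the standard normal quantile rather than the rounded value 1.96, which makes no difference at four decimal places here.

```python
from math import sqrt
from statistics import NormalDist

def wald_ci(x, n, conf=0.95):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)  # estimated SE uses p_hat
    return p_hat - half_width, p_hat + half_width

lo, hi = wald_ci(48, 90)
print(f"95% Wald CI: ({lo:.4f}, {hi:.4f}); contains 0.5: {lo < 0.5 < hi}")
```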

Agresti confidence interval. A somewhat more accurate 95% CI uses $\tilde p = (X+2)/(n+4)$ and $\tilde n = n+ 4.$ Using the formula $$\tilde p \pm 1.96\sqrt{\frac{\tilde p(1-\tilde p)}{\tilde n}},$$ it gives the 95% CI $(0.4310, 0.6328),$ which also includes $0.5$ and leads to the same interpretation as above.
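A similar sketch for the adjusted interval, assuming the "add 2 successes and 4 trials" form given above with the fixed multiplier 1.96. The function name `agresti_ci` is my own label, not a library routine.

```python
from math import sqrt

def agresti_ci(x, n, z=1.96):
    """Adjusted 95% interval: add 2 successes and 4 trials, then use the Wald form."""
    n_tilde = n + 4
    p_tilde = (x + 2) / n_tilde
    half_width = z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - half_width, p_tilde + half_width

lo, hi = agresti_ci(48, 90)
print(f"95% Agresti CI: ({lo:.4f}, {hi:.4f}); contains 0.5: {lo < 0.5 < hi}")
```

The adjustment pulls the center slightly toward $0.5$ and makes the interval a little narrower than the Wald interval for these data.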

One may quibble about whether to use a one- or two-sided test, or about the style of CI to use. But $n = 90$ observations are not nearly enough to settle which brand of coffee is more popular in the general population based on a sample that has the two brands so closely matched.

[Note: In order to distinguish meaningfully between $p = .533$ and $p = .50,$ a sample of more than 900 subjects would be required. The rule of thumb for that would be $n \approx 1/.033^2 > 900.$]
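The rule of thumb comes from the 95% margin of error near $p = .5,$ which is roughly $2\sqrt{.25/n} = 1/\sqrt{n};$ setting that equal to a desired margin $m$ gives $n \approx 1/m^2.$ A small Python sketch, with the hypothetical helper name `rule_of_thumb_n`:

```python
from math import ceil

def rule_of_thumb_n(margin):
    """Approximate n for a 95% margin of error near p = 0.5, using n ~ 1 / margin^2."""
    return ceil(1 / margin ** 2)

n_needed = rule_of_thumb_n(0.033)
print(f"Approximate sample size for a margin of 0.033: {n_needed}")
```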

  • This is extremely helpful. However, I thought that the standard error computation used $\hat p(1-\hat p)$, not $p_0$. 2017-03-04
  • For testing a hypothesis one uses the hypothetical value(s) of the probability(ies). For confidence intervals there are no hypothetical values, so estimates are used. 2017-03-04
  • For example, in my original question, there's a $\hat p$ value of 53%. To create a confidence interval around the $\hat p$ value, I would think we would use the 0.53 value in the SE. 2017-03-04
  • Yes, for CIs, as I have shown in the second displayed relationship. It is precisely the inaccuracy of this estimated SE that motivates improved intervals such as Agresti's. In tests, the distribution of the test statistic is governed by $H_0$ (that's why it's called the _null_ distribution), so there you use $p_0.$ 2017-03-04
  • So, the way that I finally saw how to answer the question was to calculate the SE, using sqrt{(.54)(.46)/80} = .06. Then I multiplied this by 1.64 (since it was a right-tailed evaluation) to get 0.1. Now, I did 0.54 plus/minus 0.1 and could see that the result of 0.43 -> 0.66 DID contain the null hypothesis, so the null would NOT be rejected. 2017-03-04
  • Did I do that correctly? 2017-03-04
  • Sorry, did I miss an edit? 80 and .54 are new numbers to me. If you are mixing a one-sided test with a two-sided CI, you are courting confusion. 2017-03-04
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/54810/discussion-between-user163862-and-bruceet). 2017-03-04
  • Was my error. It was 90. Do you have a moment to discuss in chat? 2017-03-04
  • I understand now. 2017-03-05