I'm trying to get my head around these three approaches to statistical inference, but I'm not entirely sure what the significant differences between them are.
The difference between the Frequentist, Bayesian and Fisherian approaches to statistical inference
-
May be better for https://stats.stackexchange.com/ – 2017-01-01
-
I usually think of the 'third option' in this ranking as Likelihoodist rather than Fisherian. I agree with BruceET that it's hard to pin down a coherent account of Fisher's approach, and that he was a frequentist with strong objections to Neyman–Pearson. My (incomplete) understanding of his 'fiducial' approach is that he liked confidence intervals only if the test statistic was sufficient, or there were ancillary statistics to condition on, so that the CI uses all the information in the data. He had an interpretation of the probability distribution implied by the CI levels, but I never really understood it. – 2017-01-02
1 Answer
This is an enormously broad topic, which is also the subject of acrimonious debate. I will try to give some partial answers that I regard to be basic and uncontroversial. Even so, I would not be surprised to see comments saying I have missed the whole point.
Frequentist. The name comes from the Law of Large Numbers.
The probability that a die shows 1 is interpreted as a limiting
ratio arising from a very large number of repetitions of an
experiment. Roll the die a large number $n$ of times. Then
$$\theta = P(\text{Face 1}) \approx \frac{X}{n},$$ where $X$ is the number of 1's seen among the $n$ rolls.
One style of 95% 'confidence interval' (based on a normal approximation to the binomial distribution) for the probability $\theta$ is of the form
$$\tilde \theta \pm 1.96\sqrt{\frac{\tilde \theta(1 - \tilde \theta)}{\tilde n}},$$
where $\tilde n = n + 4$ and $\tilde \theta = (X+2)/\tilde n$ (the 'plus-four' or Agresti–Coull adjustment). For example, if $X = 170$ and $n = 1000,$ then such a 95% CI would be $(0.148, 0.195).$
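The adjusted interval just described can be checked with a short stand-alone sketch. (The answer's own computations are in R; this Python version is an equivalent, hypothetical implementation of the same formula.)

```python
# Plus-four 95% CI for a binomial proportion: add 2 successes and
# 2 failures before applying the normal approximation.
import math

def plus_four_ci(x, n, z=1.96):
    """Adjusted Wald interval for X successes in n trials."""
    n_tilde = n + 4                          # n-tilde
    theta_tilde = (x + 2) / n_tilde          # theta-tilde
    half = z * math.sqrt(theta_tilde * (1 - theta_tilde) / n_tilde)
    return theta_tilde - half, theta_tilde + half

lo, hi = plus_four_ci(170, 1000)
print(round(lo, 3), round(hi, 3))            # 0.148 0.195
```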
Strictly speaking, this is viewed as a statement about the data $X,$ rather than a statement about $\theta.$ The idea is that in a large number of $n$-roll experiments, one expects the unknown true value of $\theta$ to be included in ('covered by') the CI 95% of the time.
If I am a practicing statistician, I cannot tell my client that there is "95% probability" the true proportion $\theta$ of ones from the die is between 0.148 and 0.195. The unknown numerical value of $\theta$ either lies in the interval or it does not. I can tell my client that the "procedure" I used to get the interval will give "correct" intervals about 95% of the time across my years of practice.
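The "correct about 95% of the time" claim can itself be checked by simulation: repeat the $n$-roll experiment many times with a known $\theta = 1/6$ and count how often the interval covers it. This is a sketch; the repetition count and seed below are arbitrary choices, not from the answer.

```python
# Coverage simulation for the plus-four interval with theta = 1/6.
import math
import random

def plus_four_ci(x, n, z=1.96):
    n_tilde = n + 4
    theta_tilde = (x + 2) / n_tilde
    half = z * math.sqrt(theta_tilde * (1 - theta_tilde) / n_tilde)
    return theta_tilde - half, theta_tilde + half

random.seed(1)
theta, n, reps = 1 / 6, 1000, 2000
hits = 0
for _ in range(reps):
    x = sum(random.random() < theta for _ in range(n))  # binomial draw
    lo, hi = plus_four_ci(x, n)
    hits += lo <= theta <= hi

print(hits / reps)   # coverage estimate, close to the nominal 0.95
```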
Bayesian. The probability $\theta$ that a die shows 1 is interpreted as a personal
probability. Such a probability may be based on a hunch; a willingness to pay $1$ for a bet that returns $6$ if face 1 shows; the apparent symmetry of the die;
or limited prior experience rolling the die.
More specifically, $\theta$ is viewed, not as an unknown constant, but as a random variable with a 'prior distribution'. If I really have a very diffuse opinion about $\theta,$ I might use an 'uninformative' prior distribution such as $Unif(0,1) = Beta(1,1).$ If I have much stronger feelings I might use a prior distribution such as $Beta(100,500),$ which has mean $1/6$ and SD about $0.015.$
A Bayesian 95% 'probability interval' or 'credible interval' estimate of $\theta$ based on $X$ ones in $n = 1000$ rolls of the die would be based on quantiles .025 and .975 of the 'posterior' distribution $Beta(\alpha_0 + X, \beta_0 + n - X),$ where $\alpha_0$ and $\beta_0$ are the parameters of the beta prior distribution. If $X = 170$ and $n = 1000,$ then the 95% Bayesian probability interval based on the uninformative prior distribution is $(0.148, 0.195)$. This interval can be viewed as a statement about the posterior distribution of the random variable $\theta.$ [The computation in R statistical software is shown below.]
qbeta(c(.025, .975), 171, 831)
## 0.1480099 0.1945525
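For readers without R, the same conjugate update can be sketched in Python. Note that this hypothetical version approximates the posterior quantiles with a normal distribution rather than calling an exact beta quantile function, so its endpoints differ from the qbeta values in the third decimal place.

```python
# Conjugate Beta-binomial update: a Beta(a0, b0) prior and X ones in
# n rolls give the posterior Beta(a0 + X, b0 + n - X).  The 95% interval
# below uses a normal approximation to that posterior (exact quantiles
# would come from a beta quantile function such as R's qbeta).
import math

def posterior_interval(a0, b0, x, n, z=1.96):
    a, b = a0 + x, b0 + n - x                # posterior parameters
    mean = a / (a + b)
    sd = math.sqrt(mean * (1 - mean) / (a + b + 1))
    return mean - z * sd, mean + z * sd

# Uninformative Beta(1, 1) prior, X = 170 ones in n = 1000 rolls.
lo, hi = posterior_interval(1, 1, 170, 1000)
print(round(lo, 3), round(hi, 3))            # 0.147 0.194
```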
By contrast, if the prior $Beta(100,500)$ is used, then the interval is a little narrower: $(0.150, 0.187).$ The empirical information in the data and the subjective information in the prior have been combined to give this probability interval. I can tell my client, "If you believe the prior distribution you gave me and the integrity of your data, then you should believe the interval I'm giving you."
Fisher. My observation is that there seem to be about as many opinions on Fisher's 'fiducial' method as there are readers of his accounts. One major controversy involved the dispute between Fisher on the one hand and Neyman and Pearson on the other. Fisher was happy to talk about the significance level of a hypothesis test, but not the power. I will leave it for those better informed or more passionate about these controversies than I to take it from there.
-
Brilliant answer! Thank you! Has made things a lot clearer. – 2017-01-02