I don't understand why people accept certain formulas in statistics without a proof-style mathematical argument. You see this a lot in statistics textbooks, and unfortunately it spills over to instructors who are themselves ignorant of where the formulas come from yet teach them anyway.
How come in statistics there is very little justification for the formulas used, and proofs are almost nonexistent?
Ahh, wonderful option pricing... – 2012-05-08
3 Answers
This isn't really a question about statistics; it's a question about the way in which many statistics courses are taught. Many statistics courses are designed specifically for students in particular fields (psychology, anthropology, etc.) and are taught by people working in those fields. These students generally have no background in theoretical mathematics; in many cases the students haven't even had calculus. Attempting to include any substantial amount of rigorous mathematics in such a course would be pointless. As a result, such courses are typically methods cookbooks, and the best that one can hope for is that a serious attempt will be made to give students a good basic idea of when the standard statistical methods are appropriate and of how statistics are used in the field in question.
It's also not uncommon for mathematics departments to offer basic statistics courses for students with no mathematical background beyond high school algebra. In my experience these tend to dwell less on an array of statistical methods than the discipline-specific courses do and to spend more time on the underlying concepts, but any sort of rigorous theory would still be completely out of place.
Once you get to the statistics courses intended for mathematics and statistics majors, you can in my experience expect to see some real theory, as well as some of the heuristics that have proved useful but for which no rigorous theoretical justification exists (yet).
For most of the statistics encountered outside of theoretical statistics classes, either the justification is too advanced to be worth providing in that context, or there is only partial mathematical justification available. Is it really important to derive the $t$-distribution analytically in order to use it, or to justify the $\chi^2$ test for contingency tables, or to justify the assumptions needed for the logistic regression model on categorical data? Probably not in a first or second course that a typical user of statistics might take.
Proofs are not a bad thing, but a considerable amount of experience, intuition and rules of thumb has accumulated in statistics, and these are as valuable as proofs and theoretical models when using the methods in practice. Time and textbook space are limited, so priorities have to be chosen.
As the satirical comment under the question indicates, if you are not the typical or casual user, then proofs, or some generally higher standard of understanding the details, become more important.
The simplest answer I can think of:
Formal justification of statistical methods is not that important, because most real-life statistical analysis occurs in contexts where the results are not intended to be interpreted formally. It barely matters if an analyst doesn't know what a test "really" means if his or her audience doesn't either.