4

I am trying to understand the definition of a sufficient statistic and make conceptual sense of it. Wikipedia says $Pr(X=x|T(X)=t,\theta) = Pr(X=x|T(X)=t)$. Exactly how am I supposed to make sense of a probability conditioned on $\theta$? Probability makes sense for $X$ because it is a function (a random variable) defined on a probability space. As far as I can tell, $\theta$ is not defined on that probability space, so how can conditioning on it make sense?

3 Answers

7

In statistical decision theory, you don't work with a single probability space. You have a measurable sample space $(\Omega,\Sigma)$ and a whole family of probability measures $(\mu_\theta)$ on $(\Omega,\Sigma)$, indexed by $\theta$. In statistics, you don't know the underlying probabilities; you usually want to learn them.

Now the statistician has no direct information about $\theta$, but observes the distribution $\mu_\theta X^{-1}$ of a random variable $X$ that provides the data. Suppose now that you have a statistic $T$, a measurable function of $X$. The statistic is sufficient if someone observing the distribution of $T\circ X$ learns no more about $\theta$ when observing the distribution of $X$ directly. So the distribution of $X$ conditional on $T$, which you can calculate for each $\theta$ separately, doesn't show you anything about $\theta$: it is the same for every $\theta$. There are different formulations of sufficiency, depending on different formulations of conditional probability.
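To make "calculate for each $\theta$ separately" concrete, here is a minimal sketch, not part of the original argument. It assumes $X$ is a vector of $n$ independent coin flips with success probability $\theta$; the function name `conditional_given_statistic` and the particular values of $\theta$ are hypothetical choices for illustration. It computes the conditional distribution of $X$ given $T$ exactly under two different values of $\theta$ and compares the results.

```python
from fractions import Fraction
from itertools import product


def conditional_given_statistic(theta, n, statistic):
    """Return {x: P_theta(X = x | T(X) = T(x))} for n independent Bernoulli(theta) flips.

    Illustrative helper (hypothetical name); assumes 0 < theta < 1.
    """
    # Probability of each outcome x under the measure indexed by theta.
    joint = {}
    for x in product((0, 1), repeat=n):
        ones = sum(x)
        joint[x] = theta**ones * (1 - theta) ** (n - ones)

    # Probability of each value t taken by the statistic T.
    marginal = {}
    for x, prob in joint.items():
        t = statistic(x)
        marginal[t] = marginal.get(t, 0) + prob

    # Conditional probability of x given the observed value of T.
    return {x: prob / marginal[statistic(x)] for x, prob in joint.items()}


n = 3
theta_a, theta_b = Fraction(1, 4), Fraction(2, 3)  # arbitrary example values


def first_flip(x):
    # A statistic that throws away too much: only the first coordinate.
    return x[0]


# T(x) = sum(x): the conditional distribution is the same for both thetas.
sum_a = conditional_given_statistic(theta_a, n, sum)
sum_b = conditional_given_statistic(theta_b, n, sum)
print(sum_a == sum_b)  # True

# T(x) = x[0]: the conditional distribution changes with theta.
first_a = conditional_given_statistic(theta_a, n, first_flip)
first_b = conditional_given_statistic(theta_b, n, first_flip)
print(first_a == first_b)  # False
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equality check is not affected by floating-point rounding. In this sketch the sum behaves like a sufficient statistic, while the first flip alone does not.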

The classic paper that formalized sufficient statistics within measure-theoretic probability is "Application of the Radon-Nikodym Theorem to the Theory of Sufficient Statistics" by Halmos and Savage.

  • 0
    So is the viewpoint $Pr(X_\theta=x|T(X_\theta)=t) = Pr(X=x|T(X)=t)$ then correct? 2012-02-24
1

Perhaps a better way of thinking about it:

A statistic $T$ is sufficient for $\theta$ if the conditional distribution of $X$ given $T$ does not depend on $\theta$.

You're right in that $\theta$ is not a random variable.

0

It's not the probability of $\theta$ being in any particular set; it's a probability of some event given $\theta$.

Here's a simple example: $p\cdot100\%$ of all voters in a large population will vote "yes" in next week's referendum. A poll is based on a random sample of $n$ voters. Let

$$X_i = \begin{cases} 1 & \text{if the }i\text{th voter will vote "yes"}, \\ 0 & \text{if not}. \end{cases}$$

Then the conditional probability distribution of $X_1,\ldots,X_n$ given $p$ and given the value of the sum $X_1+\cdots+X_n$ does not depend on $p$. Find the conditional probability and you'll find that $p$ cancels out. Therefore that sum is a sufficient statistic for $p$.
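The cancellation can be written out as a short sketch. It assumes the $X_i$ are independent with $\Pr(X_i=1)=p$ (reasonable for a random sample from a large population, but an assumption not stated explicitly above) and $0<p<1$. For any $x_1,\ldots,x_n\in\{0,1\}$ with $x_1+\cdots+x_n=t$:

$$\Pr(X_1=x_1,\ldots,X_n=x_n \mid X_1+\cdots+X_n=t;\ p) = \frac{p^t(1-p)^{n-t}}{\binom{n}{t}\,p^t(1-p)^{n-t}} = \frac{1}{\binom{n}{t}}.$$

The right-hand side does not involve $p$: given the sum, every arrangement of the $t$ "yes" votes among the $n$ sampled voters is equally likely.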