Let $y$ represent the observations and let $x$ be the latent variable of interest. The following equivalent statements can be helpful in proving that $t = t(y)$ is a sufficient statistic with respect to our model $p_y(y;x)$.
Non-Bayesian Viewpoint
- Definition: $p_{y|t}(y|t; x)$ is not a function of $x$
- Proportional likelihoods (hence the same ML estimate): $L_y(x) \propto L_t(x)$
- Neyman factorization theorem: $p_y(y;x) = a(t(y),x)\times b(y)$ for some functions $a(\cdot,\cdot)$ and $b(\cdot)$ (see the worked sketch below)
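
As a concrete illustration of the factorization criterion, here is a minimal worked sketch assuming $n$ i.i.d. Bernoulli($x$) observations $y = (y_1, \dots, y_n)$; this specific model is an added assumption, not part of the original notes:

```latex
% Worked sketch: Neyman factorization for n i.i.d. Bernoulli(x) observations.
% Assumed model: y = (y_1, ..., y_n), y_i in {0,1}, with t(y) = sum_i y_i.
p_y(y; x) = \prod_{i=1}^{n} x^{y_i} (1-x)^{1-y_i}
          = \underbrace{x^{t(y)} (1-x)^{\,n - t(y)}}_{a(t(y),\, x)}
            \times \underbrace{1}_{b(y)},
\qquad t(y) = \sum_{i=1}^{n} y_i .
```

Since the factorization holds with $b(y) = 1$, the count $t(y) = \sum_i y_i$ is a sufficient statistic for $x$.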
Bayesian Viewpoint
- Conditioning on $t$ removes dependence on $x$: $p_{y|t,x}(y|t(y),x) = p_{y|t}(y|t(y))$
- Same posteriors: $p_{x|y}(x|y) = p_{x|t}(x|t(y))$ (illustrated in the sketch after this list)
- Data processing inequality: for a Markov chain $X \rightarrow Y \rightarrow Z$ we have $I(X;Y) \geq I(X;Z)$, and it holds with equality, $I(X;Y) = I(X;T(Y))$, when $Z = T(Y)$ is a sufficient statistic of the data $Y$
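
Continuing the Bernoulli sketch above, the same-posteriors property can be made concrete; the Beta prior here is an added assumption for illustration, not part of the original notes:

```latex
% Sketch: the posterior depends on y only through t(y) = sum_i y_i.
% Assumptions: prior x ~ Beta(a, b); y_1, ..., y_n i.i.d. Bernoulli(x).
p_{x|y}(x|y) \propto p_x(x)\, p_{y|x}(y|x)
             \propto x^{a-1}(1-x)^{b-1} \cdot x^{t(y)} (1-x)^{\,n-t(y)}
             = x^{a+t(y)-1} (1-x)^{\,b+n-t(y)-1} .
```

The posterior is Beta$(a + t(y),\, b + n - t(y))$, a function of $y$ only through $t(y)$, so $p_{x|y}(x|y) = p_{x|t}(x|t(y))$ as claimed.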
For example, for an exponential family
$p_y(y;x) = \exp \{\lambda(x) t(y) + \beta(y) - \alpha(x)\}$,
where $\lambda(x)$ is the natural parameter, $t(y)$ is the sufficient statistic, $\beta(y)$ is the log of the base measure, and $\alpha(x)$ is the log-normalizer (log-partition function), we can show that $t(y)$ is a sufficient statistic by applying the Neyman factorization theorem:
$p_y(y;x) = \exp \{\lambda(x) t(y) - \alpha(x)\} \times \exp\{\beta(y)\} = a(t(y), x) \times b(y)$
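
For a concrete instance of this factorization, here is a sketch assuming a scalar Gaussian observation with unknown mean $x$ and known variance $\sigma^2$; this particular family is an added example, not from the original notes:

```latex
% Sketch: N(x, sigma^2) with known sigma^2 in exponential-family form.
% Assumption: scalar observation y; unknown mean x.
p_y(y; x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\Big\{ -\frac{(y-x)^2}{2\sigma^2} \Big\}
          = \exp\Big\{ \underbrace{\tfrac{x}{\sigma^2}}_{\lambda(x)}\, \underbrace{y}_{t(y)}
            + \underbrace{\Big( -\tfrac{y^2}{2\sigma^2} - \log\big(\sqrt{2\pi}\,\sigma\big) \Big)}_{\beta(y)}
            - \underbrace{\tfrac{x^2}{2\sigma^2}}_{\alpha(x)} \Big\} .
```

Here $t(y) = y$ is sufficient for a single observation; repeating the algebra for $n$ i.i.d. observations gives $t(y) = \sum_{i=1}^{n} y_i$.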