
Given measurable spaces $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$, we define a measure kernel as a map $\pi : \mathcal{X} \times Y \to [0,\infty]$ such that $\pi(\cdot|y)$ is a measure on $\mathcal{X}$ for every $y \in Y$ and $\pi(A|\cdot)$ is $\mathcal{Y}$-measurable for every $A \in \mathcal{X}$. A probability kernel is then a measure kernel with $\pi(X|y) = 1$ for every $y \in Y$.

Now, this definition is a little abstruse, so I'd like to gain some intuition about it. I can see similarities with the usual integral kernels of operators (because a measure is a generalization of a function, at least naively). Also, I can think of the kernel as a collection of measures indexed by $y \in Y$. But I am not sure which of these two views (if either) gives much insight into why this concept is natural and useful.
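To make the two views concrete, here is a minimal sketch on finite spaces, where measurability is automatic and a probability kernel is just a table whose rows sum to 1. The state names, the numbers, and the helper names `pi`, `apply_to_function` and `push_measure` are all hypothetical, chosen only for illustration.

```python
# Hypothetical finite example: Y = {"sunny", "rainy"}, X = {"walk", "bus"}.
# kernel[y][x] plays the role of pi({x} | y); each row is a probability measure on X.
kernel = {
    "sunny": {"walk": 0.75, "bus": 0.25},
    "rainy": {"walk": 0.25, "bus": 0.75},
}


def pi(A, y):
    """Indexed-family view: pi(A | y) is the mass that the measure indexed by y gives to A."""
    return sum(kernel[y][x] for x in A)


def apply_to_function(f, y):
    """Operator view: (pi f)(y) = sum over x of f(x) * pi({x} | y)."""
    return sum(f(x) * p for x, p in kernel[y].items())


def push_measure(mu):
    """Mix the measures pi(. | y) with weights mu(y), giving a measure on X."""
    out = {}
    for y, weight in mu.items():
        for x, p in kernel[y].items():
            out[x] = out.get(x, 0.0) + weight * p
    return out
```

In this finite setting the kernel acts like a matrix: `apply_to_function` multiplies it against a function on $X$, and `push_measure` multiplies a measure on $Y$ against it.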

  1. What is the intuition behind probability kernels?
  2. What are some applications that show their usefulness?

To be more precise about what I am after: one can define an action $\rho$ of a group $G$ on a set $M$ as a map $\rho : G \times M \to M$ satisfying certain axioms. But this doesn't really give me any insight. If, however, someone told me that a group action is actually nothing other than a homomorphism $\rho : G \to {\rm Aut}(M)$, then I could immediately see the usefulness (given that I know enough group theory, of course). Is there something similar behind probability kernels too?

  • 0
    @Didier: very illuminating, thank you very much! 2011-04-17

1 Answer


Here is a quote (page 20) from one of my favorite books on probability, Foundations of Modern Probability (2nd edition) by Olav Kallenberg.

"Kernels play an important role in probability, where they may appear in the guises of random measures, conditional distributions, Markov transition functions, and potentials."

In my answer to this question, I try to explain how kernels are used to build a stochastic process, that is, to describe a random system that evolves in time. You begin with an initial distribution for $X(0)$, then a conditional distribution that tells you how $X(1)$ will behave given $X(0)$, then a conditional distribution that tells you how $X(2)$ will behave given the pair $X(0),X(1)$, etc.
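The construction just described can be sketched in a few lines. This is only an illustration under stated assumptions: the state space $\{0, 1\}$, the Pólya-urn-style rule inside `kernel`, and the names `sample`, `initial`, `kernel` and `simulate` are all hypothetical choices, not taken from Kallenberg or from the linked answer.

```python
import random


def sample(dist, rng):
    """Draw one outcome from a finite distribution given as {outcome: probability}."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


# Hypothetical initial distribution for X(0) on the states {0, 1}.
initial = {0: 0.5, 1: 0.5}


def kernel(history):
    """Probability kernel: maps the observed path (X(0), ..., X(n-1))
    to the conditional distribution of X(n).

    Assumed rule (Polya-urn style): the more 1s seen so far,
    the more likely the next state is 1.
    """
    p_one = (1 + sum(history)) / (2 + len(history))
    return {0: 1 - p_one, 1: p_one}


def simulate(n_steps, seed=0):
    """Build a path X(0), ..., X(n_steps) by chaining the initial law and the kernels."""
    rng = random.Random(seed)
    path = [sample(initial, rng)]
    for _ in range(n_steps):
        path.append(sample(kernel(tuple(path)), rng))
    return path
```

Calling `simulate(10)` returns a list of 11 states. Because `kernel` looks at the whole history rather than only the last state, the resulting process is not Markov in general, which matches the "given the pair $X(0),X(1)$" step in the description.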

This is only one application of the idea, of course.

  • 0
    @Marek I'm glad it helps. 2011-04-17