
Suppose $(\Omega, \mathcal{F}, P)$ is a probability space and $(U, \mathcal{\Sigma})$ is a measurable space. There seem to be two ways of defining the conditional expectation of a r.v. $X: \Omega \rightarrow \mathbb{R}$ given another r.v. $Y: \Omega \rightarrow U$, denoted as $E(X\vert Y)$:

  1. As a $\sigma(Y)$-measurable mapping from $\Omega$ to $\mathbb{R}$, defined as $E(X\vert Y) = E(X \vert \sigma(Y))$, where $\sigma(Y)$ is the sigma-algebra generated by the r.v. $Y$, which I think is also denoted $Y^{-1}(\mathcal{\Sigma})$?
  2. As a $\mathcal{\Sigma}$-measurable mapping from $U$ to $\mathbb{R}$, defined as follows (from Wikipedia):

    Define the measure $Q$ on $U$ to be the probability measure induced by $Y$ on $(U, \mathcal{\Sigma})$, i.e. $Q(B) = P(Y^{-1}(B)), \forall B \in \mathcal{\Sigma}$.

    Define $E(X \vert Y)$ to be the integrable function $g:U \rightarrow \mathbb{R}$ such that

    $ \int_{Y^{-1}(B)} X(\omega) \ d \operatorname{P} = \int_{B} g(u) \ d \operatorname{Q}, \forall B \in \mathcal{\Sigma}.$

    If I am correct, this definition is related to the first one as: $E(X \mid Y) \circ Y= E\left(X \mid Y^{-1} \left(\Sigma\right)\right). $

I was wondering which of the above two is the definition of $E(X \mid Y)$?

Thanks and regards! References (links or books) will also be appreciated!

  • 0
    $g$ is a measurable mapping from $(U,\Sigma)$ to $\mathbb{R}$, so it is itself a random variable, only on the probability space $(U,\Sigma,P\circ Y^{-1})$ rather than on $\Omega$. Since the measure on this space is induced by $Y$, the distributional properties of $g$ and $E(X|\sigma(Y))$ are the same. So strictly speaking the two definitions give two different mathematical objects, but they have similar properties. 2011-02-25

1 Answer


I've only seen $E[X|Y]$ used to denote the first one. The second is a function $g$ such that $g(Y) = E[X|\sigma(Y)]$ (by the way, $g$ is only unique up to $Q$-null sets). Informally, you might write $g(y) = E[X|Y=y]$, which is actually correct when $Y$ is discrete (i.e. when $Q$ is purely atomic).

By definition, $E[X|\sigma(Y)]$ is a $\sigma(Y)$-measurable random variable that is supposed to answer the following: if I told you the value of $Y$, what would be your best estimate of the value of $X$? Your estimate would be different depending on the value of $Y$, so it should be some function $g$ of $Y$. Your second definition is referring to that function $g$.
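
To spell out why the second definition yields such a $g$ (a sketch, using only the change-of-variables formula for the image measure $Q = P \circ Y^{-1}$): for every $B \in \Sigma$,

$$\int_{B} g(u) \, dQ(u) = \int_{Y^{-1}(B)} g(Y(\omega)) \, dP(\omega),$$

so the defining identity of $g$ reads

$$\int_{Y^{-1}(B)} X \, dP = \int_{Y^{-1}(B)} g(Y) \, dP \quad \text{for all } B \in \Sigma.$$

Every $A \in \sigma(Y)$ has the form $Y^{-1}(B)$, and $g(Y)$ is $\sigma(Y)$-measurable, so $g(Y)$ is a version of $E[X|\sigma(Y)]$. Conversely, by the Doob-Dynkin lemma any $\sigma(Y)$-measurable real random variable can be written as $g(Y)$ for some measurable $g$, which is why the two definitions carry the same information.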

If you want to prove that in the discrete case $g(y) = E[X|Y=y]$, call the right side $h(y)$. Then show that the random variable $h(Y)$ satisfies the conditions that uniquely define $E[X|\sigma(Y)]$: namely, $h(Y)$ is $\sigma(Y)$-measurable and for any $A \in \sigma(Y)$, $E[h(Y);A] = E[X;A]$. Note that in the discrete case, $A$ is necessarily a countable union of events of the form $\{Y=y_i\}$.
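
If it helps to see the discrete case concretely, here is a small numerical sanity check, not a proof. The six-point space, the weights `P`, and the values of `X` and `Y` are made up for illustration, and the function names are my own. It computes $h(y) = E[X|Y=y]$ and checks the defining identity $\int_{Y^{-1}(B)} X \, dP = \int_B h \, dQ$ for every subset $B$ of the range of $Y$, using exact fractions.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical finite probability space: outcomes 0..5, weights sum to 1.
P = {0: Fraction(1, 12), 1: Fraction(1, 6), 2: Fraction(1, 4),
     3: Fraction(1, 12), 4: Fraction(1, 4), 5: Fraction(1, 6)}
X = {0: 2, 1: -1, 2: 4, 3: 0, 4: 3, 5: 5}              # real-valued r.v.
Y = {0: "a", 1: "a", 2: "b", 3: "b", 4: "b", 5: "c"}   # r.v. with values in U


def pushforward(P, Y):
    """Q(u) = P(Y = u), the measure induced by Y on its range."""
    Q = {}
    for w, p in P.items():
        Q[Y[w]] = Q.get(Y[w], Fraction(0)) + p
    return Q


def cond_exp_given_value(P, X, Y):
    """h(u) = E[X | Y = u] = E[X; Y = u] / P(Y = u) for each atom u."""
    Q = pushforward(P, Y)
    num = {u: Fraction(0) for u in Q}
    for w, p in P.items():
        num[Y[w]] += X[w] * p
    return {u: num[u] / Q[u] for u in Q}


def all_subsets(values):
    """Every subset B of the (finite) range of Y."""
    vals = list(values)
    return chain.from_iterable(combinations(vals, r) for r in range(len(vals) + 1))


def satisfies_defining_identity(P, X, Y, h):
    """Check E[X; Y in B] == sum_{u in B} h(u) Q(u) for every B."""
    Q = pushforward(P, Y)
    for B in all_subsets(Q):
        lhs = sum((X[w] * p for w, p in P.items() if Y[w] in B), Fraction(0))
        rhs = sum((h[u] * Q[u] for u in B), Fraction(0))
        if lhs != rhs:
            return False
    return True


h = cond_exp_given_value(P, X, Y)
ok = satisfies_defining_identity(P, X, Y, h)
```

Since every $A \in \sigma(Y)$ is $Y^{-1}(B)$ for one of these $B$, a successful check is exactly the statement $E[h(Y);A] = E[X;A]$ on this toy space.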

  • 0
    (1) I meant $E[X|Y]$ in sense 1. I've clarified. (2) I added a hint. I don't have any references for you offhand, sorry. 2011-02-25