
In the appendix I am reading, the following is stated as a theorem:

Let $\mathcal{G}\subseteq\mathcal{F}$ be $\sigma$-algebras and let $X\in L_2$ be a random variable on the underlying probability space. Then there exists a random variable $Y\in\mathcal{G}$ (meaning $Y$ is $\mathcal{G}$-measurable) such that

$$\|X-Y\|_2=\inf_{Z\in\mathcal{G}}\|X-Z\|_2.$$

How do I prove that such a $Y$ exists and is unique up to almost sure equivalence? Also, how can I show that this $Y$ is $E[X|\mathcal{G}]$? In this appendix, $E[X|\mathcal{G}]$ is defined as this minimizer, but what if we instead start from the usual definition, namely that $E[X|\mathcal{G}]$ is $\mathcal{G}$-measurable, integrable, and satisfies $\int_{G} E[X|\mathcal{G}]\,dP=\int_G X\,dP$ for all $G\in\mathcal{G}$?

1 Answer


If we assume that we are minimizing over all $\mathcal{G}$-measurable functions in $L^2$, we can proceed as follows:

First, we can use Jensen's inequality for conditional expectation to say that $$|E[X|\mathcal{G}]|^2 \leq E[|X|^2 \mid \mathcal{G}],$$ and taking expectations of both sides gives $E\big[|E[X|\mathcal{G}]|^2\big] \leq E[|X|^2] < \infty$, so $Y = E[X|\mathcal{G}]$ is itself in $L^2$.

Then, since $\Omega \in \mathcal{G}$, if $Z$ is any $\mathcal{G}$-measurable function in $L^2$ and $Y=E[X|\mathcal{G}]$, then $$E[(X-Y)Z] = E[XZ-YZ] = E[XZ] - E[YZ \chi_{\Omega}] = E[XZ] - E[XZ\chi_{\Omega}] = 0.$$ Thus $X-Y$ is orthogonal to every $\mathcal{G}$-measurable random variable $Z \in L^2$, which is exactly why $Y$ minimizes the $L^2$ norm over all such $Z$ (see the computation below).

Here $\chi_A$ is the indicator function of the set $A$, and I'm using the fact that $E[Z E[X|\mathcal{G}]] = E[E[ZX|\mathcal{G}]] = E[ZX]$, since $Z$ is $\mathcal{G}$-measurable.
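
To see this orthogonality in a concrete (and entirely hypothetical) toy case, here is a short numerical sketch of my own: a fair six-sided die with $\mathcal{G}$ generated by the partition into odd and even outcomes, so that the $\mathcal{G}$-measurable random variables are exactly those constant on each block, and $E[X|\mathcal{G}]$ is the block average of $X$.

```python
import numpy as np

# Toy probability space: a fair six-sided die, Omega = {1, ..., 6}.
omega = np.arange(1, 7)
p = np.full(6, 1 / 6)                 # uniform probability measure
X = omega.astype(float)               # X(w) = w, clearly in L^2

# G is generated by the partition {odd outcomes, even outcomes}.
odd = omega % 2 == 1

def expect(W):
    """Expectation of the random variable W under p."""
    return np.sum(W * p)

# E[X|G] is constant on each block, equal to the block average of X.
Y = np.where(odd,
             expect(X * odd) / expect(odd.astype(float)),
             expect(X * ~odd) / expect((~odd).astype(float)))

# Any G-measurable Z is constant on each block; try a few arbitrary ones.
rng = np.random.default_rng(0)
for _ in range(3):
    c0, c1 = rng.normal(size=2)
    Z = np.where(odd, c0, c1)
    print(expect((X - Y) * Z))        # ~0 up to floating-point rounding
```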

This property that $E[(X-Y)Z] = 0$ for every $\mathcal{G}$-measurable $Z \in L^2$ is sufficient because, given any such $Z$, we can write $X-Z = (X-Y) - (Z-Y)$, where $Z-Y$ is again $\mathcal{G}$-measurable, so $$E[(X-Z)^2] = E[(X-Y)^2] - 2 E[(X-Y)(Z-Y)] + E[(Z-Y)^2] = E[(X-Y)^2] + E[(Z-Y)^2].$$ This last expression is minimized precisely when $Z = Y$ almost surely, which also shows that the minimizer is unique up to a.s. equivalence.
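
Continuing the hypothetical die example above, one can check this Pythagorean identity directly: for every block-constant $Z$, $E[(X-Z)^2]$ splits as $E[(X-Y)^2] + E[(Z-Y)^2]$, so no $\mathcal{G}$-measurable $Z$ does better than $Y$. This is just a sketch under the same toy assumptions, not part of the original argument.

```python
import numpy as np

# Same toy setup as in the previous sketch: fair die, G = odd/even partition.
rng = np.random.default_rng(0)
omega = np.arange(1, 7)
p = np.full(6, 1 / 6)
X = omega.astype(float)
odd = omega % 2 == 1

def expect(W):
    return np.sum(W * p)

Y = np.where(odd, X[odd].mean(), X[~odd].mean())   # E[X|G]: block averages (p is uniform)
best = expect((X - Y) ** 2)                        # E[(X - Y)^2]

for _ in range(5):
    c0, c1 = rng.normal(size=2)
    Z = np.where(odd, c0, c1)                      # an arbitrary G-measurable Z
    lhs = expect((X - Z) ** 2)
    rhs = best + expect((Z - Y) ** 2)              # Pythagorean decomposition
    print(np.isclose(lhs, rhs), lhs >= best)       # identity holds; Y is never beaten
```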

  • I think you want to assume that $Z$ is $L^2$. Also, why is $Y \in L^2$? Your argument shows that *if* $Y \in L^2$ then it minimizes the distance of $X$ to $L^2(\mathcal{G})$. (2011-12-11)
  • @t.b.: You're right. I am assuming that he is minimizing over all $\mathcal{G}$-measurable functions in $L^2$. Thanks. For your second point, $Y \in L^2$ because $X \in L^2$: one way of seeing this is by using a version of Jensen's inequality for conditional expectation, $|Y|^2 \leq E[|X|^2|\mathcal{G}]$, so $E[|Y|^2] \leq E[|X|^2] < \infty$. (2011-12-11)
  • Thanks for that. Maybe you want to add that to your answer, and I'll upvote then (if you could add a short remark on a.e. uniqueness in the projection version also raised in the question, that would be even better :) ). (2011-12-11)
  • Can you mathematically expand on the step where $X-Y$ being orthogonal to any $\mathcal{G}$-measurable $Z$ leads to $Y$ minimizing the $L_2$ norm? (2011-12-11)
  • @sam.t: Are you familiar with projections? The theorem that you mention can be understood as saying that the conditional expectation of $X$ with respect to $\mathcal{G}$ is the orthogonal projection of $X$ onto the set of $L^2$ functions that are $\mathcal{G}$-measurable. One way of understanding the relationship to the minimization property: if $E[(X-Y)Z] = 0$ for every $\mathcal{G}$-measurable $Z$, then $E[(X-Z)(X-Z)] = E[(X-(Y+Z-Y))(X-(Y+Z-Y))] = E[(X-Y)(X-Y)]+E[(Z-Y)(Z-Y)]$, which is minimized when $Z-Y \equiv 0$. (2011-12-11)
  • @t.b.: Can you elaborate on what you mean by remarking on the a.e. uniqueness in the projection? I thought that conditional expectation is usually only defined up to a.e. equivalence classes. (2011-12-11)
  • That's what I meant; sorry for being unclear. In the OP, one definition was given as an orthogonal projection (which is unique by Hilbert space geometry), and I read the last part as the definition via Radon–Nikodym. Since the OP is happy, I won't insist on elaborating. Thanks for the effort, and +1. (2011-12-11)