
This question is prompted by a recent discussion (Conditional Expectation a Decreasing Function Implies Covariance is nonpositive) about the relationship between conditional expectation and covariance.

Suppose that $X$ and $Y$ are zero-mean unit-variance random variables with covariance (and correlation coefficient) $\rho$. The minimum mean-square error (MMSE) estimator of $Y$ given $X$ is the random variable $g(X)$ that minimizes $E[(Y-g(X))^2]$, and as is well known, this minimizer is the conditional expectation $g(X) = E[Y \mid X]$. It is also well known that $E[g(X)] = E[E[Y\mid X]] = E[Y] = 0$. In general, $g(X)$ is a nonlinear function of $X$. On the other hand, if the estimator is restricted to the form $\hat{Y} = aX + b$ where $a$ and $b$ are real numbers, then the linear MMSE estimator of $Y$ given $X$ is $\hat{Y} = \rho X$; that is, $a = \rho$, $b = 0$ minimizes $E[(Y-aX-b)^2]$. The linear MMSE estimator $\rho X$ has mean-square error $E[(Y-\rho X)^2] = 1 - \rho^2$, and so the mean-square error of the MMSE estimator $g(X)$ can be no larger:
$E[(Y-g(X))^2] \leq 1 - \rho^2.$
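
To make the comparison concrete, here is a minimal numerical sketch (Python, using an invented example rather than anything from the discussion above): $X$ is standard normal and $Y$ is constructed so that $g(X) = E[Y\mid X]$ is a known nonlinear function; the constants `alpha`, `beta`, `gamma` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6

# Hypothetical construction: X standard normal, Y built so that
# E[Y | X] = (alpha*X + beta*(X**2 - 1)) / sigma is nonlinear in X.
alpha, beta, gamma = 0.8, 0.4, 0.5
sigma = np.sqrt(alpha**2 + 2*beta**2 + gamma**2)    # Var(X**2 - 1) = 2, so Var(Y) = 1
X = rng.standard_normal(n)
Z = rng.standard_normal(n)                          # independent noise
Y = (alpha*X + beta*(X**2 - 1) + gamma*Z) / sigma   # zero mean, unit variance
g = (alpha*X + beta*(X**2 - 1)) / sigma             # g(X) = E[Y | X] by construction

rho = np.mean(X * Y)                     # sample covariance = correlation here
mse_mmse   = np.mean((Y - g)**2)         # error of the (nonlinear) MMSE estimator
mse_linear = np.mean((Y - rho*X)**2)     # error of the linear MMSE estimator rho*X

print(mse_mmse, mse_linear, 1 - rho**2)  # mse_mmse <= 1 - rho^2, mse_linear ≈ 1 - rho^2
```

With these constants the MMSE error comes out near $\gamma^2/\sigma^2 \approx 0.21$, while $1 - \rho^2 \approx 0.47$, so the inequality can be strict when $g$ is genuinely nonlinear.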

A simplified version of the question in the previous discussion is: if $g(\cdot)$ is a decreasing function of its argument, show that $\rho$ is nonpositive.
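
As a sanity check of that statement, here is a small simulation with an invented decreasing case: take $X$ standard normal and $Y = (-X^3 + Z)/4$ with $Z$ independent standard normal, so that $g(X) = E[Y\mid X] = -X^3/4$ is decreasing and one expects $\rho = -E[X^4]/4 = -3/4$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10**6

# Hypothetical decreasing example: E[Y | X] = -X**3 / 4, decreasing in X.
# Var(Y) = (E[X**6] + 1)/16 = (15 + 1)/16 = 1 and E[Y] = 0.
X = rng.standard_normal(n)
Y = (-X**3 + rng.standard_normal(n)) / 4

print(np.mean(Y), np.var(Y))   # ≈ 0 and ≈ 1
print(np.mean(X * Y))          # ≈ -0.75: nonpositive, as the claim predicts
```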

My question is: what is the linear MMSE estimate of $g(X) = E[Y \mid X]$ given $X$? That is, what choice of real numbers $c$ and $d$ minimizes $E[(g(X) - cX - d)^2]$? Since $g(X)$ and $X$ both have zero mean and $X$ has unit variance, standard linear MMSE estimator theory gives $d = 0$ and $c = \frac{\text{cov}(g(X),X)}{\text{var}(X)} = \text{cov}(g(X),X) = E[Xg(X)],$ which I think might work out to be $\rho$, but I am not sure about this. Any suggestions on how to proceed further would be appreciated.
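
In case it helps, here is a quick Monte Carlo check of the conjecture $c = E[Xg(X)] = \rho$, reusing the same invented construction as in the sketch above (so `alpha`, `beta`, `gamma` are again arbitrary and $g$ is known in closed form):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10**6

# Same hypothetical construction: E[Y | X] = (alpha*X + beta*(X**2 - 1)) / sigma.
alpha, beta, gamma = 0.8, 0.4, 0.5
sigma = np.sqrt(alpha**2 + 2*beta**2 + gamma**2)
X = rng.standard_normal(n)
Y = (alpha*X + beta*(X**2 - 1) + gamma*rng.standard_normal(n)) / sigma
g = (alpha*X + beta*(X**2 - 1)) / sigma

c   = np.mean(X * g)   # slope of the linear MMSE estimate of g(X) given X
rho = np.mean(X * Y)   # covariance (= correlation) of X and Y

print(c, rho)          # the two agree up to Monte Carlo error
```

Of course this only checks one example; the question is whether $E[Xg(X)] = \rho$ holds in general.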

  • @MichaelHardy I thought about editing the question to say something like "straight-line MMSE estimation" instead of "linear MMSE estimation" but decided against it because linear MMSE estimation is reasonably well established, at least in the engineering literature: Google provides over $900,000$ hits. But thanks for your answer, which I am accepting. I was able to show $E[Xg(X)] = \rho$ for discrete and for jointly continuous random variables but wanted a proof that did not rely on special cases, and your answer gave me exactly what I wanted. (2011-12-04)

1 Answer


Your conjecture is correct. By the law of total expectation we have
$$\begin{align} E(X(Y-g(X))) & = E(\;E(X(Y-g(X))\mid X)\;) \\ & = E(\; E(XY\mid X) - E(Xg(X)\mid X)\;) \\ & = E(\; XE(Y\mid X) - Xg(X) \;) \\ & = E( Xg(X) - Xg(X)) = 0. \end{align}$$
Therefore $E(XY) = E(Xg(X))$. Since $X$ and $Y$ have zero mean and unit variance, $E(XY) = \operatorname{cov}(X,Y) = \rho$, and so the optimal slope is indeed $c = E(Xg(X)) = \rho$.
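
For anyone who wants to see the tower-property step numerically, here is a small sketch under the same invented construction used in the sketches above: the residual $Y - g(X)$ has conditional mean zero given $X$, so it is uncorrelated with $X$ (indeed with any function of $X$), which is exactly what makes $E(XY) = E(Xg(X))$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10**6

# Same hypothetical construction: E[Y | X] = (alpha*X + beta*(X**2 - 1)) / sigma.
alpha, beta, gamma = 0.8, 0.4, 0.5
sigma = np.sqrt(alpha**2 + 2*beta**2 + gamma**2)
X = rng.standard_normal(n)
Y = (alpha*X + beta*(X**2 - 1) + gamma*rng.standard_normal(n)) / sigma
g = (alpha*X + beta*(X**2 - 1)) / sigma

resid = Y - g                      # E[resid | X] = 0 by construction
for h in (X, X**2, np.sin(X)):
    print(np.mean(h * resid))      # all ≈ 0: resid is orthogonal to functions of X

print(np.mean(X * Y), np.mean(X * g))   # E(XY) ≈ E(Xg(X)) ≈ rho
```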