Suppose we have some data $\{data\}$ arising from a distribution described by an unknown set of parameters $X_0$ that we want to estimate. The maximum likelihood procedure (equivalently, MAP under a uniform prior) provides the following estimator:
$\hat{MLE} = \operatorname{argmax}_Y \, L(\{data\}\mid Y)$
which we can regard as a random variable, since the data are random. (I use a hat to indicate random variables.) The likelihood itself can also be regarded as a random variable (parametrically dependent on $Y$; a random density, if we like), $\hat{L}(\{data\}\mid Y)$. Assuming it is normalized so that it integrates to one over $Y$, it seems possible to define as well:
$\hat{\mu}=\int Y \, \hat{L}(\{data\}\mid Y) \, dY$
and
$\hat{\sigma}^2=\int (Y -\hat{\mu})^2 \, \hat{L}(\{data\}\mid Y) \, dY$
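As a concrete sketch of these definitions (my own illustration, not part of the question: I take Gaussian data with unknown mean $Y$, known unit variance, and a flat prior, so the normalized likelihood over $Y$ is the posterior), the quantities above can be computed numerically on a grid:

```python
import numpy as np

# Assumed model: x_i ~ N(Y, 1) with true parameter X_0 = 2; flat prior on Y.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=20)

Y = np.linspace(-2.0, 6.0, 2001)                  # grid over candidate Y
dY = Y[1] - Y[0]
loglik = -0.5 * ((data[:, None] - Y[None, :]) ** 2).sum(axis=0)
L = np.exp(loglik - loglik.max())                 # unnormalized likelihood
L /= L.sum() * dY                                 # normalize: integral over Y = 1

mle = Y[np.argmax(L)]                             # \hat{MLE}
mu_hat = (Y * L).sum() * dY                       # \hat{\mu}
sigma2_hat = ((Y - mu_hat) ** 2 * L).sum() * dY   # \hat{\sigma}^2

print(mle, mu_hat, sigma2_hat)
```

In this particular model the three quantities come out as expected: $\hat{\mu}$ and $\hat{MLE}$ both coincide with the sample mean (up to grid spacing), and $\hat{\sigma}^2$ equals $1/N$.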
How are these random variables related? Should we generally expect (maybe not always...) that:
1. $E[\hat{MLE}] = X_0$ ?
2. $E[\hat{\sigma}^2] = \sigma^2[\hat{MLE}]$ ?
3. $E[\hat{\mu}] = X_0$ ?
I know that point (1) is not always true (the MLE of the variance has a factor of $1/N$ instead of $1/(N-1)$), but what about the others? And by "how much" does point (1) fail? I guess something close to it must hold...
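A quick Monte Carlo sketch (my own check, using the very example I mentioned: Gaussian data with the variance as the unknown parameter) makes the failure of point (1) quantitative. The MLE of the variance has expectation $\frac{N-1}{N}\sigma^2$, so the bias is of order $1/N$ and vanishes asymptotically:

```python
import numpy as np

# Assumed setup: N samples of N(0, sigma^2), true parameter X_0 = sigma^2 = 1.
rng = np.random.default_rng(1)
N, trials, sigma2 = 10, 200_000, 1.0

x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))
# MLE of the variance: divide by N (not N-1)
var_mle = ((x - x.mean(axis=1, keepdims=True)) ** 2).mean(axis=1)

print(var_mle.mean())   # close to (N-1)/N * sigma2 = 0.9, not 1.0
```

With $N=10$ the average of $\hat{MLE}$ over many datasets sits near $0.9$, i.e. $E[\hat{MLE}] = \frac{N-1}{N} X_0$ rather than $X_0$.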