
I have a positive continuous random variable $X$. I want to approximate its distribution with a discrete distribution and calculate $E[X]$ from that discrete distribution. The obvious approach is to divide the range of the random variable into steps of size $h$. Let the CDF values at the points $0, h, 2h, \ldots$ be $P_0, P_1, P_2, \ldots$.

So, $\mathrm{Prob}(0 < X \leq h) =P_1-P_0$, $\mathrm{Prob}(h < X \leq 2h) = P_2-P_1$; and so on.

Now each of these probability masses is associated with an interval, and we need a representative point for each interval. For an interval $(a,b]$, which point should we take as the representative point to get a good estimate of $E[X]$: the leftmost point, the rightmost point, or the midpoint? This is my question.

  • No, I don't have access to the density; I only have access to the CDF values at discrete points. (2011-10-28)

1 Answer


It's not very clear whether to "calculate $E[X]$" means to estimate it from an estimate of the density (which is in turn approximated by a discrete density). In any case, intuition suggests that if we discretize the variable's domain uniformly and assume no further knowledge, the "representative point" should be the midpoint of each interval (and not only for computing the mean). This strategy can be justified by comparing the true mean $\mu_x = \int x f_x(x)\, dx = \sum_k \int_{k h}^{(k+1)h} x f_x(x)\, dx$

with the "discretized mean" $\mu'_x = \sum_k x_k p_k$

where $x_k$ is the "representative point", and we assume that $p_k$, the discrete probability function, equals the probability that $x$ falls in the respective interval: $p_k = \int_{k h}^{(k+1)h} f_x(x)\, dx$. Then, since we want $\mu'_x \to \mu_x$, we'd want

$\int_a^b (x_k-x)f_x(x)\, dx$

to be zero for each interval $(a,b] = (kh,(k+1)h]$. If we assume the interval is small enough that $f_x(x)$ can be approximated by a constant, we get $x_k=(a+b)/2$, i.e., the interval midpoint. More generally, solving for $x_k$ gives $x_k = \int_a^b x f_x(x)\, dx \big/ \int_a^b f_x(x)\, dx$, which is the conditional expectation of $x$ given that $x$ falls inside the respective interval.
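To make the comparison concrete, here is a minimal Python sketch that estimates $E[X]$ using only CDF values on the grid $0, h, 2h, \ldots$, with the left endpoint, midpoint, and right endpoint as representative points. The test distribution (Exponential(1), true mean 1), the truncation point, and the function names `discretized_mean` and `exp_cdf` are assumptions chosen for illustration; they are not part of the question.

```python
import math

def discretized_mean(cdf, h, n_bins, rule="mid"):
    """Estimate E[X] for positive X from CDF values at 0, h, ..., n_bins*h."""
    offsets = {"left": 0.0, "mid": 0.5, "right": 1.0}
    offset = offsets[rule]
    total = 0.0
    for k in range(n_bins):
        a, b = k * h, (k + 1) * h
        p_k = cdf(b) - cdf(a)      # probability mass of the interval (a, b]
        x_k = a + offset * h       # representative point of the interval
        total += x_k * p_k
    return total

# Illustrative assumption: X ~ Exponential(1), so F(x) = 1 - exp(-x), E[X] = 1.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

h = 0.1
n_bins = 400  # grid covers (0, 40]; the mass beyond 40 is negligible here

estimates = {
    rule: discretized_mean(exp_cdf, h, n_bins, rule)
    for rule in ("left", "mid", "right")
}
for rule, value in estimates.items():
    print(f"{rule:>5}: {value:.6f}  (error {value - 1.0:+.6f})")
```

For this example the left and right endpoints should miss the true mean by roughly $h/2$ in opposite directions, while the midpoint error is much smaller. Note that the range has to be truncated somewhere, so any mass beyond the last grid point is silently ignored.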

  • What about starting with some step size $h$ and then using $h/2$, $h/4$, etc. until we meet the error bound? (2011-10-29)
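The step-halving idea from this comment could be sketched as follows. The comment does not say how the error is measured, so this sketch assumes a stopping rule based on the difference between two successive midpoint estimates, which is only a heuristic and not a guaranteed error bound. The Exponential(1) test CDF, the truncation point `x_max`, and the names `midpoint_mean` and `refine_until_stable` are likewise assumptions for illustration.

```python
import math

def midpoint_mean(cdf, h, x_max):
    """Midpoint estimate of E[X] from CDF values at 0, h, 2h, ... up to x_max."""
    n_bins = math.ceil(x_max / h)
    total = 0.0
    for k in range(n_bins):
        a, b = k * h, (k + 1) * h
        total += 0.5 * (a + b) * (cdf(b) - cdf(a))
    return total

def refine_until_stable(cdf, h0, x_max, tol=1e-6, max_halvings=20):
    """Halve h until two successive estimates differ by less than tol.

    Assumed stopping rule: the change between successive estimates is used
    as a proxy for the error, which the original comment leaves unspecified.
    """
    h = h0
    previous = midpoint_mean(cdf, h, x_max)
    for _ in range(max_halvings):
        h /= 2.0
        current = midpoint_mean(cdf, h, x_max)
        if abs(current - previous) < tol:
            return current, h
        previous = current
    raise RuntimeError("estimates did not stabilize within max_halvings")

# Illustrative assumption: X ~ Exponential(1), so the true mean is 1.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

estimate, final_h = refine_until_stable(exp_cdf, h0=1.0, x_max=40.0, tol=1e-6)
print(f"estimate = {estimate:.8f} with final step h = {final_h}")
```

Each halving reuses the old grid points, so if CDF evaluations are expensive, the values at $0, h, 2h, \ldots$ could be cached and only the new in-between points computed.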