
I want to estimate the pdf of a polynomial transformation of a known continuous random variable $X$:

$Y = f(X) = a_nX^n + \cdots + a_1X + a_0$

The only numerical approach I know of is simulation. Here is what I tried instead:

Evaluate the pdf of $Y$ at discrete points by solving a root-finding problem:

$P(a_nX^n + \cdots + a_1X + a_0 = c) = P(X = \alpha)$, where $\alpha$ is the root of $a_nX^n + \cdots + a_1X + a_0 - c = 0$.

I do this for varying $c$. I then evaluate the known continuous density of $X$ at these $\alpha$'s, and fit a curve through the resulting points as an approximation to the pdf of $Y$.

When I try this, the shape of the curve is right; however, it is not normalized.

I believe I am making an error by assigning probabilities to single points of a continuous distribution (which should be 0 by definition), but why is it that the shape of the curve is still right?

  • 0
    Multiple solutions are not the main issue, and in my example there is only one solution. The point is that if $X$ is uniform, your procedure would always result in a uniform $Y$, and that's wrong. 2011-12-02

1 Answer

0

Your problem is not really related to discretization, but to a misunderstanding of changes of variables. It's true that the probability function of a discrete variable directly gives you the probability $P(X=x)$ of each event, and hence it's enough to find the inverse mapping; but for continuous variables things are a little more subtle. A continuous density (say, for the sake of simplicity, that $X$ is a continuous random variable whose density is also continuous) gives you (informally), when multiplied by the width of a small interval around $x$, the probability that the variable falls inside that interval:

$f_X(x) \; \Delta x \approx P(x-\frac{\Delta x}{2} \leq X < x+\frac{\Delta x}{2}) \; $

(the approximation is to be understood as a limit).

Now, if you have a transformation $Y = g(X)$, then to get the corresponding $f_Y(y)$ it's not enough to find the inverse mapping $x = g^{-1}(y)$ (analytically or numerically, and leaving aside temporarily the issue of multiple solutions) and then write $f_Y(y) = f_X(g^{-1}(y))$. That's apparently what you've done, and that is wrong, because the events "X equals $x$" cannot be assigned a probability. The correct way would be to put in correspondence the events "X falls in this neighborhood of $x$" and "Y falls in this neighborhood of $y$" and take the limit:

$ f_Y(y) \; \Delta y \approx f_X(x) \; \Delta x $

$ f_Y(y) \approx \frac{f_X(g^{-1}(y))}{\frac{\Delta y}{\Delta x}} $

If $g(X)$ is monotonically increasing, the denominator tends to $g'(x)=g'(g^{-1}(y))$, and you get the standard formula for change of variables. If not, you have a little more work to do (as in your case), but at least you get the idea.
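
To make the effect of the $g'$ factor concrete, here is a minimal numerical sketch. Everything specific in it is an assumption chosen for illustration and is not taken from the question: $X$ is standard normal, the polynomial is $g(x) = x^3 + x$ (strictly increasing, so there is exactly one root), and the inverse is found by plain bisection. It compares the area under the uncorrected curve $f_X(g^{-1}(y))$ with the area under $f_X(g^{-1}(y))/g'(g^{-1}(y))$.

```python
import math

def pdf_x(x):
    """Standard normal density (assumed distribution of X for this sketch)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def g(x):
    """Hypothetical strictly increasing polynomial g(x) = x^3 + x."""
    return x**3 + x

def g_prime(x):
    """Derivative of g; always >= 1, so g is invertible."""
    return 3.0 * x**2 + 1.0

def g_inverse(y, lo=-10.0, hi=10.0, tol=1e-12):
    """Solve g(x) = y by bisection (valid because g is increasing on [lo, hi])."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pdf_y_naive(y):
    """Uncorrected curve: density of X evaluated at the root."""
    return pdf_x(g_inverse(y))

def pdf_y(y):
    """Change-of-variables density: divide by g'(x) at the root x."""
    x = g_inverse(y)
    return pdf_x(x) / g_prime(x)

def trapezoid(func, a, b, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

# Integrate over the image of [-8, 8]; the normal mass outside is negligible.
y_lo, y_hi = g(-8.0), g(8.0)
area_naive = trapezoid(pdf_y_naive, y_lo, y_hi, 20000)
area_correct = trapezoid(pdf_y, y_lo, y_hi, 20000)

print("area under uncorrected curve:", area_naive)
print("area under corrected density:", area_correct)
```

Under these assumptions the uncorrected area equals $E[g'(X)] = E[3X^2 + 1] = 4$ rather than 1, while the corrected density integrates to approximately 1. For a non-monotone polynomial the same idea applies root by root, using $|g'|$ at each real root and summing the contributions.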

  • 1
    Thank you for the explanation. However, I do not fully agree with you that "'X equals x' cannot be assigned a probability": if we apply the definition of what a random variable is (a mapping between two measurable spaces), then that probability should be 0. Anyway, I will use the cdf, as that avoids the problem. 2011-12-02