10

I've been going back over my notes from Stats class and came across the Probability Integral Transform. From my limited understanding, the basic idea is that a cdf in terms of one variable can be transformed into another cdf in terms of a different variable:

  • i.e. from $F_X(x)$ to $F_Y(y)$

Is this understanding correct? What is the purpose behind this? Finally, is there a general procedure for performing the transformation?

1 Answer

10

Your understanding looks basically correct to me.

As far as purpose, I've seen it used mostly to generate random variables from continuous distributions. For instance, if $X$ has a $U(0,1)$ distribution, then $F_X(x) = x$ for $0 \le x \le 1$. Thus the requirement $F_X(x) = F_Y(y)$ in the probability integral transform reduces to $x = F_Y(y)$, or $y = F_Y^{-1}(x)$ when $F_Y$ is invertible. Since $y$ is then an observation of the random variable $Y$, we can generate observations from the distribution of $Y$ by generating $U(0,1)$ random variables (which most software programs can do easily) and applying the $F_Y^{-1}$ transformation.
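To see why this works, here is a short check, assuming $F_Y$ is continuous and strictly increasing (so that $F_Y^{-1}$ exists) and writing $U$ for a $U(0,1)$ random variable (the symbol $U$ is my own notation here):

$$P\left(F_Y^{-1}(U) \le y\right) = P\left(U \le F_Y(y)\right) = F_Y(y).$$

The first equality applies the increasing function $F_Y$ to both sides of the inequality, and the second uses $P(U \le u) = u$ for $0 \le u \le 1$. So $F_Y^{-1}(U)$ has cdf $F_Y$, which is exactly what we want.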

For example, suppose you want to generate instances of an exponential$(\lambda)$ random variable. For $y \ge 0$, the cdf is $$F(y) = \int_0^y \lambda e^{-\lambda t} dt = 1 - e^{-\lambda y}.$$ Setting $x = F(y)$ and solving for $y$, we have $$x - 1 = - e^{-\lambda y} \Rightarrow -\lambda y = \ln (1- x) \Rightarrow y = F^{-1}(x) = -\ln(1-x)/\lambda.$$

Thus if $x$ is an observation from a $U(0,1)$ distribution, then $y = -\ln(1-x)/\lambda$ is an observation from an exponential$(\lambda)$ distribution. Moreover, $x$ having a $U(0,1)$ distribution is equivalent to $1-x$ having a $U(0,1)$ distribution, so we often express the transformation as $y = -\ln x/\lambda$.
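As a rough illustration, the recipe above could be sketched in Python as follows. This is only a minimal sketch using the standard library; the function name `sample_exponential` and its parameters `lam`, `n`, and `seed` are hypothetical names chosen for this example.

```python
import math
import random


def sample_exponential(lam, n, seed=None):
    """Draw n exponential(lam) observations by inverting the cdf.

    Hypothetical helper for illustration: each U(0,1) draw x is
    mapped to y = -ln(1 - x) / lam.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = rng.random()  # uniform on [0, 1)
        # 1 - x lies in (0, 1], so the logarithm is always defined.
        samples.append(-math.log(1.0 - x) / lam)
    return samples


if __name__ == "__main__":
    lam = 2.0
    ys = sample_exponential(lam, 100_000, seed=0)
    # The sample mean should be close to the theoretical mean 1 / lam.
    print("sample mean:", sum(ys) / len(ys), "theoretical mean:", 1.0 / lam)
```

I used $1 - x$ rather than $x$ inside the logarithm only because `random.random()` can return exactly 0 but never 1; in distribution the two forms are the same, as noted above.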

As far as a general procedure for performing the transformation, what I've done here with the uniform and exponential distributions should give you a guide. Unfortunately, though, there aren't that many commonly-used distributions for which the cdf can be inverted analytically.
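When no closed-form inverse is available, one possible workaround (not something I used above) is to invert the cdf numerically. The sketch below does this by bisection for the standard normal cdf. The names `normal_cdf` and `invert_cdf` and the search bracket $[-10, 10]$ are assumptions made for this example, and it assumes the cdf is continuous and nondecreasing with the target value attained inside the bracket.

```python
import math


def normal_cdf(y):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def invert_cdf(cdf, x, lo=-10.0, hi=10.0, tol=1e-10):
    """Approximate y with cdf(y) = x by bisection on [lo, hi].

    Assumes cdf is continuous and nondecreasing, and that
    cdf(lo) <= x <= cdf(hi); the default bracket is an assumption
    that suits the standard normal, not a general-purpose choice.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < x:
            lo = mid  # the solution lies to the right of mid
        else:
            hi = mid  # the solution lies at or to the left of mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Map a uniform value x to an approximate standard normal observation.
    print(invert_cdf(normal_cdf, 0.975))
```

Feeding $U(0,1)$ draws through `invert_cdf` then plays the role that the explicit formula $-\ln(1-x)/\lambda$ played in the exponential example, at the cost of many cdf evaluations per observation.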

  • 1
    "Unfortunately, though, there aren't that many commonly-used distributions for which the cdf can be inverted analytically." - indeed. :( 2011-04-20
  • 0
    Mike, I have a question. So when you say we want to generate instances of an exponential random variable, you are saying our objective is to obtain a realization from a random experiment whose distribution is exponential, correct? So the value of the Probability Integral Transform is that if we have the means of generating realizations from the standard uniform distribution, we can easily transform them (like you did above by solving for y) and get realizations from the exponential distribution, correct? 2016-12-21
  • 0
    My understanding is that if you have a column of, say, 1,000 standard uniform random variable realizations in Excel, plug them into the -ln(x)/lambda formula in the next column, and plot a histogram of the results, you will get something pretty close to the exponential density? 2016-12-21
  • 0
    @FrankSwanton: That's exactly right. 2017-01-03