
I have a question regarding the probability distribution of the difference of two random variables.

Suppose we have $n$ exponential random variables $X_1, X_2, \ldots, X_n$ that are independent and identically distributed with parameter $\lambda$. Define the random variable $Y_i$ as the $i$-th smallest of the $X_k$. For example, $Y_1 = \min(X_1, X_2, \ldots, X_n)$, and $Y_2$ is the second smallest of the $X_k$.

I understand the following two facts: 1) Exponential random variables are memoryless. That is, if $X$ is an exponential random variable, $P(X \geq t + t_0 | X \geq t_0) = P(X \geq t)$.

and 2) If $X_1, X_2, \ldots, X_k$ are independent exponential random variables, each with parameter $\lambda$, then the random variable $Y_1 = \min(X_1, X_2, \ldots, X_k)$ is exponential with parameter $k\lambda$, that is, it has the probability density function $f(y_1) = k \lambda e^{-k \lambda y_1}$ for $y_1 > 0$.

Then, my question is: how do I get the probability density function of $Y_2 - Y_1$, the second smallest exponential random variable minus the smallest one? I understand that the expected value of the minimum of $(n-1)$ exponential random variables with the same parameter $\lambda$ is $\frac{1}{(n-1)\lambda}$ (since we can use fact 2 above with $k = n-1$), but I do not see why $E(Y_2-Y_1)$ equals that quantity.

I believe strongly that fact 1 is involved, but I do not see how $Y_2 - Y_1$ is related to $(Y_2 - Y_1 | Y_1 = y_1)$.

Thanks everyone.

  • 0
    This shows up, for example (oftentimes in slightly disguised form), in standard proofs of the limiting distribution of the [Kolmogorov-Smirnov statistic](http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Kolmogorov.E2.80.93Smirnov_test). 2011-06-02

1 Answer

A simple method is to compute the distribution of $(Y_1,Y_2)$. Pick some positive $y_1<y_2$. What does it take to know that $Y_1\in(y_1,y_1+\mathrm{d}y_1)$ and $Y_2\in(y_2,y_2+\mathrm{d}y_2)$? You have to choose:

(1) the index of the variable that attains the smallest value, for which there are $n$ possibilities,

(2) the index of the variable that attains the second smallest value, for which there are $n-1$ possibilities,

(3) that these two random variables are indeed in the intervals $(y_1,y_1+\mathrm{d}y_1)$ and $(y_2,y_2+\mathrm{d}y_2)$, which happens with respective probabilities $\lambda \mathrm{e}^{-\lambda y_1}\mathrm{d}y_1$ and $\lambda \mathrm{e}^{-\lambda y_2}\mathrm{d}y_2$,

and (4) finally that the $n-2$ other random variables are greater than $y_2$, which happens with probability $\mathrm{e}^{-(n-2)\lambda y_2}$.

All this yields the probability density function of $(Y_1,Y_2)$ as $$ n(n-1)\lambda^2 \mathrm{e}^{-\lambda y_1-(n-1)\lambda y_2}\mathbf{1}_{0<y_1<y_2}. $$

From this, computing $E(Y_2-Y_1)$ is straightforward. For example, introducing $Z_2=Y_2-Y_1$, so that $y_2=y_1+z_2$ (a change of variables with Jacobian $1$), one sees that the probability density function of $(Y_1,Z_2)$ is $$ n(n-1)\lambda^2 \mathrm{e}^{-n\lambda y_1-(n-1)\lambda z_2}\mathbf{1}_{y_1>0}\mathbf{1}_{z_2>0}. $$

This is a product distribution, hence $Y_1$ and $Z_2$ are independent with well known distributions, exponential with parameters $n\lambda$ and $(n-1)\lambda$, respectively. In particular, $E(Y_2-Y_1)=\frac{1}{(n-1)\lambda}$.
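If you want to sanity-check this numerically, here is a rough Monte Carlo sketch using only the Python standard library. The function name `simulate_gap` and the values $n=5$, $\lambda=2$ are arbitrary choices made for illustration, not part of the argument above. It compares the sample means of $Y_1$ and $Y_2-Y_1$ with $\frac{1}{n\lambda}$ and $\frac{1}{(n-1)\lambda}$, and checks that their sample covariance is close to zero (which is consistent with independence but of course does not prove it).

```python
import random
import statistics


def simulate_gap(n, lam, trials, seed=0):
    """Draw `trials` samples of (Y1, Y2 - Y1) for n i.i.d. Exp(lam) variables.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    y1_samples, gap_samples = [], []
    for _ in range(trials):
        # Sort one sample of n exponentials to get the order statistics.
        xs = sorted(rng.expovariate(lam) for _ in range(n))
        y1_samples.append(xs[0])
        gap_samples.append(xs[1] - xs[0])
    return y1_samples, gap_samples


n, lam = 5, 2.0  # illustrative values, chosen arbitrarily
y1, gap = simulate_gap(n, lam, trials=100_000)

mean_y1 = statistics.fmean(y1)
mean_gap = statistics.fmean(gap)
# Sample covariance of Y1 and Y2 - Y1; should be near 0 if they are independent.
cov = statistics.fmean((a - mean_y1) * (b - mean_gap) for a, b in zip(y1, gap))

print("mean of Y1      :", mean_y1, " theory:", 1 / (n * lam))
print("mean of Y2 - Y1 :", mean_gap, " theory:", 1 / ((n - 1) * lam))
print("covariance      :", cov)
```

With a large number of trials the two sample means should land close to the theoretical values, and the covariance should be small compared with the product of the two standard deviations.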

  • 0
    Can I do step (2) (Change of variables) to any p.d.f. or does the substitution have to be linear? So if I have $f(x,y)$, can I create $f(x,z)$, with $z = g(x,y)$, $g()$ being any arbitrary function? 2011-06-03