4

If $X$ and $Y$ are two independent random variables with probability density functions $f$ and $g$, respectively, then the probability density of the difference $Y - X$ is given by the cross-correlation $f \star g$. In contrast, the convolution $f * g$ gives the probability density function of the sum $X + Y$.

What I would like to have is the probability density function of $|Y-X|$ or $(Y-X)^2$. Is that possible?

Is that equivalent to doing the following with two histograms?

    distance = 0
    for b1 in Histogram1:          # iterate over the bins of the first histogram
        for b2 in Histogram2:      # and the bins of the second
            prob = b1.probability * b2.probability
            distance += prob * abs(b1.value - b2.value)
  • **Cross-post** (simultaneous): http://stats.stackexchange.com/questions/12534/calculate-the-probability-density-fonction-of-the-absolute-difference-of-two-rand (2011-07-01)

3 Answers

4

As you said, the density of $X-Y$ is the cross-correlation $h(t) = \int_{-\infty}^\infty f(t+y) g(y) \, dy$. The density of $|X-Y|$ is $f_{|X-Y|}(t) = h(t) + h(-t)$ for $t > 0$, $0$ for $t < 0$. The density of $(X-Y)^2$ is $f_{(X-Y)^2}(t) = (h(\sqrt{t}) + h(-\sqrt{t}))/(2 \sqrt{t})$ for $t > 0$, $0$ for $t < 0$.
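
For a quick numerical check of these formulas (not part of the answer; a minimal sketch assuming $X$ and $Y$ are independent standard normals, discretized on a grid), the cross-correlation can be approximated by a Riemann sum:

    import numpy as np

    # grid and densities (assumption for the check: X, Y ~ N(0,1), independent)
    dx = 0.01
    x = np.arange(-10.0, 10.0, dx)
    f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # density of X
    g = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # density of Y

    # cross-correlation h(t) = int f(t+y) g(y) dy, i.e. the density of X - Y,
    # approximated by a Riemann sum; on this symmetric grid, h[i] approximates h(x[i])
    h = np.correlate(f, g, mode="same") * dx

    # density of |X - Y| on t > 0: h(t) + h(-t)
    t = x[x > 0]
    f_abs = np.interp(t, x, h) + np.interp(-t, x, h)

    # density of (X - Y)^2 on t > 0: (h(sqrt t) + h(-sqrt t)) / (2 sqrt t)
    f_sq = (np.interp(np.sqrt(t), x, h) + np.interp(-np.sqrt(t), x, h)) / (2 * np.sqrt(t))

    # sanity check: here X - Y ~ N(0, 2), so f_abs should match exp(-t^2/4)/sqrt(pi)
    print(np.max(np.abs(f_abs - np.exp(-t**2 / 4) / np.sqrt(np.pi))))   # should be ~ 0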

2

Expanding on Robert's answer (adapted to $Y-X$, in view of the question).

Let $h(t)=\int_{-\infty}^\infty f(x)\,g(t + x)\,dx$, $t \in \mathbb{R}$, be the cross-correlation of the densities $f$ and $g$ (of $X$ and $Y$, respectively), that is, the probability density function of $Y-X$. We want to show that, for any $t > 0$,
$$
{\rm P}(|Y-X| \leq t) = \int_0^t [h(u) + h(-u)]\,du
$$
(implying that the density of $|Y-X|$ is $h(t)+h(-t)$ for $t > 0$; for $t < 0$ the density is trivially $0$). Indeed,
$$
\int_0^t [h(u) + h(-u)]\,du = \int_0^t h(u)\,du + \int_0^t h(-u)\,du = \int_0^t h(u)\,du + \int_{-t}^0 h(u)\,du,
$$
and so from
$$
\int_0^t h(u)\,du = {\rm P}(0 \leq Y-X \leq t) \quad \text{and} \quad \int_{-t}^0 h(u)\,du = {\rm P}(-t \leq Y-X \leq 0),
$$
it follows that
$$
\int_0^t [h(u) + h(-u)]\,du = {\rm P}(-t \leq Y-X \leq t) = {\rm P}(|Y-X| \leq t).
$$
Having shown that $h(t)+h(-t)$ is the density of $|Y-X|$ (for $t > 0$), the density of $(Y-X)^2$ can be found easily as follows. For any $t > 0$,
$$
{\rm P}((Y-X)^2 \leq t) = {\rm P}(|Y-X| \leq \sqrt{t}) = \int_0^{\sqrt{t}} [h(u) + h(-u)]\,du.
$$
A change of variable $u \mapsto u^2$ then gives
$$
{\rm P}((Y-X)^2 \leq t) = \int_0^t [h(\sqrt{u}) + h(-\sqrt{u})]\,\frac{du}{2\sqrt{u}},
$$
implying that the density of $(Y-X)^2$ is $(h(\sqrt{t})+h(-\sqrt{t}))/(2\sqrt{t})$ for $t > 0$; for $t < 0$ the density is trivially $0$.
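
As a quick sanity check of this formula (not part of the answer; a sketch assuming $X$ and $Y$ are independent ${\rm Exp}(1)$ variables, in which case $h(u) = \tfrac12 e^{-|u|}$ and $h(u)+h(-u) = e^{-u}$ for $u > 0$), one can compare the claimed probability ${\rm P}(|Y-X| \leq t) = 1 - e^{-t}$ with a Monte Carlo estimate:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    # assumption for the check: X, Y ~ Exp(1) independent, so Y - X is Laplace(0,1),
    # h(u) = exp(-|u|)/2, and h(u) + h(-u) = exp(-u) for u > 0
    x = rng.exponential(1.0, n)
    y = rng.exponential(1.0, n)

    for t in (0.5, 1.0, 2.0):
        empirical = np.mean(np.abs(y - x) <= t)
        claimed = 1.0 - np.exp(-t)     # integral of [h(u) + h(-u)] = exp(-u) from 0 to t
        print(t, empirical, claimed)   # the two values should agree to about 3 decimals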

EDIT: A key point here is that (a nonnegative measurable function) $f_Z$ is a probability density function of $Z$ if and only if $F_Z(z) = \int_{-\infty}^z f_Z(u)\,du$ for all $z \in \mathbb{R}$, where $F_Z$ is the distribution function of $Z$. (In this case, $Z$ is said to be absolutely continuous with density function $f_Z$.)

EDIT: In the case of continuously differentiable distribution functions (rather than the general case of absolutely continuous distribution functions), you can obtain the above results simply as follows. Let $H$ denote the distribution function of $Y-X$ and, as above, $h$ its probability density function. Then, for any $t > 0$,
$$
{\rm P}(|Y-X| \leq t) = {\rm P}(-t \leq Y-X \leq t) = {\rm P}(Y-X \leq t) - {\rm P}(Y-X \leq -t) = H(t) - H(-t).
$$
Hence
$$
\frac{d}{dt}{\rm P}(|Y - X| \le t) = h(t) + h(-t).
$$
As for the density function of $(Y-X)^2$, first note that, for any $t > 0$,
$$
{\rm P}((Y-X)^2 \leq t) = {\rm P}(|Y-X| \leq \sqrt{t}) = H(\sqrt{t}) - H(-\sqrt{t}).
$$
Hence
$$
\frac{d}{dt}{\rm P}((Y - X)^2 \le t) = \frac{h(\sqrt{t})}{2\sqrt{t}} + \frac{h(-\sqrt{t})}{2\sqrt{t}} = \frac{h(\sqrt{t}) + h(-\sqrt{t})}{2\sqrt{t}}.
$$
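
As a concrete illustration (not part of the original answer): take $X$ and $Y$ to be independent standard normals, so that $Y - X \sim N(0,2)$ and $h(t) = \frac{1}{2\sqrt{\pi}}\,e^{-t^2/4}$. The formulas above then give, for $t > 0$,
$$
f_{|Y-X|}(t) = h(t) + h(-t) = \frac{1}{\sqrt{\pi}}\,e^{-t^2/4},
$$
the half-normal density with scale $\sqrt{2}$, and
$$
f_{(Y-X)^2}(t) = \frac{h(\sqrt{t}) + h(-\sqrt{t})}{2\sqrt{t}} = \frac{1}{2\sqrt{\pi t}}\,e^{-t/4},
$$
the ${\rm Gamma}(1/2,\ {\rm scale}\ 4)$ density, consistent with $(Y-X)^2 \sim 2\chi^2_1$.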

  • If $X$ and $Y$ are independent random variables with densities $f$ and $g$, respectively, then ${\rm E}|Y - X| = \int\!\!\int |y - x|\,f(x)g(y)\,dx\,dy$. But you can approximate this expectation simply by using ${\rm E}|Y - X| \approx \frac{\sum_{i = 1}^n |Y_i - X_i|}{n}$, for $n$ sufficiently large, where the $X_i$ and $Y_i$ are independent copies of $X$ and $Y$, respectively. (2011-07-03)
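
A minimal sketch of that Monte Carlo approximation (the distributions below are placeholders, chosen only to make the snippet self-contained):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    # placeholder distributions: X ~ N(0,1), Y ~ N(1,2); substitute your own samplers
    x = rng.normal(0.0, 1.0, n)
    y = rng.normal(1.0, 2.0, n)

    estimate = np.mean(np.abs(y - x))   # approximates E|Y - X|
    print(estimate)
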
-1

This is a very broad question.

Yes, you can find the pdf of any combination of variables; there are many ways.

The one that works most of the time uses the characteristic function:

http://en.wikipedia.org/wiki/Characteristic_function_(probability_theory)#Basic_manipulations_of_distributions

There are many powerful tools for exploiting the properties of the CF.

In any case, this process depends greatly on the type of variable: for example, it is much easier with exponential variables than with non-exponential ones.
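
For instance, here is a minimal sketch (my own illustration, not from the answer, assuming $X$ and $Y$ are independent ${\rm Exp}(1)$ variables): the CF of $Y-X$ factors as $\varphi_{Y-X}(t) = \varphi_Y(t)\,\varphi_X(-t) = \frac{1}{1+t^2}$, and inverting it numerically recovers the Laplace density $\tfrac12 e^{-|x|}$ of the difference.

    import numpy as np

    # assumption for illustration: X, Y ~ Exp(1) independent, so
    # phi_{Y-X}(t) = phi_Y(t) * phi_X(-t) = 1 / ((1 - 1j*t) * (1 + 1j*t)) = 1 / (1 + t**2)
    t = np.linspace(-200.0, 200.0, 400_001)
    dt = t[1] - t[0]
    phi = 1.0 / (1.0 + t**2)

    # inversion formula: f(x) = (1/2pi) * int exp(-i t x) phi(t) dt, as a Riemann sum
    for x0 in (0.0, 1.0, 2.0):
        f_x0 = (np.exp(-1j * t * x0) * phi).sum().real * dt / (2 * np.pi)
        print(x0, f_x0, 0.5 * np.exp(-abs(x0)))   # numerical inversion vs. Laplace density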