37
$\begingroup$

Suppose $X$ is a real-valued random variable and let $P_X$ denote the distribution of $X$. Then $ E(|X-c|) = \int_\mathbb{R} |x-c| dP_X(x). $ The medians of $X$ are defined as any number $m \in \mathbb{R}$ such that $P(X \leq m) \geq \frac{1}{2}$ and $P(X \geq m) \geq \frac{1}{2}$.

Why do the medians solve $ \min_{c \in \mathbb{R}} E(|X-c|) \, ? $

4 Answers 4

29

For every real valued random variable $X$, $ \mathrm E(|X-c|)=\int_{-\infty}^c\mathrm P(X\leqslant t)\,\mathrm dt+\int_c^{+\infty}\mathrm P(X\geqslant t)\,\mathrm dt $ hence the function $u:c\mapsto \mathrm E(|X-c|)$ is differentiable almost everywhere and, where $u'(c)$ exists, $u'(c)=\mathrm P(X\leqslant c)-\mathrm P(X\geqslant c)$. Hence $u'(c)\leqslant0$ if $c$ is smaller than every median, $u'(c)=0$ if $c$ is a median, and $u'(c)\geqslant0$ if $c$ is greater than every median.

The formula for $\mathrm E(|X-c|)$ is the integrated version of the relations $(x-y)^+=\int_y^{+\infty}[t\leqslant x]\,\mathrm dt$ and $|x-c|=((-x)-(-c))^++(x-c)^+$, which yield, for every $x$ and $c$, $ |x-c|=\int_{-\infty}^c[x\leqslant t]\,\mathrm dt+\int_c^{+\infty}[x\geqslant t]\,\mathrm dt $

  • 0
    @Vim See comment on Oct 18 at 16:11.2016-11-24
9

Let $f$ be the pdf and let $J(c) = E(|X-c|)$. We want to maximize $J(c)$. Note that $E(|X-c|) = \int_{\mathbb{R}} |x-c| f(x) dx = \int_{-\infty}^{c} (c-x) f(x) dx + \int_c^{\infty} (x-c) f(x) dx.$

To find the maximum, set $\frac{dJ}{dc} = 0$. Hence, we get that, $\begin{align} \frac{dJ}{dc} & = (c-x)f(x) | _{x=c} + \int_{-\infty}^{c} f(x) dx + (x-c)f(x) | _{x=c} - \int_c^{\infty} f(x) dx\\ & = \int_{-\infty}^{c} f(x) dx - \int_c^{\infty} f(x) dx = 0 \end{align} $

Hence, we get that $c$ is such that $\int_{-\infty}^{c} f(x) dx = \int_c^{\infty} f(x) dx$ i.e. $P(X \leq c) = P(X > c).$

However, we also know that $P(X \leq c) + P(X > c) = 1$. Hence, we get that $P(X \leq c) = P(X > c) = \frac12.$

EDIT

When $X$ doesn't have a density, all you need to do is to make use of integration by parts. We get that $\displaystyle \int_{-\infty}^{c} (c-x) dP(x) = \lim_{y \rightarrow -\infty} (c-y) P(y) + \displaystyle \int_{c}^{\infty} P(x) dx.$ Similarly, we also get that $\displaystyle \int_{c}^{\infty} (x-c) dP(x) = \lim_{y \rightarrow \infty} (y-c) P(y) - \displaystyle \int_{c}^{\infty} P(x) dx.$

  • 0
    So you are thinking $P$ as cdf of $X$?2011-11-25
7

Let $m$ be any median of $X$. Wlog, we can take $m=0$ (consider $X':=X-m$). The aim is to show $E|X-c|\ge E|X|$.

Consider the case $c\ge 0$. It is straightforward to check that $|X-c|-|X|=c$ when $X\le0$, and $|X-c|-|X|\ge -c$ when $X>0$.

It follows that $ E\left[(|X-c|-|X|)I(X\le0)\right]=cP(X\le0)\tag1 $ and $E\left[(|X-c|-|X|)I(X>0)\right]\ge-cP(X>0).\tag2 $ Adding (1) and (2) yields $ E(|X-c|-|X|)\ge c\left[P(X\le0)-P(X>0)\right].\tag3 $ The RHS of (3) equals $c[2P(X\le0)-1]$, which is non-negative since $c\ge0$ and zero is a median of $X$. The case $c\le0$ is reduced to the previous one by considering $X':=-X$ and $c':=-c$.

3

The following intends to complement Did's answer.

Claim

Denote by $M$ be the set of $X$'s medians. Then

  1. $M = [m_1, m_2]$ for some $m_1, m_2 \in \mathbb{R}$, such that $m_1 \leq m_2$.

  2. For every $m \in M$ and for every $x \in \mathbb{R}$ we have $ E\left(|X-m|\right) \leq E\left(|X-x|\right). $ (In particular, $m\mapsto E\left(|X-m|\right)$ is constant on $M$.)

Part 2's proof builds on Did's answer.

Proof

  1. It is known that $M \neq \emptyset$. Define $ \begin{align} M_1 &:= \left\{t\in\mathbb{R}\ |\!:\ F_X(t) \geq \frac{1}{2}\right\}, \\ M_2 &:= \left\{t\in\mathbb{R}\ |\!:\ P(X Then $M = M_1 \cap M_2$. It therefore suffices to show that $M_1 = [m_1, \infty)$ and that $M_2 = (-\infty, m_2]$, for some $m_1, m_2 \in \mathbb{R}$.

    Since $\lim_{t\rightarrow-\infty}F_X(t) = 0$, $M_1$ is bounded from below. Since $\lim_{t\rightarrow\infty}F_X(t) = 1$, $M_1$ is an interval that extends to infinity. Hence $M_1 = (m_1,\infty)$ or $M_1 = [m_1,\infty)$, for some $m_1 \in \mathbb{R}$. It follows from $F_X$'s right-continuity that $m_1 \in M_1$. An analogous argument shows that $M_2 = (-\infty,m_2]$ (just verify that $t\mapsto P(X is left-continuous).

  2. Define a function $f:\mathbb{R}\rightarrow\mathbb{R}$ as follows. For every $c \in \mathbb{R}$, set $ f(c) := E\left(|X-c|\right). $

    We will begin by showing that $f$ is convex. Let $a, b \in \mathbb{R}$, and let $t \in (0,1)$. Then $ \begin{align} f\left(ta+(1-t)b\right) &= E\left(\left|X-\left(ta+(1-t)b\right)\right|\right) \\ &= E\left(\left|\left(tX-ta\right)+\left((1-t)X-(1-t)b\right)\right|\right) \\ &\leq E\left(\left|\left(tX-ta\right)\right|+\left|\left((1-t)X-(1-t)b\right)\right|\right) \\ &=E\left(\left|\left(tX-ta\right)\right|\right)+E\left(\left|\left((1-t)X-(1-t)b\right)\right|\right) \\ &= t\ E\left(|X-a|\right) + (1-t)\ E\left(|X-b|\right) \\ &= t\ f(a) + (1-t)\ f(b). \end{align} $

    Since $f$ is convex, then, by Theorem 7.40 of [1] (p. 157), there exists a set $A \subseteq \mathbb{R}$ such that $\mathbb{R}\setminus A$ is countable, and such that $f$ is finitely differentiable on $A$. Moreover, letting $m \in M$, and letting $x \in (-\infty, m_1)$, Theorem 7.43 of [1] (p. 158) yields that $f'$ is Lebesgue-integrable on $[x,m] \cap A$, and that $ f(m) - f(x) = \int_{[x,m]\cap A} f'\ d\lambda. $

    Applying Did's answer, we find that $f'\leq 0$ on $[x,m]\cap A$. Hence $f(m) \leq f(x)$. Similar considerations show that, for every $x \in (m_2,\infty)$, $f(m) \leq f(x)$, and also that $f(m) = f(m_1)$ (implying that $f$ is constant on $M$, since $m$ was chosen arbitrarily in $M$).

    (The argument of the last paragraph was suggested to me by copper.hat in their answer to a related question of mine.)

Q.E.D.


References

[1] Richard L. Wheeden and Antoni Zygmund. Measure and Integral: An Introduction to Real Analysis. 2nd Ed. 2015. CRC Press. ISBN: 978-1-4987-0290-4.

  • 0
    indeed. I had been actually mainly reading the link in your answer, which seemed a bit simpler so that I could grasp it with less knowledge basis. The singleton concern arose from there but not from this answer.2016-11-24