
Suppose $\bar{X}_n$ is the mean of a random sample of size $n$ from an exponential distribution with rate parameter $\lambda > 0$. What does the following convergence statement mean (in what sense does this converge)?

$\exp \left(-\frac{1}{\bar{X}_n} \right) \xrightarrow{\ \mathrm{P}\ } \exp(-\lambda) $

More specifically, what does the $\mathrm{P}$ on top of the arrow mean? I understand it stands for probability, but what is the difference between an expression converging to some value in probability versus in distribution?


2 Answers


We say that $X_n\to X$ in distribution if $F_n(x)\to F(x)$ for all continuity points $x$ of $F$. Here, $F_n$ is the distribution function of $X_n$ and $F$ the distribution function of $X$.

On the other hand, $X_n\to X$ in probability means that for every $\varepsilon >0$, we have $P(|X_n-X|>\varepsilon)\to 0\mbox{ as } n\to\infty.$

The two concepts are similar, but not quite the same. In fact, convergence in probability is stronger, in the sense that if $X_n\to X$ in probability, then $X_n\to X$ in distribution. It doesn't work the other way around though; convergence in distribution does not guarantee convergence in probability.

As to your question "how?", the key fact is that for any continuous function $g$, if $X_n\to X$ then $g(X_n)\to g(X)$ (the continuous mapping theorem). This holds both for convergence in distribution and for convergence in probability.

Since $\bar X_n\to 1/\lambda$ in probability by the weak law of large numbers, and since $x\mapsto\exp(-1/x)$ is continuous at $1/\lambda$, it follows that $\exp(-1/\bar X_n)\to \exp(-\lambda)$ in probability as $n\to \infty$.
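
If it helps to see this numerically, here is a minimal simulation sketch in Python (not part of the original answer; the rate $\lambda = 2$, the tolerance $\varepsilon = 0.01$, and the sample sizes are assumptions made only for illustration). It estimates ${\rm P}\big(|\exp(-1/\bar X_n) - \exp(-\lambda)| > \varepsilon\big)$ by Monte Carlo; the estimate should shrink toward $0$ as $n$ grows, which is exactly what convergence in probability asserts.

```python
# Hypothetical sketch (not from the original answer): Monte Carlo check that
# exp(-1/Xbar_n) concentrates around exp(-lambda) as n grows.
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0                      # assumed rate parameter, for illustration only
eps = 0.01                     # the epsilon from the definition of convergence in probability
target = np.exp(-lam)
reps = 5000                    # number of independent samples of size n

for n in (10, 100, 1000, 10000):
    # each row is one sample of size n from exponential(rate=lam), i.e. mean 1/lam
    xbar = rng.exponential(scale=1.0 / lam, size=(reps, n)).mean(axis=1)
    prob_far = np.mean(np.abs(np.exp(-1.0 / xbar) - target) > eps)
    print(n, prob_far)         # estimate of P(|exp(-1/Xbar_n) - exp(-lambda)| > eps)
```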


In response to the OP's request, I give two elaborated examples in order to clarify the difference between convergence in probability (denoted by $\stackrel{{\rm P}}{\to}$) and convergence in distribution (denoted by $\stackrel{{\rm D}}{\to}$).

Example 1. Suppose that $X_i$, $i=1,2,\ldots$, are non-constant random variables taking values in $[0,M]$, and let $(a_n)$ be a sequence of positive numbers such that $\sum\nolimits_{i = 1}^\infty {a_i } = c < \infty$. Define $S_n = \sum\nolimits_{i = 1}^n {a_i X_i }$. Then, since the sequence $S_n$ is monotone increasing and bounded above by $Mc$, $S_n$ converges pointwise (that is, for all $\omega \in \Omega$) to a random variable $S$ taking values in $[0,Mc]$. Here we have the strongest type of convergence (sure convergence), which implies all the other kinds of convergence. In particular, as one would expect, $S_n \stackrel{{\rm P}}{\to} S$. Indeed, this can be shown directly as follows. Fix $\varepsilon > 0$. Then, for all sufficiently large $n$,
$$ {\rm P}(|S_n - S| > \varepsilon) = {\rm P}\bigg(\sum\limits_{i = n + 1}^\infty {a_i X_i } > \varepsilon \bigg) \leq {\rm P}\bigg(M \sum\limits_{i = n + 1}^\infty {a_i } > \varepsilon \bigg) = 0. $$

Now, since convergence in probability implies convergence in distribution, $S_n \stackrel{{\rm D}}{\to} S$ as well. However, the limit random variable $S$ plays no special role with regard to convergence in distribution. Indeed, take, for example, an independent copy $S'$ of $S$. Then, trivially, $S_n \stackrel{{\rm D}}{\to} S'$ (simply because $S$ and $S'$ have the same distribution function). On the other hand, the limit $S$ plays an essential role with regard to convergence in probability. In fact, it is easy to prove the following general statement: the limit in probability is unique, in the sense that if $Z_n \stackrel{{\rm P}}{\to} X$ and $Z_n \stackrel{{\rm P}}{\to} Y$, then $X = Y$ almost surely, that is, ${\rm P}(X \neq Y) = 0$.

Finally, it is worth noting that if $Z_n \stackrel{{\rm D}}{\to} Z$, and $Z$ is distributed according to a distribution $F$, then we can also write $Z_n \stackrel{{\rm D}}{\to} F$. For example, if $Z_n \stackrel{{\rm D}}{\to} Z$ where $Z \sim {\rm exponential}(\lambda)$, then we can write $Z_n \stackrel{{\rm D}}{\to} {\rm exponential}(\lambda)$.
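
A quick numerical sanity check of Example 1 (a hypothetical sketch, with the assumed concrete choices $a_i = 2^{-i}$, $X_i \sim {\rm Uniform}[0,M]$, $M = 3$, and the infinite sum truncated at $N = 60$ terms as a proxy for $S$): the empirical value of ${\rm P}(|S_n - S| > \varepsilon)$ drops to exactly $0$ once the deterministic tail bound $M\sum_{i>n} a_i$ falls below $\varepsilon$, matching the argument above.

```python
# Hypothetical sketch of Example 1 with assumed choices a_i = 2**-i, M = 3,
# X_i ~ Uniform[0, M]; the infinite sum is truncated at N terms as a proxy for S.
import numpy as np

rng = np.random.default_rng(1)
M, eps, N, reps = 3.0, 1e-3, 60, 10000
a = 2.0 ** -np.arange(1, N + 1)          # positive, summable coefficients
X = rng.uniform(0.0, M, size=(reps, N))
terms = a * X
S = terms.sum(axis=1)                    # proxy for the pointwise limit S

for n in (5, 10, 20):
    S_n = terms[:, :n].sum(axis=1)
    tail_bound = M * a[n:].sum()         # deterministic bound on |S_n - S|
    print(n, np.mean(np.abs(S_n - S) > eps), tail_bound)
```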

To further clarify the difference between convergence in probability and convergence in distribution, let's consider the fundamental case of the central limit theorem.

Example 2. Suppose that $X_1,X_2,\ldots$ is a sequence of i.i.d. random variables with expectation $\mu$ and (finite) variance $\sigma^2 > 0$. Define $S_n = X_1 + \cdots + X_n$ and $Z_n = \frac{S_n - n\mu}{\sigma \sqrt{n}}$. The central limit theorem states that $Z_n$ converges in distribution to the standard normal distribution ${\rm N}(0,1)$, that is, $Z_n \stackrel{{\rm D}}{\to} {\rm N}(0,1)$. So, given any random variable $Z \sim {\rm N}(0,1)$ (which, in particular, may be defined on a different probability space), we can write $Z_n \stackrel{{\rm D}}{\to} Z$.

On the other hand, there is no random variable $Z$ such that $Z_n \stackrel{{\rm P}}{\to} Z$. Indeed, suppose for a contradiction that $Z_n \stackrel{{\rm P}}{\to} Z$. It is an easy exercise to show that
$$ {\rm P}(|Z_n - Z_m | > 2 \varepsilon ) \le {\rm P}(|Z_n - Z| > \varepsilon ) + {\rm P}(|Z_m - Z| > \varepsilon ), $$
so the right-hand side (and hence the left-hand side) would have to tend to $0$ as $n,m \to \infty$. To reach a contradiction, it suffices to realize that $Z_n$ and $Z_m$ become asymptotically independent as $n,m \to \infty$ with $n/m \to 0$, so that ${\rm P}(|Z_n - Z_m| > 2\varepsilon)$ stays bounded away from $0$; indeed,
$$ Z_m = \sqrt{\frac{n}{m}}\, Z_n + \sqrt{\frac{m - n}{m}}\, \frac{\sum\nolimits_{i = n + 1}^m X_i - (m - n)\mu}{\sigma \sqrt{m - n}}, $$
from which it is also seen that
$$ {\rm Cov}(Z_n,Z_m) = \sqrt{\frac{n}{m}}. $$
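
To see the covariance formula numerically, here is a hypothetical sketch; the choice $X_i \sim {\rm exponential}(1)$ (so $\mu = \sigma = 1$) and the values $n = 100$, $m = 10000$ are assumptions made only for illustration. The estimated correlation of $Z_n$ and $Z_m$ should come out close to $\sqrt{n/m} = 0.1$, far from the value near $1$ that would be needed for $Z_n - Z_m$ to be small with high probability.

```python
# Hypothetical sketch for Example 2 with the assumed choice X_i ~ exponential(1),
# so mu = sigma = 1; it estimates Corr(Z_n, Z_m) and compares it with sqrt(n/m).
import numpy as np

rng = np.random.default_rng(2)
mu, sigma = 1.0, 1.0
n, m, reps = 100, 10000, 20000
X = rng.exponential(scale=1.0, size=(reps, m))
Z_n = (X[:, :n].sum(axis=1) - n * mu) / (sigma * np.sqrt(n))
Z_m = (X.sum(axis=1) - m * mu) / (sigma * np.sqrt(m))
print(np.corrcoef(Z_n, Z_m)[0, 1], np.sqrt(n / m))   # both should be close to 0.1
```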

Finally, especially in view of the first example, it is worth noting that convergence in probability, though quite strong relative to convergence in distribution, does not imply almost sure convergence. A short but sophisticated example is given in my answer to this question, at the end of the second paragraph.

EDIT: As I commented below, I intentionally gave the non-trivial example of the central limit theorem. Here are two trivial examples.

First (elaborating on Didier's example), if $X_1,X_2,\ldots$ are i.i.d. from a distribution $F$, then, trivially, $X_n \stackrel{{\rm D}}{\to} F$ (since $X_i \sim F$ for each $i$). But, unless the $X_i$ are deterministic, the sequence never converges in probability. Indeed, suppose that $X_n \stackrel{{\rm P}}{\to} X$, and let $\varepsilon >0$ be arbitrary but fixed. By the triangle inequality, the event $\lbrace |X_{n+1} - X_n| > 2 \varepsilon \rbrace$ is contained in the event $\lbrace |X_{n+1} - X| > \varepsilon \rbrace \cup \lbrace |X_{n} - X| > \varepsilon \rbrace$. Hence,
$$ {\rm P}(|X_{n+1}-X_n| > 2 \varepsilon) \leq {\rm P}(|X_{n + 1} - X| > \varepsilon ) + {\rm P}(|X_n - X| > \varepsilon ). $$
Since, by our assumption, the right-hand side tends to $0$ as $n \to \infty$, and since $|X_{n+1}-X_n|$ is equal in distribution to $Y:=|X_1 - X_2|$, we get ${\rm P}(Y > 2 \varepsilon) = 0$. As $\varepsilon$ was arbitrary and $Y$ is nonnegative, this implies that $Y = 0$ almost surely (exercise), that is, $X_1 = X_2$ almost surely. Hence, the $X_i$ are deterministic (since they are also independent).
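
As a numerical illustration of this point (a hypothetical sketch; the choices $F = {\rm N}(0,1)$ and $\varepsilon = 0.1$ are assumptions for illustration): since the $X_i$ are i.i.d., ${\rm P}(|X_{n+1} - X_n| > 2\varepsilon)$ does not depend on $n$ at all, so estimating ${\rm P}(|X_1 - X_2| > 2\varepsilon)$ shows that the quantity which would have to vanish is in fact a fixed positive number.

```python
# Hypothetical sketch: for i.i.d. non-degenerate X_i (here F = N(0,1), assumed),
# P(|X_{n+1} - X_n| > 2*eps) equals the fixed positive number P(|X_1 - X_2| > 2*eps).
import numpy as np

rng = np.random.default_rng(3)
eps, reps = 0.1, 100000
X1 = rng.standard_normal(reps)
X2 = rng.standard_normal(reps)
print(np.mean(np.abs(X1 - X2) > 2 * eps))   # stays near P(|N(0,2)| > 0.2) ~ 0.89, not near 0
```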

As another example, suppose that ${\rm P}(X_1 = 1) = {\rm P}(X_1 = -1) = 1/2$, and let $X_{n+1}=-X_n$. Then, trivially, $X_n \stackrel{{\rm D}}{\to} X_1$ (every $X_n$ has the same distribution as $X_1$), but the sequence is either $(X_n) = (1,-1,1,-1,\ldots)$ or $(X_n) = (-1,1,-1,1,\ldots)$, so it cannot converge in probability, since $|X_{n+1}-X_n| = 2$ for every $n$.
