In response to the OP's request, I give two detailed examples to clarify the difference between convergence in probability (denoted by $\stackrel{{\rm P}}{\to}$) and convergence in distribution (denoted by $\stackrel{{\rm D}}{\to}$).
Example 1. Suppose that $X_i$, $i=1,2,\ldots$, are non-constant random variables taking values in $[0,M]$, and let $(a_n)$ be a sequence of positive numbers such that $\sum\nolimits_{i = 1}^\infty {a_i } = c < \infty$. Define $S_n = \sum\nolimits_{i = 1}^n {a_i X_i }$. Then, since the sequence $(S_n)$ is monotone increasing, $S_n$ converges pointwise (that is, for all $\omega \in \Omega$) to a random variable $S$ taking values in $[0,Mc]$. Here we have the strongest type of convergence (sure convergence), which implies all the other kinds of convergence. In particular, as one would expect, $S_n \stackrel{{\rm P}}{\to} S$. Indeed, this can be shown directly as follows. Fix $\varepsilon > 0$. Then, for all sufficiently large $n$,
$$
{\rm P}(|S_n - S| > \varepsilon) = {\rm P}\bigg(\sum\limits_{i = n + 1}^\infty {a_i X_i } > \varepsilon \bigg) \leq {\rm P}\bigg(M \sum\limits_{i = n + 1}^\infty {a_i } > \varepsilon \bigg) = 0.
$$
Now, since convergence in probability implies convergence in distribution, $S_n \stackrel{{\rm D}}{\to} S$ as well. However, the limit random variable $S$ plays no special role with regard to the convergence in distribution: take, for example, an independent copy $S'$ of $S$. Then, trivially, $S_n \stackrel{{\rm D}}{\to} S'$ as well (simply because $S$ and $S'$ have the same distribution function). On the other hand, the limit $S$ plays an essential role with regard to the convergence in probability. In fact, it is easy to prove the following general statement: the limit in probability is unique, in the sense that if $Z_n \stackrel{{\rm P}}{\to} X$ and $Z_n \stackrel{{\rm P}}{\to} Y$, then $X = Y$ almost surely, that is, ${\rm P}(X \neq Y) = 0$.

Finally, it is worth noting that if $Z_n \stackrel{{\rm D}}{\to} Z$, where $Z$ is distributed according to a distribution $F$, then we can also write $Z_n \stackrel{{\rm D}}{\to} F$. For example, if $Z_n \stackrel{{\rm D}}{\to} Z$ with $Z \sim {\rm exponential}(\lambda)$, then we can write $Z_n \stackrel{{\rm D}}{\to} {\rm exponential}(\lambda)$.
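For readers who like to see this numerically, here is a short simulation sketch of Example 1 in Python. The particular choices $a_i = 2^{-i}$, $X_i \sim {\rm Uniform}[0,1]$ (so $M = 1$), the truncation level and the seed are mine, made only for illustration. It checks that ${\rm P}(|S_n - S| > \varepsilon)$ is eventually exactly $0$, in line with the deterministic bound above.

```python
import numpy as np

rng = np.random.default_rng(0)

n_paths = 10_000   # number of simulated sample points omega
N = 60             # truncation level used as a stand-in for the infinite sum S
a = 0.5 ** np.arange(1, N + 1)                # illustrative choice a_i = 2^{-i}, so c < 1
X = rng.uniform(0.0, 1.0, size=(n_paths, N))  # illustrative choice X_i ~ Uniform[0, 1], M = 1

partial = np.cumsum(a * X, axis=1)   # S_1, ..., S_N along each row
S = partial[:, -1]                   # proxy for the limit S

eps = 1e-3
for n in (5, 10, 20, 30):
    p = np.mean(np.abs(partial[:, n - 1] - S) > eps)
    # deterministically, |S_n - S| <= M * sum_{i > n} a_i = 2^{-n},
    # so this probability is exactly 0 once 2^{-n} <= eps
    print(f"n = {n:2d}:  P(|S_n - S| > {eps}) ≈ {p:.4f}")
```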
To further clarify the difference between convergence in probability and convergence in distribution, let's consider the fundamental case of the central limit theorem.
Example 2. Suppose that $X_1,X_2,\ldots$ is a sequence of i.i.d. random variables with expectation $\mu$ and (finite) variance $\sigma^2 > 0$. Define $S_n = X_1 + \cdots + X_n$ and
$$
Z_n = \frac{S_n - n\mu}{\sigma \sqrt n}.
$$
The central limit theorem states that $Z_n$ converges in distribution to the standard normal distribution ${\rm N}(0,1)$, that is, $Z_n \stackrel{{\rm D}}{\to} {\rm N}(0,1)$. So, given any random variable $Z \sim {\rm N}(0,1)$ (which, in particular, may be defined on a different probability space), we can write $Z_n \stackrel{{\rm D}}{\to} Z$. On the other hand, there is no random variable $Z$ such that $Z_n \stackrel{{\rm P}}{\to} Z$. Indeed, suppose for a contradiction that $Z_n \stackrel{{\rm P}}{\to} Z$. It is an easy exercise (using the triangle inequality) to show that
$$
{\rm P}(|Z_n - Z_m | > 2 \varepsilon ) \le {\rm P}(|Z_n - Z| > \varepsilon ) + {\rm P}(|Z_m - Z| > \varepsilon ),
$$
so the assumption would force ${\rm P}(|Z_n - Z_m| > 2\varepsilon) \to 0$ as $n,m \to \infty$. To reach a contradiction, it therefore suffices to realize that $Z_n$ and $Z_m$ become asymptotically independent as $n,m \to \infty$ with $n/m \to 0$ (two asymptotically independent, asymptotically ${\rm N}(0,1)$ variables cannot be arbitrarily close to each other with probability tending to $1$); indeed,
$$
Z_m = \sqrt {\frac{n}{m}}\, Z_n + \sqrt {\frac{m - n}{m}}\, \frac{\sum\nolimits_{i = n + 1}^m {X_i } - (m - n)\mu}{\sigma \sqrt {m - n}},
$$
from which it is also seen that
$$
{\rm Cov}(Z_n,Z_m) = \sqrt {\frac{n}{m}}.
$$
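Here is a corresponding simulation sketch (again my own illustration; the choices $X_i \sim {\rm exponential}(1)$, so that $\mu = \sigma = 1$, and the specific values of $n$, $m$, $\varepsilon$ are arbitrary). It shows that $Z_m$ looks standard normal, that ${\rm P}(|Z_n - Z_m| > \varepsilon)$ stays bounded away from $0$ when $n \ll m$, and that the empirical correlation matches $\sqrt{n/m}$ (which equals the covariance here, since both variables have variance $1$).

```python
import numpy as np

rng = np.random.default_rng(1)

n, m = 100, 10_000
reps = 1_000
X = rng.exponential(1.0, size=(reps, m))   # illustrative choice: i.i.d. with mu = sigma = 1

S_n = X[:, :n].sum(axis=1)
S_m = X.sum(axis=1)
Z_n = (S_n - n) / np.sqrt(n)               # (S_n - n*mu) / (sigma*sqrt(n))
Z_m = (S_m - m) / np.sqrt(m)

# Convergence in distribution: Z_m is approximately standard normal.
print(f"mean(Z_m) = {Z_m.mean():.3f}, sd(Z_m) = {Z_m.std():.3f}")

# No convergence in probability: Z_n and Z_m stay far apart.
eps = 0.5
print(f"P(|Z_n - Z_m| > {eps}) ≈ {np.mean(np.abs(Z_n - Z_m) > eps):.3f}")

# Cov(Z_n, Z_m) = sqrt(n/m), matching the decomposition above.
print(f"empirical Corr(Z_n, Z_m) = {np.corrcoef(Z_n, Z_m)[0, 1]:.3f}, "
      f"sqrt(n/m) = {np.sqrt(n / m):.3f}")
```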
Finally, especially in view of the first example, it is worth noting that convergence in probability, though quite strong relative to convergence in distribution, does not imply almost sure convergence. A short but sophisticated example is given in my answer to this question, at the end of the second paragraph.
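For concreteness, here is a simulation sketch of the classical "sliding blocks" construction (not necessarily the example referred to above, which I leave at the link): on $\Omega = [0,1]$ with the uniform measure, the $n$-th variable is the indicator of an interval whose length shrinks to $0$ while the interval keeps sweeping across $[0,1]$. Then $X_n \to 0$ in probability, yet $X_n(\omega) = 1$ infinitely often for every $\omega$, so there is no almost sure convergence. The number of blocks and the sample size below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

# Block k consists of k consecutive intervals of length 1/k sweeping across [0, 1).
intervals = [(j / k, (j + 1) / k) for k in range(1, 60) for j in range(k)]

omega = rng.uniform(0.0, 1.0, size=20_000)                        # Omega = [0, 1], uniform measure
X = np.array([(a <= omega) & (omega < b) for a, b in intervals])  # X[n] = X_n(omega)

# P(X_n = 1) is the interval length, which tends to 0 (convergence in probability to 0).
print("P(X_n = 1) for n = 0, 10, 100, 1000:",
      [float(X[n].mean()) for n in (0, 10, 100, 1000)])

# But every omega is hit by some interval of the last block (and of every block),
# so X_n(omega) = 1 infinitely often along the full sequence: no a.s. convergence.
print("fraction of omegas hit within the last block:",
      float(X[-59:].any(axis=0).mean()))
```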
EDIT: As I commented below, I intentionally gave the non-trivial example of the central limit theorem. Here are two trivial examples.
First (elaborating on Didier's example), if $X_1,X_2,\ldots$ are i.i.d. from a distribution $F$, then, trivially, $X_n \stackrel{{\rm D}}{\to} F$ (since $X_i \sim F$ for each $i$). But, unless the $X_i$ are deterministic, the sequence never converges in probability. Indeed, suppose that $X_n \stackrel{{\rm P}}{\to} X$, and let $\varepsilon > 0$ be arbitrary but fixed. By the triangle inequality, the event $\lbrace |X_{n+1} - X_n| > 2 \varepsilon \rbrace$ is contained in the event $\lbrace |X_{n+1} - X| > \varepsilon \rbrace \cup \lbrace |X_{n} - X| > \varepsilon \rbrace$. Hence,
$$
{\rm P}(|X_{n+1}-X_n| > 2 \varepsilon) \leq {\rm P}(|X_{n + 1} - X| > \varepsilon ) + {\rm P}(|X_n - X| > \varepsilon ).
$$
Since, by our assumption, the right-hand side tends to $0$ as $n \to \infty$, and since $|X_{n+1}-X_n|$ is equal in distribution to $Y:=|X_1 - X_2|$, we get ${\rm P}(Y > 2 \varepsilon) = 0$. As $\varepsilon > 0$ was arbitrary and $Y$ is nonnegative, this implies that $Y = 0$ almost surely (exercise), that is, $X_1 = X_2$ almost surely. Hence the $X_i$ are deterministic (since $X_1$ and $X_2$ are independent).
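A quick numerical check of this, with the arbitrary choice $F = {\rm Uniform}[0,1]$: each $X_n$ has exactly the distribution $F$, yet ${\rm P}(|X_{n+1} - X_n| > \varepsilon)$ does not shrink as $n$ grows (for ${\rm Uniform}[0,1]$ it equals $(1-\varepsilon)^2$ for every $n$).

```python
import numpy as np

rng = np.random.default_rng(3)

reps, n_max = 20_000, 200
X = rng.uniform(0.0, 1.0, size=(reps, n_max))  # i.i.d. draws from F = Uniform[0, 1] (arbitrary choice)

eps = 0.25
for n in (1, 10, 100, 199):
    p = np.mean(np.abs(X[:, n] - X[:, n - 1]) > eps)
    # stays near (1 - eps)^2 = 0.5625 for every n, so no convergence in probability
    print(f"P(|X_{n + 1} - X_{n}| > {eps}) ≈ {p:.3f}")
```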
As another example, suppose that ${\rm P}(X_1 = 1) = {\rm P}(X_1 = -1) = 1/2$, and let $X_{n+1}=-X_n$ for all $n$. Then each $X_n$ has the same distribution as $X_1$, so, trivially, $X_n \stackrel{{\rm D}}{\to} X_1$; but $(X_n)$ is either $(1,-1,1,-1,\ldots)$ or $(-1,1,-1,1,\ldots)$, so $|X_{n+1} - X_n| = 2$ for every $n$ and the sequence does not converge in probability.
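And a one-screen check of this last example (the sample size, the number of terms and the seed are arbitrary): every $X_n$ has the same two-point distribution, while $|X_{n+1} - X_n| = 2$ for every $n$.

```python
import numpy as np

rng = np.random.default_rng(4)

X1 = rng.choice([-1, 1], size=10_000)    # X_1 = +1 or -1, each with probability 1/2
signs = (-1) ** np.arange(50)            # +1, -1, +1, ... encodes X_{n+1} = -X_n
X = X1[:, None] * signs[None, :]         # X[:, n] holds X_{n+1} for each simulated omega

print("P(X_n = 1) for n = 1, 2, 11:",
      [float((X[:, n] == 1).mean()) for n in (0, 1, 10)])   # always about 1/2
print("|X_{n+1} - X_n| takes only the value:",
      np.unique(np.abs(np.diff(X, axis=1))))                # always 2
```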