Let $X$ be a random variable with $\mathbb{E}|X|<\infty$ and $\mathbb{F} = (\mathcal{F}_n)_{n \in \mathbb{N}_0}$ a filtration. Put $M_n = \mathbb{E}[X|\mathcal{F}_n], n \in \mathbb{N}_0$. By the tower property of conditional expectation we obtain \begin{align} \mathbb{E}[M_{n+1}|\mathcal{F}_n] = \mathbb{E}[\mathbb{E}[X|\mathcal{F}_{n+1}]|\mathcal{F}_n]=\mathbb{E}[X|\mathcal{F}_n]=M_n. \end{align} Since each $M_n$ is also $\mathcal{F}_n$-measurable and integrable, the process $M$ is a martingale.
Assume further that $X \in \mathcal{L}^2(\Omega,\mathcal{F},\mathbb{P})$. We can interpret $M_n$ as the best prediction of $X$, in the mean-square sense, given the information $\mathcal{F}_n$. Showing that the errors $v_n:=\mathbb{E}(M_n - X)^2$ are non-increasing in $n$ would support the intuition that more information allows a better prediction.
I run into difficulties when trying to show this. \begin{align} v_n &:= \mathbb{E}(M_n-X)^2\\ &= \mathbb{E}M_n^2 - 2 \mathbb{E}[M_nX] + \mathbb{E}X^2 \qquad (*) \end{align} How can I show that $(*) \geq v_{n+1}$?
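One possible route, offered as a sketch rather than a definitive argument: since $X \in \mathcal{L}^2$, the conditional Jensen inequality gives $M_n \in \mathcal{L}^2$, so $M_n X$ is integrable by Cauchy-Schwarz and the $\mathcal{F}_n$-measurable factor $M_n$ can be pulled out of the conditional expectation: \begin{align} \mathbb{E}[M_n X] = \mathbb{E}\big[\mathbb{E}[M_n X|\mathcal{F}_n]\big] = \mathbb{E}\big[M_n \mathbb{E}[X|\mathcal{F}_n]\big] = \mathbb{E}M_n^2. \end{align} Substituting this into $(*)$ yields $v_n = \mathbb{E}X^2 - \mathbb{E}M_n^2$. The same argument with $M_{n+1}$ in place of $X$, using $\mathbb{E}[M_{n+1}|\mathcal{F}_n]=M_n$, gives $\mathbb{E}[M_n M_{n+1}] = \mathbb{E}M_n^2$, and hence \begin{align} v_n - v_{n+1} = \mathbb{E}M_{n+1}^2 - \mathbb{E}M_n^2 = \mathbb{E}(M_{n+1}-M_n)^2 \geq 0. \end{align}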