The rough outline of the proof is:
(1) If $A$ and $T$ are square matrices, and $T$ is invertible, then the minimal polynomial for $A$ is the same as the minimal polynomial for $TAT^{-1}$.
(2) If $B$ is a block-diagonal square matrix, with square blocks $B_1,\dots,B_k$ along the diagonal and zeros elsewhere, then the minimal polynomial for $B$ is the least common multiple of the minimal polynomials for the $B_i$.
(3) If $J_m(\lambda)$ is an $m\times m$ Jordan block ($\lambda$ on the diagonal and $1$ just above the diagonal), then the minimal polynomial for $J_m(\lambda)$ is $p(x)=(x-\lambda)^m$.
So, for any square matrix $A$ (over an algebraically closed field such as $\mathbb{C}$, so that a Jordan form exists), you can find an invertible $T$ so that $B=TAT^{-1}$ is in Jordan Normal Form. By (1), $A$ and $B$ have the same minimal polynomial; by (2), the minimal polynomial for $B$ is the least common multiple of the minimal polynomials of its Jordan blocks; and (3) tells us what those minimal polynomials are, which yields the result.
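As a concrete illustration (the particular blocks here are made up for the example): if the Jordan form of $A$ consists of the blocks $J_3(2)$, $J_1(2)$ and $J_2(5)$, their minimal polynomials are $(x-2)^3$, $(x-2)$ and $(x-5)^2$, so the minimal polynomial of $A$ is
$$\operatorname{lcm}\big((x-2)^3,\;(x-2),\;(x-5)^2\big)=(x-2)^3(x-5)^2.$$
In other words, each eigenvalue $\lambda$ contributes a factor $(x-\lambda)^m$, where $m$ is the size of the largest Jordan block for $\lambda$.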
(1) is easy to prove.
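A sketch: since $(TAT^{-1})^j = TA^jT^{-1}$ for every $j\ge 0$, any polynomial $p$ satisfies
$$p(TAT^{-1}) = T\,p(A)\,T^{-1},$$
which is zero exactly when $p(A)$ is zero. So $A$ and $TAT^{-1}$ are annihilated by exactly the same polynomials, and in particular they have the same minimal polynomial.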
(2) is easy if you realize that if $B$ is of this form, and $p$ is any polynomial, then $p(B)$ is composed of $p(B_1),\dots,p(B_k)$ along the diagonal and zeros elsewhere.
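Spelling that out: since the blocks multiply independently,
$$p(B)=\begin{pmatrix} p(B_1) & & \\ & \ddots & \\ & & p(B_k) \end{pmatrix},$$
so $p(B)=0$ if and only if $p(B_i)=0$ for every $i$, i.e. if and only if the minimal polynomial of each $B_i$ divides $p$. The monic polynomial of least degree divisible by all of them is their least common multiple.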
(3) takes a little bit of arithmetic. First prove it for $\lambda=0$, then shift by $\lambda$.
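Here is how that arithmetic goes. For $\lambda=0$, the block $J_m(0)$ acts on the standard basis by $J_m(0)e_1=0$ and $J_m(0)e_j=e_{j-1}$ for $j\ge 2$, so
$$J_m(0)^{m-1}e_m=e_1\neq 0 \qquad\text{but}\qquad J_m(0)^m=0,$$
and the minimal polynomial of $J_m(0)$ is $x^m$. For general $\lambda$, write $J_m(\lambda)=\lambda I+J_m(0)$, so $(J_m(\lambda)-\lambda I)^k=J_m(0)^k$. Thus $(x-\lambda)^m$ annihilates $J_m(\lambda)$, so its minimal polynomial divides $(x-\lambda)^m$ and must be $(x-\lambda)^k$ for some $k\le m$; since $(J_m(\lambda)-\lambda I)^{m-1}\neq 0$, we get $k=m$.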