Let $x\in H$. We have $P_nx=\sum_{j=1}^n\langle x,e_j\rangle e_j$, so by Parseval's identity $\lVert P_nx-x\rVert^2=\sum_{j\geq n+1}|\langle x,e_j\rangle|^2.$ Since the series $\sum_{j\geq 1}|\langle x,e_j\rangle|^2$ converges (its sum is $\lVert x\rVert^2$), its tail tends to $0$ as $n\to+\infty$, which gives $P_nx\to x$.
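As a small numerical illustration of the tail identity above, here is a sketch in Python. The coefficients $c_j=1/j$ and the truncation length `m` are assumptions chosen for the experiment; they do not come from the argument itself.

```python
def tail_energy(coeffs, n):
    """Return sum_{j > n} |c_j|^2, i.e. ||P_n x - x||^2 for x = sum_j c_j e_j."""
    return sum(abs(c) ** 2 for c in coeffs[n:])

# Hypothetical coefficients c_j = 1/j, truncated to m terms for the experiment.
m = 10_000
coeffs = [1.0 / j for j in range(1, m + 1)]

# Squared errors ||P_n x - x||^2 for a few values of n; they decrease toward 0.
tails = [tail_energy(coeffs, n) for n in (1, 10, 100, 1000)]
print(tails)
```

The printed values shrink as $n$ grows, which is the tail of a convergent series going to $0$.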
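Now let $K\subset H$ be compact and fix $\varepsilon >0$. By compactness, we can find an integer $N$ and $x_1,\dots,x_N\in K$ such that for each $x\in K$ there is $1\leq k \leq N$ with $\lVert x-x_k\rVert\leq\varepsilon$. Fix $x\in K$ and choose such a $k$. Using $|a+b|^2\leq 2|a|^2+2|b|^2$ and Bessel's inequality, $\lVert P_nx-x\rVert^2=\sum_{j\geq n+1}|\langle (x-x_k)+x_k,e_j\rangle|^2\leq 2\lVert x-x_k\rVert^2+2\sum_{j\geq n+1}|\langle x_k,e_j\rangle|^2\leq 2\varepsilon^2+2\max_{1\leq i\leq N}\sum_{j\geq n+1}|\langle x_i,e_j\rangle|^2.$ As the right-hand side does not depend on $x$, we have $\sup_{x\in K}\lVert P_nx-x\rVert^2\leq 2\varepsilon^2+2\max_{1\leq i\leq N}\sum_{j\geq n+1}|\langle x_i,e_j\rangle|^2.$ By the first part, each of the finitely many tails on the right tends to $0$, so taking $\limsup_{n\to+\infty}$ gives $\limsup_{n\to+\infty}\sup_{x\in K}\lVert P_nx-x\rVert^2\leq 2\varepsilon^2$. Since $\varepsilon$ is arbitrary, $\sup_{x\in K}\lVert P_nx-x\rVert\to 0$.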
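- If $T$ is compact, then $K:=\overline{T(B(0,1))}$ is compact, so we can apply the previous result to this $K$: $\lVert P_nT-T\rVert=\sup_{x\in B(0,1)}\lVert P_nTx-Tx\rVert\leq\sup_{y\in K}\lVert P_ny-y\rVert\to 0$, and each $P_nT$ has finite rank.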
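To make the convergence $\lVert P_nT-T\rVert\to 0$ concrete, here is a minimal NumPy sketch. It assumes a finite-dimensional stand-in: the diagonal operator $Te_j=e_j/j$ truncated to `m` coordinates, with $P_n$ the projection onto the first $n$ coordinates. Both the operator and the sizes are illustrative choices, not part of the proof.

```python
import numpy as np

def truncation_error(diag_entries, n):
    """Operator norm of P_n T - T, where T = diag(diag_entries) in the standard basis."""
    m = len(diag_entries)
    T = np.diag(diag_entries)
    # P_n keeps the first n coordinates and kills the rest.
    P_n = np.diag((np.arange(m) < n).astype(float))
    return np.linalg.norm(P_n @ T - T, ord=2)  # largest singular value

m = 200                                  # truncation size (assumption)
entries = 1.0 / np.arange(1, m + 1)      # T e_j = e_j / j (assumption)
ns = (1, 5, 25, 125)
errors = [truncation_error(entries, n) for n in ns]
print(errors)
```

For this particular diagonal operator the error is the largest remaining diagonal entry, $1/(n+1)$, so it decays to $0$ as $n$ grows.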
Note that the approximation of a compact operator by finite-rank operators holds in any Hilbert space, not only in separable ones. To see this, fix $\varepsilon>0$. Since $T(B(0,1))$ is totally bounded, we can take $y_1,\dots,y_N\in H$ such that $T(B(0,1))\subset \bigcup_{j=1}^NB(y_j,\varepsilon)$. Let $P$ be the orthogonal projection onto the subspace spanned by $\{y_1,\dots,y_N\}$ (it is closed, being finite-dimensional). The operator $PT$ has finite rank. Now take $x\in B(0,1)$ and pick $j$ such that $\lVert Tx-y_j\rVert\leq\varepsilon$. As $\lVert P\rVert\leq 1$, we also have $\lVert PTx-Py_j\rVert\leq \varepsilon$. Since $Py_j=y_j$, the triangle inequality gives $\lVert PTx-Tx\rVert\leq \lVert PTx-y_j\rVert+\lVert y_j-Tx\rVert\leq 2\varepsilon$. Taking the supremum over $x\in B(0,1)$ yields $\lVert PT-T\rVert\leq 2\varepsilon$.
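The pointwise estimate used in this argument can be checked numerically. The sketch below does not construct an actual $\varepsilon$-net: it takes a few hypothetical points $y_1,\dots,y_N$, builds the orthogonal projection $P$ onto their span, and verifies $\lVert PTx-Tx\rVert\leq 2\min_j\lVert Tx-y_j\rVert$ for sampled $x$ in the unit ball. The operator `T`, the dimensions, and the random points are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, N = 50, 8                                  # ambient dimension, number of points (illustrative)

T = np.diag(1.0 / np.arange(1, m + 1))        # stand-in operator (assumption)
Y = T @ rng.standard_normal((m, N))           # columns y_1, ..., y_N (hypothetical points)

Q, _ = np.linalg.qr(Y)                        # orthonormal basis of span{y_1, ..., y_N}
P = Q @ Q.T                                   # orthogonal projection onto that span

def check(x):
    """Return (||PTx - Tx||, 2 * min_j ||Tx - y_j||) for one vector x."""
    Tx = T @ x
    lhs = np.linalg.norm(P @ Tx - Tx)
    nearest = np.min(np.linalg.norm(Y - Tx[:, None], axis=0))
    return lhs, 2 * nearest

# Sample vectors and rescale any with norm > 1 back into the closed unit ball.
samples = rng.standard_normal((100, m))
samples /= np.maximum(1.0, np.linalg.norm(samples, axis=1, keepdims=True))
results = [check(x) for x in samples]
print(all(lhs <= rhs + 1e-12 for lhs, rhs in results))
```

The inequality holds for every sample because it only uses $\lVert P\rVert\leq 1$ and $Py_j=y_j$, exactly as in the proof.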