Assume $f$ is a holomorphic function that maps the disk to itself, not necessarily an automorphism. Let $a$ be in the unit disk, and let $b=f(a)$. Let $\phi_a(z)=\frac{a-z}{1-\overline az}$ be the holomorphic automorphism of the disk that swaps $a$ and $0$, and similarly let $\phi_b(z)=\frac{b-z}{1-\overline b z}$. If $g = \phi_b\circ f\circ \phi_a$, then $g$ is a holomorphic function that maps the disk to itself, and $g(0)=0$, so by Schwarz (or Cauchy's estimate) $|g'(0)|\leq 1$.

By the chain rule, $g'(0)=\phi_b'(b)\,f'(a)\,\phi_a'(0)$. Since $\phi_a'(0)=|a|^2-1$ and $\phi_b'(b)=\frac{1}{|b|^2-1}$, this yields $\frac{1}{1-|b|^2}|f'(a)|(1-|a|^2)\leq 1$, or $\frac{|f'(a)|}{1-|b|^2}\leq \frac{1}{1-|a|^2}$. Since $a$ was arbitrary and $b=f(a)$, this means that $\frac{|f'(z)|}{1-|f(z)|^2}\leq \frac{1}{1-|z|^2}$ for all $z$ in the disk.

If equality holds for some $a$, then $|g'(0)|=1$, so by Schwarz $g(z)=cz$ for some $c$ with $|c|=1$. Since $\phi_a$ and $\phi_b$ are their own inverses, $f=\phi_b\circ g\circ \phi_a$ is a composition of three Möbius transformations, each an automorphism of the unit disk, hence is one itself.
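As a rough numerical sanity check of the inequality and of the equality case, here is a minimal Python sketch. The helper names `pick_ratio` and `random_disk_point`, the two sample maps ($z\mapsto z^2$ and $\phi_c$ with $c=0.3+0.4i$), and the sampling radius are all choices made for illustration, not part of the argument above.

```python
import cmath
import random


def pick_ratio(f, df, z):
    """Return |f'(z)| (1 - |z|^2) / (1 - |f(z)|^2).

    The inequality derived above says this quantity is at most 1
    for any holomorphic self-map f of the unit disk.
    """
    return abs(df(z)) * (1 - abs(z) ** 2) / (1 - abs(f(z)) ** 2)


def random_disk_point(rng, max_radius=0.95):
    """Sample a point in the disk |z| <= max_radius (hypothetical helper)."""
    r = max_radius * rng.random() ** 0.5
    theta = 2 * cmath.pi * rng.random()
    return cmath.rect(r, theta)


# Sample map 1: f(z) = z^2, a self-map of the disk that is not an automorphism.
def square(z):
    return z * z


def d_square(z):
    return 2 * z


# Sample map 2: the automorphism phi_c(z) = (c - z) / (1 - conj(c) z).
c = 0.3 + 0.4j


def phi(z):
    return (c - z) / (1 - c.conjugate() * z)


def d_phi(z):
    # Derivative of phi_c: (|c|^2 - 1) / (1 - conj(c) z)^2.
    return (abs(c) ** 2 - 1) / (1 - c.conjugate() * z) ** 2


rng = random.Random(0)
points = [random_disk_point(rng) for _ in range(1000)]

# Largest ratio seen for z^2: expected to stay strictly below 1.
max_square = max(pick_ratio(square, d_square, z) for z in points)

# Largest deviation from 1 for phi_c: expected to be at rounding-error level.
max_dev_phi = max(abs(pick_ratio(phi, d_phi, z) - 1) for z in points)

print(max_square, max_dev_phi)
```

For $z^2$ the ratio simplifies to $\frac{2|z|}{1+|z|^2}$, which is strictly less than $1$ inside the disk, while for the automorphism $\phi_c$ the ratio is identically $1$, matching the equality case.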