So I know that it can be proved via modular forms by expressing the discriminant function in terms of the Eisenstein series $E_4^3$ and $E_{12}$ and reducing the resulting equation mod 691 (twice).
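For concreteness, here is a small numerical sanity check of that argument. It is only a sketch, and it assumes the statement in question is the congruence $\tau(n) \equiv \sigma_{11}(n) \pmod{691}$, with $E_4 = 1 + 240\sum \sigma_3(n) q^n$ and $E_{12} = 1 + \frac{65520}{691}\sum \sigma_{11}(n) q^n$, so that $691(E_4^3 - E_{12}) = 432000\,\Delta$. The function names `tau_coefficients` and `sigma11` are my own, chosen for illustration.

```python
def tau_coefficients(limit):
    """Return c[0..limit-1] with prod_{n>=1} (1 - q^n)^24 = sum c[k] q^k.

    Since Delta = q * prod (1 - q^n)^24, we have tau(n) = c[n - 1].
    """
    coeffs = [0] * limit
    coeffs[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # Multiply in place by (1 - q^n); going downward in k keeps
            # coeffs[k - n] at its old value when it is read.
            for k in range(limit - 1, n - 1, -1):
                coeffs[k] -= coeffs[k - n]
    return coeffs


def sigma11(n):
    """Sum of the 11th powers of the positive divisors of n."""
    return sum(d ** 11 for d in range(1, n + 1) if n % d == 0)


def congruence_holds(max_n):
    """Check tau(n) == sigma_11(n) (mod 691) for 1 <= n <= max_n."""
    coeffs = tau_coefficients(max_n)
    return all((coeffs[n - 1] - sigma11(n)) % 691 == 0
               for n in range(1, max_n + 1))


# The two reductions mod 691: 65520 + 432000 = 691 * 720, so
# 432000 * tau(n) == -65520 * sigma_11(n) becomes 125 * tau(n) == 125 * sigma_11(n).
assert (65520 + 432000) % 691 == 0
assert 432000 % 691 == 125
assert congruence_holds(30)
```

This only checks the first 30 coefficients, so it illustrates the congruence rather than proving it.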
But this wasn't the original proof, right? Did the original proof use elementary number theory? What year was it proved?
Modular forms simplify the proof somewhat, right?
Many thanks