I found a derivation of the normal equations that does not use calculus on Wikipedia (Linear Least Squares). The method rewrites the residual sum of squares:
$$S(\beta)=y^{T}y-2\beta^{T}X^{T}y+\beta^{T}X^{T}X\beta$$
into:
$$S(\beta)=\left \langle \beta, \beta \right \rangle-2\left \langle \beta, (X^{T}X)^{-1}X^{T}y \right \rangle+\left \langle (X^{T}X)^{-1}X^{T}y, (X^{T}X)^{-1}X^{T}y \right \rangle+C,$$
where $C$ depends only on $y$ and $X$, and $\langle \cdot ,\cdot \rangle$ is the inner product defined by $$ \langle x,y\rangle =x^{\rm {T}}(\mathbf {X} ^{\rm {T}}\mathbf {X} )y. $$
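As a sanity check on the notation (not the Wikipedia derivation itself), here is a sketch of expanding each inner product with this definition. It assumes $X^{T}X$ is invertible, and $A$ and $\hat\beta$ are shorthand introduced here, not symbols from the article:
$$\begin{aligned}
A &= X^{T}X, \qquad \hat\beta = (X^{T}X)^{-1}X^{T}y,\\
\langle \beta,\beta\rangle &= \beta^{T}X^{T}X\beta,\\
\langle \beta,\hat\beta\rangle &= \beta^{T}AA^{-1}X^{T}y = \beta^{T}X^{T}y,\\
\langle \hat\beta,\hat\beta\rangle &= y^{T}XA^{-1}AA^{-1}X^{T}y = y^{T}X(X^{T}X)^{-1}X^{T}y.
\end{aligned}$$
Under these assumptions the two expressions for $S(\beta)$ agree term by term when $C = y^{T}y - y^{T}X(X^{T}X)^{-1}X^{T}y$, which does not involve $\beta$.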
I understand that the idea is to rewrite $S(\beta)$ in the form $S(\beta)=(x-a)^2+b$, so that the minimizing $x$ can be read off directly. But I do not understand how this rewriting of $S(\beta)$ is carried out, or what principles are used to do it.