An arbitrary ellipse can be parameterised as $x=h+a\cos t,\, y=k+b\cos t$ $(0\leq t\leq 2\pi)$, which is an ellipse centred at $(h,k)$, with axis lengths $a$ and $b$.
Minimising the square of the distance is an equivalent problem to minimising the distance.
Approach 1:
The point $(x_2,y_2)$ lies on the above parametised ellipse.
Then, $f=(x_1-h-a\cos t)^2 +(x_2-k-b\sin t)$ is the distance to any point on the ellipse.
As $d$ is a simple function of $t$ only, the value of $t$ which is minimised when $\dfrac{d f}{d t}=0$.
Approach 2:
The point $p_1=(x_1,y_1)$ can be written as $p_1=p_2+\lambda n$ where $p_2=(x_2,y_2)$ is the point on the ellipse and $n$ is the vector normal to the ellipse. Using the above parameterisation for the ellipse, you obtain the system of nonlinear equations $\begin{bmatrix}x_1\\y_1\end{bmatrix}=\begin{bmatrix}h+a\cos t\\k+b\sin t\end{bmatrix}+\lambda \begin{bmatrix}b\cos t\\a\sin t\end{bmatrix},$ which has to be solved for $t$ and $\lambda$.
Approach 3:
A third way to formulate this problem is as a constrained optimisation problem, so that this is formulated as $\min\limits_{x_1,x_2} f = (x_1-x_2)^2+(y_1-y_2)^2$ subject to $\frac{(x_2-h)^2}{a^2}+\frac{(y_2-k)^2}{b^2}-1=0.$
This can be solved using lagrange multipliers to give $L(x_1,x_2,\lambda)=(x_1-x_2)^2+(y_1-y_2)^2+\lambda\left(\frac{(x_2-h)^2}{a^2}+\frac{(y_2-k)^2}{b^2}-1\right).$ The optimal solution can be found by solving the system of equations $\frac{d L}{d x_1}=0,\,\frac{d L}{d x_2}=0,\,\frac{d L}{d \lambda}=0.$