I was reading a book about optimizing a function f(x,y) subject to a constraint g(x,y) = 0. The book says that if the constraint g(x,y) = 0 defines a surface, then the gradient of g(x,y) is orthogonal to that surface. I don't see how that is possible. The gradient always points in the direction of maximum increase, so I don't see why it would have to be orthogonal.
For example, let's take the function
f(x) = x^2
If I take the gradient of this function at x = 1, its value is 2i, with i giving the direction. This direction is not perpendicular to the graph of the function or to its tangent at the point (1, f(1)).
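To make the calculation concrete, here is a rough sketch of what I am checking. It only reproduces my own computation and is not meant as the book's method. Writing the gradient 2i as the plane vector (2, 0) and the tangent direction at (1, f(1)) as (1, f'(1)) are my assumptions about how the two should be compared; the names df, gradient and tangent are just illustrative.

```python
# Sketch of the check for f(x) = x^2 at x = 1.
def f(x):
    return x ** 2

def df(x, h=1e-6):
    # Central-difference approximation of f'(x).
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 1.0
slope = df(x0)           # approximately 2

# Assumption: treat the gradient "2i" as a vector along the x-axis in the plane.
gradient = (slope, 0.0)
# Direction of the tangent line to y = f(x) at the point (1, f(1)).
tangent = (1.0, slope)

# A dot product of zero would mean the two directions are perpendicular.
dot = gradient[0] * tangent[0] + gradient[1] * tangent[1]
print(slope, dot)
```

The dot product comes out nonzero, which is what I mean when I say the gradient is not perpendicular to the tangent.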
I am a bit confused. Could anyone clarify?