As we all know, the gradient is always perpendicular to the level curve. On the other hand, $\nabla f(a,b) \cdot h$, where $h=(x-a,\ y-b)^T$, gives a tangent hyperplane that touches the surface at a point; many texts state that this hyperplane is, in general, $z=f(a,b)+\nabla f(a,b) \cdot h$ at the point $(a,b,f(a,b))$. For instance, when calculating surface area, we use the vectors $T_s\triangle s$ and $T_t\triangle t$, which span a parallelogram approximating the image of the rectangle $\triangle s \times \triangle t$, where $T_s =\frac{\partial F}{\partial s}$ and $F$ parametrizes the surface. In the method of Lagrange multipliers, we know that a constrained extremum of $f$ satisfies $\nabla f= \lambda\nabla g$. From this expression, we know the gradient of $f$ should be perpendicular to the level set of $g$, which serves as the domain of $f$. From the proof, the idea is that if $\nabla f$ is not parallel to $\nabla g$, then $f$ changes along the level set. But here comes the question: why, if $\nabla f$ is not parallel to $\nabla g$, must there be a change in $f$, either an increase or a decrease? Any explanation of gradients would be appreciated.
The concept of gradient, related to Lagrange multipliers, surface areas, tangent hyperplanes
$g$ can be considered as the constraint. – 2012-05-02
3 Answers
This is not rigorous, but might give you some intuition.
If you have a manifold given by $g(x_0, x_1, \ldots) = 0$ (this is the same as $g = c$; just move the constant $c$ inside $g$), then near some point $P$ it is well approximated by the tangent space $T_P$ at that point. To maximize $f|_{T_P}$ (i.e. the function $f$ restricted to $T_P$), you need to calculate its derivative. Observe, however, that the gradient of $f|_{T_P}$ is just the orthogonal projection of the gradient of the unconstrained $f$ onto $T_P$ (the derivative is a linear transformation). The condition $\nabla(f|_{T_P}) = 0$ can then be rewritten as $\pi_{T_P}(\nabla f) = 0$, and since $T_P$ consists of the vectors orthogonal to $\nabla g$, this is precisely $\nabla f \| \nabla g$.
All this can also be said in terms of projections "along the normals" with no intermediate "tangent space step"; however, there are some issues, so I won't elaborate on that. Please note, this is only intuition.
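To make the projection picture concrete, here is a minimal numerical sketch. The functions $f(x,y)=x+y$ and $g(x,y)=x^2+y^2-1$ are my own illustrative choices (they do not come from the question), and the helper names are hypothetical. It projects $\nabla f$ onto the tangent line of the unit circle and checks that the projection vanishes where $f$ is maximal but not at an arbitrary other point.

```python
import math

def grad_f(x, y):
    # Gradient of the assumed objective f(x, y) = x + y.
    return (1.0, 1.0)

def grad_g(x, y):
    # Gradient of the assumed constraint g(x, y) = x^2 + y^2 - 1.
    return (2.0 * x, 2.0 * y)

def project_onto_tangent(v, normal):
    """Orthogonal projection of v onto the line perpendicular to `normal`."""
    nn = normal[0] ** 2 + normal[1] ** 2
    c = (v[0] * normal[0] + v[1] * normal[1]) / nn
    return (v[0] - c * normal[0], v[1] - c * normal[1])

def projected_gradient(x, y):
    # pi_{T_P}(grad f): the part of grad f that lies along the constraint.
    return project_onto_tangent(grad_f(x, y), grad_g(x, y))

s = 1.0 / math.sqrt(2.0)
at_max = projected_gradient(s, s)        # constrained maximum of f on the circle
at_other = projected_gradient(1.0, 0.0)  # a point that is not an extremum

print("projection at (s, s):", at_max)
print("projection at (1, 0):", at_other)
```

At $(1,0)$ the projected gradient is nonzero and points along the circle, so moving in that direction still increases $f$; at $(s,s)$ there is no such direction left.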
Hope that helps!
Okay, some background on the gradient. Since $\gamma$ carves out the level set $g(x,y)=c$, we clearly must have $g\circ\gamma=c$ identically. The derivative of a constant is zero, so we obtain
$\frac{d}{dt}g\circ\gamma=(\nabla g\circ\gamma)\cdot\gamma\,'=0.$
As above, when two vectors' dot product is zero they are perpendicular, thus $\nabla g$ is perpendicular to the tangent of the curve at any point on the curve. This is all in $\Bbb R^2$, two dimensions. Now go to $\Bbb R^3$.
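As a quick sanity check of this perpendicularity, here is a small sketch using an example of my own choosing (not from the question): $g(x,y)=x^2+y^2$ with the level curve $g=1$ parametrized by $\gamma(t)=(\cos t,\sin t)$. The sample values of $t$ are arbitrary.

```python
import math

def g(x, y):
    # Assumed example function; its level set g = 1 is the unit circle.
    return x * x + y * y

def grad_g(x, y):
    return (2.0 * x, 2.0 * y)

def gamma(t):
    # Parametrization of the level curve g = 1.
    return (math.cos(t), math.sin(t))

def gamma_prime(t):
    # Tangent vector of the curve.
    return (-math.sin(t), math.cos(t))

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

samples = [0.0, 0.7, 2.0, 4.5]
# g stays constant along gamma ...
levels = [g(*gamma(t)) for t in samples]
# ... so (grad g at gamma(t)) . gamma'(t) should vanish at every t.
dots = [dot(grad_g(*gamma(t)), gamma_prime(t)) for t in samples]

print(levels)
print(dots)
```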
Recall that a plane's equation is determined by its normal vector and one point. Indeed, if $n$ is the normal and $p$ a point on a plane $\pi$, then $\pi-p$ (the plane translated by $-p$) is a parallel plane (so it has the same normal) containing the origin, and therefore has equation $n\cdot x=0$ (the plane is defined to be all points perpendicular to given normal). Therefore $\pi$ has equation $n\cdot(x-p)=0$.
Now consider the graph of $g:\Bbb R^2\to\Bbb R$ as a surface in $\Bbb R^3$. This surface is parametrized by
$h:(x,y)\mapsto(x,y,g(x,y)).$
Two vectors tangent to the surface can be obtained by partial differentiation:
$\frac{\partial h}{\partial x}=\left(1,0,\frac{\partial g}{\partial x}\right), \qquad \frac{\partial h}{\partial y}=\left(0,1,\frac{\partial g}{\partial y}\right).$
A vector is normal to the plane iff it is normal to both of the above, and we can check that
$n=\left(\frac{\partial g}{\partial x},\frac{\partial g}{\partial y},-1\right)$
satisfies this. Our equation of a tangent plane is therefore
$\left(\frac{\partial g}{\partial x},\frac{\partial g}{\partial y},-1\right)\cdot\big((x,y,z)-(a,b,g(a,b))\big)=0.$
Rearranging, we obtain
$z=g(a,b)+\nabla g(a,b)\cdot\big((x,y)-(a,b)\big),$
where now the gradient and dot product above are two dimensional.
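Here is a small sketch of the tangent plane formula in action, assuming the example $g(x,y)=x^2+y^2$ and the base point $(a,b)=(1,2)$ (both are my own choices for illustration, as is the helper name `tangent_plane`). It checks that the proposed normal is orthogonal to both tangent vectors, and that the plane approximates the surface with an error that shrinks quadratically in the step size.

```python
def g(x, y):
    # Assumed example surface z = g(x, y).
    return x * x + y * y

def grad_g(x, y):
    return (2.0 * x, 2.0 * y)

def tangent_plane(a, b):
    """Return (x, y) -> z for the plane tangent to z = g(x, y) at (a, b)."""
    ga = g(a, b)
    gx, gy = grad_g(a, b)
    def plane(x, y):
        # z = g(a,b) + grad g(a,b) . ((x,y) - (a,b))
        return ga + gx * (x - a) + gy * (y - b)
    return plane

def dot3(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

a, b = 1.0, 2.0
plane = tangent_plane(a, b)
gx, gy = grad_g(a, b)

normal = (gx, gy, -1.0)     # n = (g_x, g_y, -1)
tangent_x = (1.0, 0.0, gx)  # dh/dx
tangent_y = (0.0, 1.0, gy)  # dh/dy

# Gap between surface and plane after stepping h in both coordinates.
errors = [abs(g(a + h, b + h) - plane(a + h, b + h)) for h in (0.1, 0.01)]

print(dot3(normal, tangent_x), dot3(normal, tangent_y))
print(errors)
```

For this particular $g$ the gap works out to $2h^2$, so shrinking the step by a factor of 10 shrinks the error by a factor of 100, which is what "tangent" should mean.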
We'll assume $\Bbb R^2$ so that level sets $g(x,y)=c$ are indeed contours. Thus such a level curve can be parametrized by $\gamma:[a,b]\to \Bbb R^2$ such that $g\circ\gamma=c$ (or at least a single component of the full level set anyway). Of course we can also look at the values of $f$ while we travel along the curve, and turn this problem into a one-dimensional one in the parameter $t$. Indeed, $f$ restricted to the curve is $f\circ\gamma$, and just like in usual calculus a nonzero derivative (with respect to $t$) signifies a nonzero rate of change.
At a constrained extremum, attained at some $t=t_0$, this derivative must therefore vanish. We have, explicitly,
$\frac{d}{dt}f\circ\gamma= (\nabla f\circ\gamma)\cdot\gamma\,'=0,$
which geometrically implies $\nabla f \perp \gamma\,'$ at the pertinent $t=t_0$ value. Remember that $\gamma\,'$ is the tangent vector of the curve, and $\nabla g$ is perpendicular to the curve, so we also have $\nabla g\perp\gamma\,'$ here. Since we're only in two dimensions, if both $\nabla f$ and $\nabla g$ are perpendicular to $\gamma\,'$, they are parallel; $\nabla f\,\|\nabla g$.
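To see numerically why non-parallel gradients force a change in $f$, here is a minimal sketch with assumed example functions $f(x,y)=x+y$ and $g(x,y)=x^2+y^2$ on the level curve $g=1$ (my own choices, with hypothetical helper names). It compares the rate of change of $f\circ\gamma$ with a 2D cross product that measures how far $\nabla f$ and $\nabla g$ are from parallel.

```python
import math

def f(x, y):
    # Assumed objective.
    return x + y

def grad_f(x, y):
    return (1.0, 1.0)

def grad_g(x, y):
    # Gradient of the assumed constraint function g(x, y) = x^2 + y^2.
    return (2.0 * x, 2.0 * y)

def gamma(t):
    # Parametrization of the level curve g = 1.
    return (math.cos(t), math.sin(t))

def gamma_prime(t):
    return (-math.sin(t), math.cos(t))

def rate_of_change(t):
    """d/dt f(gamma(t)) = (grad f at gamma(t)) . gamma'(t)."""
    gf = grad_f(*gamma(t))
    gp = gamma_prime(t)
    return gf[0] * gp[0] + gf[1] * gp[1]

def parallel_defect(t):
    """2D cross product of grad f and grad g; zero iff they are parallel."""
    gf = grad_f(*gamma(t))
    gg = grad_g(*gamma(t))
    return gf[0] * gg[1] - gf[1] * gg[0]

t_star = math.pi / 4  # where f is maximal on the circle
print(rate_of_change(t_star), parallel_defect(t_star))
print(rate_of_change(0.0), parallel_defect(0.0))

# At t = 0 the gradients are not parallel, and f really does change:
before, here, after = f(*gamma(-0.01)), f(*gamma(0.0)), f(*gamma(0.01))
print(before, here, after)
```

At $t=0$ the derivative of $f\circ\gamma$ is nonzero, so stepping one way along the curve raises $f$ and stepping the other way lowers it; that point cannot be an extremum. Only where the cross product vanishes does the derivative vanish too.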
@Mathematics: Please look back at the words in your original post. The **only** discernible question is *"why if $\nabla f$ is not parallel to $\nabla g$, then there will be a change in $f$ either increase or decrease"*. There is no hint of confusion over the information in the first few sentences of your post. Please keep in mind potential answerers are not mind-readers and cannot address questions you don't clearly express for us! :-( – 2012-05-02