As we all know, the gradient is always perpendicular to the level curve. On the other hand, $\nabla f(a,b)\cdot h$, where $h=(x-a,\ y-b)^T$, gives a tangent hyperplane to a surface at a point: many texts state that this hyperplane is, in general, $z=f(a,b)+\nabla f(a,b)\cdot h$ at the point $(a,b,f(a,b))$. For instance, when calculating surface area, we take $T_s\triangle s$ and $T_t\triangle t$ as the vectors spanning a parallelogram that approximates the surface patch over the rectangle $\triangle s \times \triangle t$, where $T_s=\frac{\partial F}{\partial s}$, $T_t=\frac{\partial F}{\partial t}$, and $F$ parametrizes the surface. In the method of Lagrange multipliers, we know $f$ can have a constrained extremum only where $\nabla f=\lambda\nabla g$; from this expression, we know the gradient of $f$ should be perpendicular to the level set of $g$, which serves as the domain of $f$. The idea of the proof is that if $\nabla f$ is not parallel to $\nabla g$, then $f$ changes along the constraint. But here comes the question: why, if $\nabla f$ is not parallel to $\nabla g$, must there be a change in $f$, either an increase or a decrease? Any explanation of gradients would be appreciated.
The concept of gradient, related to Lagrange multipliers, surface areas, tangent hyperplanes
-
What is $g$? – 2012-05-02
-
@sai: The Lagrange problem in this context can be stated as: maximize $f(x,y)$ subject to $g(x,y)=c$. – 2012-05-02
-
$g$ can be considered as the constraint. – 2012-05-02
3 Answers
This is not rigorous, but might give you some intuition.
If you have a manifold given by $g(x_0, x_1, \ldots) = 0$ (this is the same as $g = c$; just move the constant $c$ inside $g$), then near some point $P$ it is well approximated by the tangent space $T_P$ at that point. If you want to maximize $f|_{T_P}$ (i.e. the function $f$ restricted to $T_P$), you need to calculate its derivative. Observe that the gradient of $f|_{T_P}$ is just the orthogonal projection of the gradient of the unconstrained $f$ onto $T_P$ (the derivative is a linear transformation). The condition $\nabla(f|_{T_P}) = 0$ can then be rewritten as $\pi_{T_P}(\nabla f) = 0$, which says that $\nabla f$ is normal to $T_P$; since $\nabla g$ (assumed nonzero at $P$) spans the normal direction, this is precisely $\nabla f \parallel \nabla g$.
All this can also be said in terms of projections "along the normals" with no intermediate "tangent space step"; however, there are some issues, so I won't elaborate on that. Please note, this is only intuition.
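To make the projection picture concrete, here is a minimal numerical sketch. The choices $f(x,y)=x+y$ and $g(x,y)=x^2+y^2-1$ (the unit circle) are illustrative assumptions, not taken from the question, and the helper names are hypothetical. It projects $\nabla f$ onto the tangent line of the circle and shows that the projection vanishes at the constrained maximum but not at a generic point.

```python
import math

def grad_f(x, y):
    # Assumed objective f(x, y) = x + y, so the gradient is constant.
    return (1.0, 1.0)

def grad_g(x, y):
    # Assumed constraint g(x, y) = x^2 + y^2 - 1 (unit circle).
    return (2.0 * x, 2.0 * y)

def project_onto_tangent(v, n):
    # Subtract from v its component along the normal n;
    # what remains lies in the tangent space.
    nn = n[0] * n[0] + n[1] * n[1]
    c = (v[0] * n[0] + v[1] * n[1]) / nn
    return (v[0] - c * n[0], v[1] - c * n[1])

def tangential_gradient(theta):
    # Point on the circle at angle theta, then project grad f there.
    x, y = math.cos(theta), math.sin(theta)
    return project_onto_tangent(grad_f(x, y), grad_g(x, y))

p_max = tangential_gradient(math.pi / 4)  # constrained maximum of f
p_other = tangential_gradient(0.0)        # the generic point (1, 0)
print(p_max, p_other)
```

At $\theta=\pi/4$ the projected gradient is zero up to rounding, so $\nabla f$ is parallel to $\nabla g$ there; at $(1,0)$ the projection is nonzero, so $f$ can still be increased by moving along the circle.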
Hope that helps!
Okay, some background on the gradient. Since the curve $\gamma$ traces out the level set $g(x,y)=c$, we must have $g\circ\gamma=c$ identically. The derivative of a constant is zero, so by the chain rule we obtain
$$\frac{d}{dt}g\circ\gamma=(\nabla g\circ\gamma)\cdot\gamma\,'=0.$$
As above, when two vectors' dot product is zero they are perpendicular, thus $\nabla g$ is perpendicular to the tangent of the curve at any point on the curve. This is all in $\Bbb R^2$, two dimensions. Now go to $\Bbb R^3$.
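A quick numerical check of this perpendicularity, as a sketch under assumed choices: take $g(x,y)=x^2/4+y^2$ and the ellipse $\gamma(t)=(2\cos t,\sin t)$, which lies on the level set $g=1$. Neither the function nor the parametrization comes from the question; they are picked because here $\nabla g$ is not simply radial.

```python
import math

def gamma(t):
    # Assumed parametrization of the level curve g = 1 (an ellipse).
    return (2.0 * math.cos(t), math.sin(t))

def gamma_prime(t):
    # Tangent vector: derivative of gamma with respect to t.
    return (-2.0 * math.sin(t), math.cos(t))

def grad_g(x, y):
    # Gradient of the assumed g(x, y) = x^2 / 4 + y^2.
    return (x / 2.0, 2.0 * y)

def dot_along_curve(t):
    # (grad g at gamma(t)) . gamma'(t), which should be zero.
    x, y = gamma(t)
    gx, gy = grad_g(x, y)
    vx, vy = gamma_prime(t)
    return gx * vx + gy * vy

# Sample the dot product at many parameter values around the ellipse.
dots = [dot_along_curve(0.1 * k) for k in range(63)]
print(max(abs(d) for d in dots))
```

The largest absolute value printed is zero up to rounding, matching $(\nabla g\circ\gamma)\cdot\gamma\,'=0$.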
Recall that a plane's equation is determined by its normal vector and one point. Indeed, if $n$ is the normal and $p$ a point on a plane $\pi$, then $\pi-p$ (the plane translated by $-p$) is a parallel plane (so it has the same normal) containing the origin, and therefore has equation $n\cdot x=0$ (such a plane consists of all points $x$ perpendicular to the given normal). Therefore $\pi$ has equation $n\cdot(x-p)=0$.
Now consider the graph of $g:\Bbb R^2\to\Bbb R$ as a surface in $\Bbb R^3$. The surface is parametrized by
$$h:(x,y)\mapsto(x,y,g(x,y)).$$
Two vectors tangent to the surface can be obtained by partial differentiation:
$$\frac{\partial h}{\partial x}=\left(1,0,\frac{\partial g}{\partial x}\right), \qquad \frac{\partial h}{\partial y}=\left(0,1,\frac{\partial g}{\partial y}\right).$$
A vector is normal to the plane iff it is normal to both of the above, and we can check that
$$n=\left(\frac{\partial g}{\partial x},\frac{\partial g}{\partial y},-1\right)$$
satisfies this. Our equation of a tangent plane is therefore
$$\left(\frac{\partial g}{\partial x},\frac{\partial g}{\partial y},-1\right)\cdot\big((x,y,z)-(a,b,g(a,b))\big)=0.$$
Rearranging, we obtain
$$z=g(a,b)+\nabla g(a,b)\cdot\big((x,y)-(a,b)\big),$$
where now the gradient and dot product above are two dimensional.
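As a sanity check on this formula, the following sketch uses an assumed function $g(x,y)=x^2+xy$ and base point $(a,b)=(1,2)$ (both hypothetical choices). It verifies that $n=(g_x,g_y,-1)$ is orthogonal to the two tangent vectors, and that the gap between the surface and the tangent plane shrinks quadratically as we approach the base point.

```python
def g(x, y):
    # Assumed example function; any differentiable g would do.
    return x * x + x * y

def grad_g(x, y):
    # Partial derivatives of g, computed by hand.
    return (2.0 * x + y, x)

def tangent_plane(a, b, x, y):
    # z = g(a,b) + grad g(a,b) . ((x,y) - (a,b))
    gx, gy = grad_g(a, b)
    return g(a, b) + gx * (x - a) + gy * (y - b)

def dot3(u, v):
    # Dot product of two vectors in R^3.
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

a, b = 1.0, 2.0
gx, gy = grad_g(a, b)
normal = (gx, gy, -1.0)
h_x = (1.0, 0.0, gx)  # partial derivative of (x, y, g) in x
h_y = (0.0, 1.0, gy)  # partial derivative of (x, y, g) in y
normal_dot_hx = dot3(normal, h_x)
normal_dot_hy = dot3(normal, h_y)

# Distance between the surface and the plane at (a + h, b + h).
errors = [abs(g(a + h, b + h) - tangent_plane(a, b, a + h, b + h))
          for h in (0.1, 0.01, 0.001)]
print(normal_dot_hx, normal_dot_hy, errors)
```

For this particular $g$ the error works out to $2h^2$, so each tenfold reduction in $h$ cuts the error by a factor of about one hundred, which is what "tangent" means quantitatively.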
We'll assume $\Bbb R^2$ so that level sets $g(x,y)=c$ are indeed contours. Thus such a level curve can be parametrized by $\gamma:[a,b]\to \Bbb R^2$ such that $g\circ\gamma=c$ (or at least a single component of the full level set anyway). Of course we can also look at the values of $f$ while we travel along the curve, and turn this problem into a one-dimensional one in the parameter $t$. Indeed, $f$ restricted to the curve is $f\circ\gamma$, and just like in usual calculus a nonzero derivative (with respect to $t$) signifies a nonzero rate of change.
At a local extremum of $f\circ\gamma$, say at $t=t_0$, we have, explicitly,
$$\frac{d}{dt}f\circ\gamma= (\nabla f\circ\gamma)\cdot\gamma\,'=0,$$
which geometrically implies $\nabla f \perp \gamma\,'$ at the pertinent $t=t_0$ value. Remember that $\gamma\,'$ is the tangent vector of the curve, and $\nabla g$ is perpendicular to the curve, so we also have $\nabla g\perp\gamma\,'$ here. Since we're only in two dimensions, if both $\nabla f$ and $\nabla g$ are perpendicular to $\gamma\,'$, they are parallel; $\nabla f\,\|\nabla g$.
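The contrapositive (gradients not parallel implies $f$ changes along the curve) can also be seen numerically. The sketch below assumes $f(x,y)=x+y$, $g(x,y)=x^2+y^2$ and $\gamma(t)=(\cos t,\sin t)$; these are illustrative choices only. It compares $(f\circ\gamma)'(t)$ with the two-dimensional cross product $f_x g_y - f_y g_x$, which is zero exactly when the gradients are parallel.

```python
import math

def df_dt(t):
    # (f o gamma)'(t) = grad f . gamma'(t) for the assumed f(x, y) = x + y
    # and gamma(t) = (cos t, sin t).
    fx, fy = 1.0, 1.0
    vx, vy = -math.sin(t), math.cos(t)
    return fx * vx + fy * vy

def cross_of_gradients(t):
    # 2D cross product of grad f and grad g at gamma(t), with the
    # assumed g(x, y) = x^2 + y^2; it vanishes iff they are parallel.
    x, y = math.cos(t), math.sin(t)
    fx, fy = 1.0, 1.0
    gx, gy = 2.0 * x, 2.0 * y
    return fx * gy - fy * gx

for t in (0.0, math.pi / 4):
    print(t, df_dt(t), cross_of_gradients(t))
```

At $t=0$ the gradients are not parallel and $(f\circ\gamma)'$ is nonzero, so $f$ is strictly increasing through that point of the curve; at $t=\pi/4$ both quantities vanish together, which is the Lagrange condition.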
-
Thanks for the explanation! Can you briefly talk about the gradient? That's the part that confused me most. – 2012-05-02
-
@Mathematics: Can you be more specific: what about the gradient would you like to know? The condition that $f$ (restricted to the curve) has a local extremum at $t$ (according to the $\gamma$ parametrization) *implies* that $\nabla f$ and $\nabla g$ are parallel. By [contraposition](http://en.wikipedia.org/wiki/Contraposition), if the gradients are not parallel, the derivative $(f\circ\gamma)'$ does not vanish. In predicate language, $A\implies B$ is equivalent to $\neg B\implies\neg A$. Is that your question? Or do you want to know why e.g. $\nabla g\perp\gamma\,'$? – 2012-05-02
-
Actually, my question isn't only about Lagrange multipliers; it's about how the gradient is applied in different cases. I understand your explanation of Lagrange multipliers, but what I don't quite get is the gradient itself. The gradient is always perpendicular to the level curve, but the gradient can also be used to form a tangent hyperplane to a surface at some point. That's what confused me, as I couldn't figure out how one should interpret the gradient. You may read the question neglecting the part about Lagrange multipliers. Thanks in advance! – 2012-05-02
-
@Mathematics: Please look back at the words in your original post. The **only** discernible question is *"why if $\nabla f$ is not parallel to $\nabla g$, then there will be a change in f either increase or decrease"*. There is no hint of confusion over the information in the first few sentences of your post. Please keep in mind potential answerers are not mind-readers and cannot address questions you don't clearly express for us! :-( – 2012-05-02