2

I was studying Lagrange multipliers, but I have some confusion. Let's say I have a function $f(x,y)$ to be minimized subject to a constraint $g(x,y) = 0$.

If I minimize the function $$ L(x,y,\lambda) = f(x,y) + \lambda g(x,y) \>, $$ how does that include the constraint $g(x,y) = 0$? The book says that minimizing $L$ with respect to $\lambda$ is equivalent to minimizing $f(x,y)$ subject to the constraint $g(x,y) = 0$.

I need some clarification.

Further, it is said that

$$ \nabla f + \lambda \nabla g = 0 \tag{1} $$

leads to

$$ L(x,y,\lambda) = f(x,y) + \lambda g(x,y) \tag{2} $$

I didn't understand this part: how does equation (1) lead to equation (2)?

Also, I am a bit confused when it comes to inequality constraints like

$$ g(x,y) \ge 0 $$

It is said that $f(x,y)$ will be maximum if its gradient is oriented away from the region $g(x,y) > 0$, and therefore

$$ \nabla f(x,y) = -\lambda \nabla g(x,y) $$

I just didn't get this.

  • 0
    Please note that the method of Lagrange multipliers simply gives a condition for finding **critical points** of $f$ constrained to $g^{-1}(0)$. Free critical points of $L$ needn't be minima. 2012-06-24

2 Answers

1

Setting the partial derivative of $L$ with respect to $\lambda$ to 0 forces $g(x,y)=0$. Requiring the partial derivatives of $L$ with respect to $x$ and $y$ to be 0 as well yields the candidates for a local extreme point of $f$ subject to $g(x,y) = 0$. Depending on the form of $L$, such a point could be a minimum.
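As a small worked illustration (the particular $f$ and $g$ here are my own choice, not from the question), take $f(x,y) = x^2 + y^2$ and $g(x,y) = x + y - 1$, so that

$$ L(x,y,\lambda) = x^2 + y^2 + \lambda (x + y - 1) \>. $$

Setting all three partial derivatives to zero gives

$$ \frac{\partial L}{\partial x} = 2x + \lambda = 0, \qquad \frac{\partial L}{\partial y} = 2y + \lambda = 0, \qquad \frac{\partial L}{\partial \lambda} = x + y - 1 = 0 \>. $$

The first two equations are exactly $\nabla f + \lambda \nabla g = 0$, and the third is the constraint $g(x,y) = 0$ itself. From the first two, $x = y = -\lambda/2$; substituting into the third gives $-\lambda - 1 = 0$, so

$$ \lambda = -1, \qquad x = y = \tfrac{1}{2}, \qquad f\left(\tfrac12, \tfrac12\right) = \tfrac{1}{2} \>. $$

In this particular case the critical point is the constrained minimum, because $f$ is convex and the constraint is linear; in general the conditions only identify candidates.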

0

I will show you an example in which the minimization problem is formulated in a single variable, as $f(x)+\lambda g(x)$.

To find $\lambda$, first solve for $x$ in closed form by setting the derivative with respect to $x$ to zero. This gives a closed-form expression for $x$ that contains $\lambda$.

Call this $x^{*}=c(\lambda)$. Now substitute this expression into the constraint, $g(x^{*})=0$, and solve for $\lambda$; this gives the $\lambda$ that enforces your constraint at the optimal value $x^*$.
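A minimal sketch of this two-step procedure in Python, assuming a toy problem that is not from the question: $f(x) = x^2$ and $g(x) = x - 1$. The names `x_star` and `solve_lambda` are hypothetical helpers, and bisection is just one convenient way to solve $g(c(\lambda)) = 0$ numerically when a sign change can be bracketed.

```python
# Toy problem (an assumption for illustration, not from the original post):
#   minimize f(x) = x**2  subject to  g(x) = x - 1 = 0

def g(x):
    """Constraint function; the constraint is g(x) = 0."""
    return x - 1.0

def x_star(lam):
    """Closed form x* = c(lambda) from stationarity of L = x**2 + lam*(x - 1).

    dL/dx = 2*x + lam = 0  =>  x = -lam / 2
    """
    return -lam / 2.0

def solve_lambda(lo=-10.0, hi=10.0, tol=1e-12):
    """Solve g(x_star(lam)) = 0 for lam by bisection on [lo, hi].

    Assumes g(x_star(lam)) changes sign on the bracket.
    """
    def h(lam):
        return g(x_star(lam))

    if h(lo) * h(hi) > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(lo) * h(mid) <= 0:
            hi = mid   # root lies in [lo, mid]
        else:
            lo = mid   # root lies in [mid, hi]
    return 0.5 * (lo + hi)

lam = solve_lambda()
x_opt = x_star(lam)
print(lam, x_opt, g(x_opt))
```

For this toy problem the result can be checked by hand: $-\lambda/2 - 1 = 0$ gives $\lambda = -2$ and $x^* = 1$, which is the only point satisfying the constraint.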

In a statistical modeling scenario, though, the $\lambda$'s are estimated by cross-validation if $f(\cdot)$ and $g(\cdot)$ are loss functions to be optimized over random variables. But I am not sure about the domain of your work.

  • 0
    This answer seems to make quite strong (but unstated) assumptions about $f$ and $g$. 2012-06-24
  • 0
    @cardinal, are you referring to the statement about cross-validation, the existence of closed forms, and the existence of a unique minimum? Also, @user31820, what is the range of the function $g(x,y)$? This was a start, and the answer can be modified based on further input and discussion. 2012-06-24