$\min_x f_0(x) \quad \text{s.t. } f_i(x) \le y_i,\quad i = 1,\ldots,m$

$f_i : \text{convex};\quad x : \text{variable}$

Here $g(y)$ denotes the optimal value of the problem as a function of the right-hand side $y$, and $\lambda^*$ is the optimal dual variable.

Then it is claimed that $$g(z) \ge g(y) - \sum_{i=1}^m \lambda^*_i (z_i - y_i)\tag{1}$$

Hence $- \lambda^*$ is a subgradient of $g$ at $y$.

Though the material I am reading (Basic Rules for Subgradient Calculus, slide at 51:40) claims the proof of (1) is straightforward, I can't figure out how to derive it. Can anybody help?
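To convince myself that the claim is at least plausible, I tried a simple 1-D example of my own (not from the slides): minimize $x^2$ subject to $x \le y$ with $y < 0$. The optimum is $x^* = y$, so $g(y) = y^2$, and maximizing the dual function $-\lambda^2/4 - \lambda y$ over $\lambda \ge 0$ gives $\lambda^* = -2y$. Indeed $g'(y) = 2y = -\lambda^*$, so the claim holds here, but I can't see how to prove (1) in general.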

My Approach:

Assuming $z_i = f_i(x)$, I get the Lagrangian dual function as $g(z) = f_0(x) + \sum_{i=1}^{m}\lambda_i (z_i-y_i)$. Since $g(y)$ is the optimal value of the problem and $\lambda^*$ is the optimal dual variable, I can write

$g(y)=f_0(x) + \sum_{i=1}^{m}\lambda^*_i(z_i-y_i)$, or maybe $g(y)=f_0(x)$ since $z_i = y_i$. But then I can't figure out how I should use $\lambda^*$ in equation (1), which is what needs to be derived.

The unconstrained problem is $g(y) = \inf_z g(z)$. Based on the problem definition, it is also true that $g(y) \le g(z)$. But how can (1) be derived from these relations? Or am I making wrong assumptions here?

  • This is called sensitivity analysis. You will find an answer to your question in almost any book on convex optimization (e.g., http://www.stanford.edu/~boyd/cvxbook). First you need to convince yourself that $g$ is a convex function. Secondly (under Slater's qualification condition), $-\lambda^*$ is a subgradient of $g$. (2013-01-30)

1 Answer


I'm a bit surprised there's no full answer to this question (and it may be far too late to provide one). Anyway.

I'm using $\langle a,b\rangle=\sum_i a_i b_i$.

Lagrangian:

$ L(x,\lambda; y) = f_0(x) + \langle \lambda, f(x)-y\rangle $

where $f$ is the vector-valued function with entries $f_1,\dots,f_m$. The dual function is

$ H(\lambda; y) = \min_{x} L(x,\lambda;y) \le f_0 (x) + \langle \lambda , f(x)-y\rangle$

for every $x$. To relate this to the original problem: by strong duality (which holds under a constraint qualification such as Slater's condition), $g(y)=H(\lambda^\star;y)$, where $\lambda^\star$ is the optimal dual variable.

This means that:

$ g(y) = H(\lambda^\star;y) \le f_0(x) + \langle \lambda^\star,f(x)-y\rangle $

for every $x$. Now take $x$ such that $f(x)\le z$; then $z-f(x)\ge 0$, and since the dual variable is non-negative,

$ g(y) \le f_0 (x) + \langle \lambda^\star,f(x)-y\rangle + \langle \lambda^\star,z-f(x)\rangle $

Now take $x^\sharp$ to be a minimizer of the problem with right-hand side $z$, so that $g(z) = f_0(x^\sharp)$ (and, still, $f(x^\sharp)\le z$ of course). Combining the two inner products and substituting:

$ g(y) \le g(z) + \langle \lambda^\star, z-y\rangle$

and rearranging once more gives $g(z) \ge g(y) - \langle \lambda^\star, z-y\rangle$, which is exactly (1), concluding the proof.
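If you want to see (1) empirically, here is a minimal numerical sketch using CVXPY (assuming it is installed). The quadratic objective, the random constraint matrix, and the helper `solve_perturbed` are just a toy instance of my own, not anything from the question.

```python
# Numerical sanity check of inequality (1):  g(z) >= g(y) - <lambda*, z - y>.
# Toy instance: minimize ||x - x0||^2 subject to A x <= rhs.
import numpy as np
import cvxpy as cp

np.random.seed(0)
n, m = 5, 3
A = np.random.randn(m, n)   # full row rank a.s., so A x <= rhs is feasible for any rhs
x0 = np.random.randn(n)

def solve_perturbed(rhs):
    """Return g(rhs) and the optimal dual variable of min ||x - x0||^2 s.t. A x <= rhs."""
    x = cp.Variable(n)
    constraints = [A @ x <= rhs]
    prob = cp.Problem(cp.Minimize(cp.sum_squares(x - x0)), constraints)
    prob.solve()
    return prob.value, constraints[0].dual_value

y = np.zeros(m)
g_y, lam_star = solve_perturbed(y)   # g(y) and lambda* at the nominal right-hand side y

# Check the subgradient inequality for a few random perturbations z of y.
for _ in range(5):
    z = y + 0.5 * np.random.randn(m)
    g_z, _ = solve_perturbed(z)
    lower_bound = g_y - lam_star @ (z - y)
    print(f"g(z) = {g_z:.4f} >= {lower_bound:.4f} : {g_z >= lower_bound - 1e-6}")
```

Each line should print `True`: every perturbed optimal value stays above the affine lower bound built from $-\lambda^\star$, as (1) requires.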