4
$\begingroup$

Let $f(x):\Re^n\rightarrow \Re$ be a proper and closed convex function. Its Moreau-Yosida regularization is defined as

$F(x)=\min_y\Big(f(y)+\frac{1}{2}\|y-x\|_2^2\Big)$

$\operatorname{Prox}_f(x)=\arg\min_y \Big(f(y)+\frac{1}{2}\|y-x\|_2^2\Big)$

Lots of literature say $F(x)$ is Lipschtiz continuous and give explicitly the expression of $\nabla F(x)$ involving $\operatorname{Prox}_f(x)$. But I have no idea how to calculate $\nabla F(x)$. Can anyone provide a straightforward method? I know Rockafellar's book gives a proof. But it assumes too much prior knowledge. I am wondering if there is a more elementary method to prove the Lipschtiz continuity and calculate its gradient.

  • 0
    You mentioned the formula $\nabla f_{\mu}(x) = \frac{1}{\mu} ( x - \text{prox}_{\mu f}(x)$. In a lot of important applications, the prox operator of $f$ can be evaluated analytically. In other cases, you might have to solve a convex optimization problem in order to evaluate the prox operator of $f$. Vandenberghe's [236c notes](http://www.seas.ucla.edu/~vandenbe/236C/lectures/multmethods.pdf) (especially ch. 15 "multiplier methods") are a good resource for this topic.2013-10-09

2 Answers 2

3

One relevant fact is that if a function $g$ is strongly convex with parameter $\mu$, then its conjugate $g^*$ is differentiable and $\nabla g^*$ is Lipschitz continuous with parameter $\frac{1}{\mu}$. (See ch. 13 "dual decomposition" in Vandenberghe's 236c notes.)

Another fact is that the conjugate of $f_{(\mu)}$ is given by $f_{(\mu)}^*(y) = f^*(y) + \frac{\mu}{2} \|y\|_2^2$. So $f_{(\mu)}^*$ is strongly convex with parameter $\mu$. (See ch. 15 "multiplier methods" in the 236c notes).

It follows that $f_{(\mu)}^{**} = f_{(\mu)}$ is differentiable and its gradient is Lipschitz continuous with parameter $\frac{1}{\mu}$.

To compute the gradient of $f_{(\mu)}$, if the prox operator of $f$ can't be evaluated analytically, another option is to evaluate the prox operator of $f$ by solving a convex optimization problem.

0

You will find the proof of the Lipschitz continuity of $F$ here.

You will not find $\nabla F$ in a straightforward way, i.e., without solving some nonlinear equations. Consider $\nabla f$ as a map from $\mathbb R^n$ to $\mathbb R^n$. Let $\psi$ be the inverse map to $\nabla f$. Then $\nabla F$ is the inverse of $\psi+\mathrm{id}$. Indeed, $\psi$ is the gradient of the Legendre transforms of $f$, which we can call $g$. The Legendre transform converts the infimal convolution of $f$ and $\frac12\|x\|^2$ into the sum of $g$ and $\frac12\|y\|^2$. Then we must take the transform again to return to the $x$-variable. In terms of the gradients this means taking inverses.