
First, a motivating example. Suppose $f(x)$ is convex, differentiable, with a single minimum $x^*$. Then the differential equation $\dot{x}(t) = -\nabla f(x(t))$ drives $x(t)$ to $x^*$.

Now my question is about a generalization of this. Let $f(x), g(x)$ be two smooth convex functions and let ${\cal G}$ be the set of minima of $g(x)$, which we assume to be nonempty. Consider the differential equation $ \dot{x}(t) = - \frac{1}{t} \nabla f(x(t)) - \nabla g(x(t)). $ Is it true that this equation drives $x(t)$ to the minimizer of $f(x)$ on ${\cal G}$? If not, would it be true if we replaced $1/t$ by a different function, say one which decays slower or faster?

This statement seems to be true in a few simple examples I tried. For example, taking $g(x)=(x_1+x_2-2)^2$ and $f(x)=x_1^2+x_2^2$ and solving the resulting equation numerically, I do get that solutions seem to approach $(1,1)$.
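For concreteness, here is a minimal numerical sketch of that experiment (the starting time, initial point, and time horizon are arbitrary choices of mine), integrating the ODE with `scipy`:

```python
# A minimal numerical check of the ODE  x'(t) = -(1/t) grad f(x) - grad g(x)
# for f(x) = x1^2 + x2^2 and g(x) = (x1 + x2 - 2)^2.
# Starting time, initial point, and horizon are arbitrary illustrative choices.
import numpy as np
from scipy.integrate import solve_ivp

def grad_f(x):
    return 2.0 * x                       # gradient of x1^2 + x2^2

def grad_g(x):
    s = x[0] + x[1] - 2.0
    return np.array([2.0 * s, 2.0 * s])  # gradient of (x1 + x2 - 2)^2

def rhs(t, x):
    return -grad_f(x) / t - grad_g(x)

# integrate from t = 1 (avoiding the singularity at t = 0) to a large horizon
sol = solve_ivp(rhs, (1.0, 1e3), np.array([4.0, 0.5]), rtol=1e-8, atol=1e-10)
print(sol.y[:, -1])  # should be close to (1, 1), the minimizer of f on {x1 + x2 = 2}
```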

Update: I asked this question on MO and received a solution. Unfortunately, the solution was given at a level of mathematical maturity that exceeds my own, and I am unable to fill in the details of the arguments. Asking the answerer a follow-up question has not really clarified matters. I am hoping that someone here could fill in the details.

  • @LeonidKovalev - indeed, it was silly to suggest that something that decays faster would work. I would, of course, be more than satisfied if this turned out to be true for some $p \in (0,1)$ :) (2012-07-03)

1 Answer


I might be missing your point, but I don't think this is true. Consider for example the following constrained optimization problem:

$ \min \frac12\|x\|_2^2 \qquad \text{s.t.}\quad Ax=b $ where $A$ has more columns than rows. The set of $x$ such that $Ax=b$ can be seen as the set of minimizers of $g(x)=\frac12\|Ax-b\|_2^2$. Letting $f(x)=\frac12\|x\|^2_2$, a discretization of your scheme gives the iteration

$x_{k+1} = x_k -{1\over k}x_k - A^*(Ax_k-b)$

which, unless I'm mistaken, does not produce admissible iterates (i.e., iterates satisfying $Ax_k=b$) and does not, in general, converge to the solution of the problem, which can be obtained from the KKT conditions (the usual "projection onto the constraints"):

$ x_{\text{opt}} = A^*(AA^*)^{-1}b $
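As a rough numerical check, here is a small sketch (matrix size, scaling, and iteration count are arbitrary choices of mine) that runs the iteration above and compares its last iterate with the KKT solution $x_{\text{opt}}$:

```python
# Compare the unit-step iteration
#     x_{k+1} = x_k - (1/k) x_k - A^T (A x_k - b)
# with the KKT / minimum-norm solution  x_opt = A^T (A A^T)^{-1} b.
# Sizes, scaling, and iteration count are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
A = 0.3 * rng.standard_normal((3, 6))   # more columns than rows (underdetermined)
b = rng.standard_normal(3)

# minimum-norm solution from the KKT conditions
x_opt = A.T @ np.linalg.solve(A @ A.T, b)

x = np.zeros(6)
for k in range(1, 10001):
    x = x - x / k - A.T @ (A @ x - b)

print("||A x_opt - b|| :", np.linalg.norm(A @ x_opt - b))  # ~ 0 (admissible)
print("||A x_k   - b|| :", np.linalg.norm(A @ x - b))      # constraint residual of the scheme
print("||x_k - x_opt|| :", np.linalg.norm(x - x_opt))      # distance to the KKT solution
```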

More generally, problems of the form

$ \min f(x) + \iota_P(x)$

where $P$ is a convex set (e.g. the set of minimizers of a function) and $\iota_P$ is its indicator (characteristic) function ($0$ on the set, $+\infty$ elsewhere), can be solved by alternating a gradient step on $f$ with a projection onto the convex set $P$ (which is nothing but the proximal operator of $\iota_P$). In the previous example, this would yield something like:

$ x_{k+1} = (1-\alpha )x_k + A^*(AA^*)^{-1}\bigl(b-(1-\alpha)Ax_k\bigr)$

where $\alpha$ is the stepsize of the gradient step (if you set it to $1$ you obviously get the solution in one step, but set it to, say, $1/2$ and you should observe linear convergence). By the previous remark, this has the form

$ x_{k+1} = \mathrm{prox}_{\iota_P}(x_k-\alpha \nabla f(x_k))$

which fits into the more general "forward-backward splitting" framework.
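As an illustration (not the answerer's code; the matrix, starting point, and $\alpha=1/2$ are arbitrary choices of mine), here is a sketch of this projected-gradient / forward-backward iteration on the same least-norm problem:

```python
# Projected-gradient (forward-backward) iteration
#     x_{k+1} = P(x_k - alpha * grad f(x_k)),   P = projection onto {x : Ax = b},
# for f(x) = (1/2)||x||_2^2, so the gradient step is simply x -> (1 - alpha) x.
# Matrix, starting point, and alpha are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 6))
b = rng.standard_normal(3)

AAt_inv = np.linalg.inv(A @ A.T)

def project(z):
    """Euclidean projection onto the affine set {x : Ax = b}."""
    return z - A.T @ (AAt_inv @ (A @ z - b))

x_opt = A.T @ (AAt_inv @ b)           # minimum-norm solution (KKT)

alpha = 0.5
x = project(rng.standard_normal(6))   # an arbitrary feasible starting point
for _ in range(50):
    x = project((1.0 - alpha) * x)    # gradient step on (1/2)||x||^2, then project

print("distance to x_opt:", np.linalg.norm(x - x_opt))  # shrinks by (1 - alpha) per iteration
```

Note that every iterate here stays admissible, because the projection is applied at each step; this is exactly what the scheme in the question lacks.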

So, long story short: if you want to minimize a function over a set with an iterative method, you need a projection step onto that set at some point (that is the idea of the $\mathrm{prox}$, by the way). Your proposed method has no such step, so there is no guarantee that the points it generates are admissible, or ever will be (i.e., belong to $\mathcal G$).

  • @robinson, yes you're right (actually, in the case I considered, where there is more than one solution, the left inverse does not exist...). However, the issue of the convergence of the iterates and the optimality of the fixed point $x_\infty$ remains. I also agree with what Leonid mentions, but thought that this presentation could bring another approach to the question. (2012-07-04)