
I am looking at the following paper to implement dual decomposition for my algorithm: http://www.csd.uoc.gr/~komod/publications/docs/DualDecomposition_PAMI.pdf

On p. 29 they suggest setting the step size for the sub-gradient method by taking the difference between the best primal solution and the current dual solution and dividing by the L2-norm of the sub-gradient at the current iteration.

My doubt is the following: do I compute a sub-gradient and a separate step size for each slave problem, or is there some way to compute the sub-gradient of the combined dual problem?

2 Answers


Step sizes are the crucial and difficult point when using subgradient methods. Basically you need the step sizes to tend to zero, but not too fast. If one uses a priori step sizes (e.g. of the form $1/k$), then the method provably converges, but in practice it slows down so much that you will not observe convergence.

The dynamic rule they suggest in the paper (equation (40)) looks like the so-called Polyak step size, with the true optimal value estimated from values of the dual problem. One can prove convergence with these step sizes under special conditions. I do not know a good reference off the top of my head, but many books on (nonsmooth) convex optimization treat this.
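To make the comment below concrete: if the dual is a sum over slaves, a subgradient of the combined dual can be assembled from the slave subgradients, and a single step size is then used for all of them. Here is a minimal Python sketch of that bookkeeping, assuming each slave is a callable that returns its dual value and its block of the subgradient; the names (`slaves`, `best_primal`, `dual_subgradient_ascent`) are mine, the primal-bound update is problem specific, and this is only an illustration in the spirit of equation (40), not the paper's full algorithm.

```python
import numpy as np

def dual_subgradient_ascent(slaves, lambdas0, best_primal, n_iters=100):
    """Sketch: subgradient ascent on a dual decomposed into slave problems.

    slaves[i](lam_i) must return (dual_value_i, subgrad_i) for that slave's
    block of multipliers.  The combined dual value and subgradient are
    assembled from the slave pieces, and ONE step size is used for all.
    """
    lambdas = [np.asarray(l, dtype=float) for l in lambdas0]
    for t in range(n_iters):
        values, grads = zip(*(s(l) for s, l in zip(slaves, lambdas)))
        dual_value = sum(values)                       # combined dual value

        # best_primal should be updated here from a primal solution
        # recovered from the slave minimizers (problem specific).

        # Polyak-type step size in the spirit of eq. (40): estimated duality
        # gap over the squared norm of the FULL subgradient, i.e. the sum
        # of the squared norms of the slave blocks.
        sq_norm = sum(float(np.dot(g, g)) for g in grads)
        alpha = (best_primal - dual_value) / (sq_norm + 1e-12)

        # Alternative: an a priori diminishing schedule such as 1/(t+1),
        # provably convergent but usually very slow in practice.
        # alpha = 1.0 / (t + 1)

        # The same step size updates every slave's multipliers.
        lambdas = [l + alpha * g for l, g in zip(lambdas, grads)]
    return lambdas
```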

  • What do you mean by "decomposed into multiple slave problems"? Probably you can form a subgradient of the "total" dual problem from the subgradients of the "slave problems"? (2012-07-17)

I cannot find where exactly Stephen Boyd said it during the EE364B class at Stanford, but you cannot really speed up (sub)gradient methods tremendously. Slides:

https://stanford.edu/class/ee364b/lectures/subgradients_slides.pdf
https://stanford.edu/class/ee364b/lectures/subgrad_method_slides.pdf

Videos: you can look for Lectures 1–3 of EE364B with S. Boyd on YouTube or iTunes University. They are about subgradient methods for unconstrained convex optimization problems.

You will see a variety of relevant material there. Also, Stephen Boyd has an interesting and fun teaching style.

The step size in the paper you mentioned is interesting. When I dealt with first-order primal-dual methods, I used ordinary gradient descent with backtracking.
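For what it is worth, here is a minimal sketch of plain gradient descent with an Armijo-style backtracking line search (my own illustration, assuming a differentiable objective `f` with gradient `grad_f`; the nonsmooth dual in dual decomposition does not satisfy this, so it is only for comparison):

```python
import numpy as np

def gradient_descent_backtracking(f, grad_f, x0, n_iters=100,
                                  alpha0=1.0, beta=0.5, c=1e-4):
    """Gradient descent with Armijo backtracking (sufficient decrease)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad_f(x)
        t = alpha0
        # Shrink the step until f decreases by at least c * t * ||g||^2.
        while f(x - t * g) > f(x) - c * t * np.dot(g, g):
            t *= beta
        x = x - t * g
    return x
```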

But in these videos you will observe that you "can do whatever you want" as long as you decrease a Lyapunov (or merit) function that measures or certifies your progress at each step (such as the distance to the optimal point, or the value of the function at the current point).
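One concrete version of such a progress certificate for subgradient methods (my own sketch, not taken verbatim from the lectures): single subgradient steps need not decrease the objective, so one monitors the running best value, which is also the quantity the standard convergence bounds refer to.

```python
import numpy as np

def subgradient_method(f, subgrad, x0, n_iters=1000):
    """Subgradient method that certifies progress via the best value so far."""
    x = np.asarray(x0, dtype=float)
    f_best, x_best = np.inf, x.copy()
    for t in range(n_iters):
        fx = f(x)
        if fx < f_best:                       # individual steps may increase f,
            f_best, x_best = fx, x.copy()     # so track the best iterate seen
        x = x - subgrad(x) / (t + 1)          # diminishing 1/(t+1) step
    return x_best, f_best
```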