4

Can the non-linear conjugate gradient optimization method with the Polak-Ribière update formula be considered a quasi-Newton optimization technique?

If not, why?

2 Answers

3

Nonlinear conjugate gradient methods are equivalent to memoryless BFGS quasi-Newton methods under the assumption that you perform an exact line search. This was stated in a 1978 paper by Shanno (he's the S in BFGS). Here's the relevant paragraph:

[Image: the relevant paragraph quoted from Shanno (1978); see the linked paper below.]

See http://dx.doi.org/10.1287/moor.3.3.244 for the whole paper.
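
As a minimal sketch of why this holds (standard notation, not necessarily Shanno's): memoryless BFGS resets the inverse Hessian approximation to the identity at every iteration and applies a single BFGS update with $s_k = x_{k+1} - x_k$ and $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$. Writing $g_{k+1} = \nabla f(x_{k+1})$, the resulting search direction works out to

$$ d_{k+1} = -g_{k+1} + \frac{y_k^T g_{k+1}}{y_k^T s_k}\, s_k + \big(\text{terms carrying a factor } s_k^T g_{k+1}\big). $$

An exact line search gives $s_k^T g_{k+1} = 0$, so the extra terms vanish; since $s_k$ is a multiple of $d_k$, this is the Hestenes-Stiefel conjugate gradient direction, which in turn coincides with the Polak-Ribière direction when the line searches are exact.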

  • 0
    How can I talk of approximating the Hessian if I restart the approximation at every step? I would rather interpret the cited passage as a suggestion that BFGS might perform better than nonlinear conjugate gradient methods (ignoring memory consumption), because it's a true quasi-Newton method. 2014-05-07
  • 0
    You have an approximate Hessian at every step. The quoted passage doesn't tell the whole story. The paper establishes that memoryless BFGS = nonlinear conjugate gradient. 2014-05-07
0

A Newton-based optimization method tries to find the roots of the gradient by Newton's method. This involves evaluating the Hessian and solving linear systems involving the Hessian. A quasi-Newton optimization technique uses some approximation of the Hessian, or at least can be interpreted as implicitly using such an approximation.
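
In symbols (a standard textbook formulation, not tied to any particular implementation): the Newton step $p_k$ solves the linear system

$$ \nabla^2 f(x_k)\, p_k = -\nabla f(x_k), $$

whereas a quasi-Newton method replaces $\nabla^2 f(x_k)$ by an approximation $B_k$ built from gradient information only, typically required to satisfy the secant equation

$$ B_{k+1} s_k = y_k, \qquad s_k = x_{k+1} - x_k, \quad y_k = \nabla f(x_{k+1}) - \nabla f(x_k). $$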

The non-linear conjugate gradient optimization method is not a quasi-Newton technique, because there is no Hessian involved, not even implicitly.

  • 0
    Thank you, this answer is what I was looking for. 2012-06-14
  • 0
    Sorry but this answer is erroneous. 2014-05-07
  • 0
    @Dominique Interesting. Can you also tell me which part is erroneous? My guess is that you disagree with: "A quasi-Newton optimization technique uses some approximation of the Hessian, or at least can be interpreted as implicitly using such an approximation." Or do you disagree with my last "... not even implicitly"? In that case, could you please help me to also recognize the implicit Hessian you have in mind? 2014-05-07
  • 1
    @ThomasKlimpel It's the last paragraph I don't agree with. As the paper cited in the other answer shows, limited-memory BFGS methods coincide with nonlinear CG provided you perform exact line searches. L-BFGS has a memory parameter. If you set it to zero, you obtain a specific update of the Hessian approximation. 2014-05-07
  • 0
    I just hit this page again. For the record, let me also say that the statement "A Newton-based optimization method tries to find the roots of the gradient by Newton's method" only holds for convex problems. For nonconvex problems, optimization and nonlinear equations are two different things. 2016-03-18
  • 0
    @Dominique Very enlightening to know this about nonlinear CG. Can you elaborate on your comment about Newton's method for nonconvex problems? Even for nonconvex problems, Newton's method for optimization can be interpreted as solving $\nabla f(x) = 0$ using Newton's method for nonlinear equations. 2017-04-14
  • 0
    @littleO What I mean is that when you apply Newton's method to nonconvex optimization (and in fact also to convex optimization), you also care about decreasing the objective. So you supplement Newton's method with a line search that computes a step size $t$ so that $f(x_k + t d_k)$ is sufficiently smaller than $f(x_k)$; one standard way to write this condition is sketched below the comments. 2017-04-14
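
For concreteness, a standard way to make "sufficiently smaller" precise (the usual Armijo, or sufficient-decrease, condition; not something specific to the discussion above) is to accept a step size $t > 0$ satisfying

$$ f(x_k + t\, d_k) \le f(x_k) + c\, t\, \nabla f(x_k)^T d_k, \qquad 0 < c < 1, $$

where $d_k$ is a descent direction, so that the step reduces the objective by at least a fixed fraction of the decrease predicted by the first-order model.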