I'm trying to understand how Newton's method in optimization works.

This is the algorithm:

$S_0)$ Choose $x_{0}\in \mathbb{R}^{n},\rho>0,\ p>2,\ \beta\in(0,1), \displaystyle \sigma\in\left(0,\frac{1}{2}\right),\ \varepsilon\geq 0$, set $k =0$

$S_1)$ If $\Vert\nabla f(x_{k})\Vert\leq\varepsilon$, STOP

$S_2)$ Determine $d_{k}$ from $D^{2}f(x_{k})d_{k}=-\nabla f(x_{k})$. If $d_{k}$ cannot be computed, or the condition

$\nabla f(x_{k})^{T}d_{k}\leq-\rho\Vert d_{k}\Vert^{p}$

is violated, set $d_{k}=-\nabla f(x_{k})$

$S_3)$ Determine $t_{k}\in\{\beta^{j}:j=0,1,2,\ldots\}$ maximal with

$f(x_{k}+t_{k}d_{k})\leq f(x_{k})+\sigma t_{k}\nabla f(x_{k})^{T}d_{k}$

$S_4)$ Set $x_{k+1}=x_{k}+t_{k}d_{k},\ k:=k +1$, go to $S_1)$.
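
To make the steps concrete, here is a minimal Python sketch of $S_0)$ through $S_4)$. The parameter defaults, the `max_iter` safeguard, and the Rosenbrock test function are my own additions for illustration, not part of the algorithm as stated:

```python
import numpy as np

def globalized_newton(f, grad, hess, x0, rho=1e-8, p=2.1,
                      beta=0.5, sigma=1e-4, eps=1e-8, max_iter=500):
    """Newton's method with a steepest-descent fallback and Armijo
    backtracking, following steps S0-S4 above (a sketch)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        # S1: stop once the gradient is small enough
        if np.linalg.norm(g) <= eps:
            break
        # S2: try the Newton direction D^2 f(x_k) d_k = -grad f(x_k)
        try:
            d = np.linalg.solve(hess(x), -g)
            # fall back to -grad f if the descent condition is violated
            if g @ d > -rho * np.linalg.norm(d) ** p:
                d = -g
        except np.linalg.LinAlgError:
            d = -g  # Hessian singular: d_k cannot be computed
        # S3: largest t_k = beta^j satisfying the Armijo condition
        t = 1.0
        while f(x + t * d) > f(x) + sigma * t * (g @ d):
            t *= beta
        # S4: update the iterate
        x = x + t * d
    return x

# Example: the Rosenbrock function, started at (-1.2, 1).
f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                           200 * (x[1] - x[0]**2)])
hess = lambda x: np.array([[2 - 400 * x[1] + 1200 * x[0]**2, -400 * x[0]],
                           [-400 * x[0], 200.0]])
print(globalized_newton(f, grad, hess, [-1.2, 1.0]))  # -> approx [1. 1.]
```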

My question:

As I understand it, the algorithm presented above generates a sequence $(x_k)$ which converges to a strict local minimum of the twice differentiable function $f$. Why is this algorithm so interesting, if it can't even find the global minimum of the given function?

Thank you very much for your time!

1 Answer

It's not easy to find the global minimum of a general function. Many functions have lots of local minima, and in principle an algorithm would need to check all of them to find the global one. This is, for example, a central problem in neural network learning algorithms. I'm no expert on this, but I would guess the algorithm is interesting because it is simple, it works for a relatively general class of functions, and it is fast.
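
To see this concretely, here is a small hypothetical example (my own, not from the original exchange) using SciPy's Newton-CG, a Newton-type method: a quartic with one global and one local minimum, where the answer you get depends entirely on the starting point:

```python
import numpy as np
from scipy.optimize import minimize

# f(x) = x^4 - 3x^2 + x has a global minimum near x = -1.30 and a
# shallower local minimum near x = 1.13 (roots of 4x^3 - 6x + 1 = 0).
f = lambda x: x[0]**4 - 3 * x[0]**2 + x[0]
grad = lambda x: np.array([4 * x[0]**3 - 6 * x[0] + 1])
hess = lambda x: np.array([[12 * x[0]**2 - 6]])

for x0 in (-2.0, 2.0):
    res = minimize(f, np.array([x0]), jac=grad, hess=hess,
                   method="Newton-CG")
    print(f"start {x0:+.1f} -> x* = {res.x[0]:+.4f}, f(x*) = {res.fun:+.4f}")
# Started at -2 the method finds the global minimum; started at +2 it
# settles into the local minimum and never "sees" the better point.
```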

  • 1
    @Chris "How do we know that we have determined all local minima?" - you don't. Functions will lie through their teeth if you give them half a chance. Without some way of ensuring 'honesty' - e.g. bounds on function values or derivatives - you literally cannot control the behavior of the function anywhere that you haven't explicitly looked at it. This is a big part of the reason why global results are so hard. (2012-07-18)