
This is taken from Nocedal & Wright's Numerical Optimization, 2nd edition, p. 279, Theorem 11.3:

Suppose $r:\mathbb{R}^n\rightarrow \mathbb{R}^n$ is continuously differentiable in a convex set $D\subset \mathbb{R}^n$. Let $x^*\in D$ be a nondegenerate solution of the equation $r(x)=0$. I understand that since $r(x^*)=0$:

$$r(x)=J(x)(x-x^*)+\int_0^1[J(x+t(x^*-x))-J(x)](x-x^*) \, dt$$
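(For reference, this identity is the fundamental theorem of calculus applied to $\phi(t)=r(x+t(x^*-x))$: since $\phi(1)=r(x^*)=0$ and $\phi'(t)=J(x+t(x^*-x))(x^*-x)$, integrating from $0$ to $1$ gives
$$r(x)=\int_0^1 J(x+t(x^*-x))(x-x^*)\,dt,$$
and adding and subtracting $J(x)(x-x^*)$ yields the displayed form.)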

The proof I'm following states that for some ball $B(x^*,\delta)$ centered at $x^*$, we can obtain the following bound:

$\|r(x)\|\le 2\|J(x^*)\|\cdot\|x-x^*\| + o(\|x-x^*\|)$.

I'm trying to derive this error estimate, but I'm struggling to arrive at these two terms.

1. I can easily see that

$$\left\|\int_0^1[J(x+t(x^*-x))-J(x)](x-x^*) \, dt\right\|\le \|x-x^*\|\int_0^1\|J(x+t(x^*-x))-J(x)\| \, dt$$

However, I can't quite see how it (immediately?) follows that this is $o(\|x-x^*\|)$.

2. If I apply the triangle inequality to the expression for $\|r(x)\|$, I obtain a term $\|J(x)(x-x^*)\|$, which I can bound as $\|J(x)(x-x^*)\|\le\|J(x)\|\cdot\|x-x^*\|$, not as $\|J(x^*)\|\cdot\|x-x^*\|$. In other words, I can't see how to bound this by the Jacobian evaluated at the root $x^*$.

Any help with this would be greatly appreciated! :)

1 Answer


You need to use continuity of $J$.

Since $r$ is $C^1$, its Jacobian $J$ is continuous. So pick $\epsilon>0$; then there is a $\delta>0$ such that if $\|x-x^*\| < \delta$, then $\|J(x)-J(x^*)\| < \epsilon$. Without loss of generality, we may assume that $\epsilon \leq \|J(x^*)\|$ (this norm is nonzero since $x^*$ is nondegenerate, and the choice will be convenient in a moment).

Now suppose $\|x-x^*\| < \delta$. If $t\in [0,1]$, then $\|x+t(x^*-x) -x^*\| = (1-t)\|x-x^*\| \leq \|x-x^*\| < \delta$, so both $x$ and $x+t(x^*-x)$ lie in $B(x^*,\delta)$, and by the triangle inequality $\|J(x+t(x^*-x))-J(x)\| \leq \|J(x+t(x^*-x))-J(x^*)\| + \|J(x^*)-J(x)\| < 2\epsilon$. Hence $$\|r(x)\| \leq \|J(x)\| \|x-x^*\| + \Big(\int_0^1 2\epsilon \, dt \Big) \|x-x^*\| = \|J(x)\| \|x-x^*\| + 2\epsilon \|x-x^*\|.$$

We also have $\|J(x)\| \leq \|J(x^*)\| + \|J(x)-J(x^*)\| \leq 2 \|J(x^*)\|$ (by the choice of $\epsilon$ above). Combining, we get $$ \|r(x)\| \leq 2 \|J(x^*)\| \|x-x^*\| + 2\epsilon \|x-x^*\|.$$ Since $\epsilon>0$ was arbitrary (well, at least arbitrarily small), the second term is $o(\|x-x^*\|)$ and we have the desired conclusion.
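A quick numerical sanity check can make the estimate concrete. The sketch below is my own illustration, not from the book or the post: the map `r`, its Jacobian `J`, and the root `x_star` are assumptions chosen for the demo. It checks that the remainder $\|r(x)-J(x)(x-x^*)\|$ is $o(\|x-x^*\|)$ while $\|r(x)\|$ stays below $2\|J(x^*)\|\,\|x-x^*\|$ plus that remainder.

```python
import numpy as np

# Toy example (assumed, purely illustrative): r(x) = (x0 + x1^2 - 1, x0*x1),
# which has the nondegenerate root x* = (1, 0) with J(x*) = I.

def r(x):
    return np.array([x[0] + x[1]**2 - 1.0, x[0] * x[1]])

def J(x):
    # Jacobian of r, computed by hand for this toy example.
    return np.array([[1.0,  2.0 * x[1]],
                     [x[1], x[0]]])

x_star = np.array([1.0, 0.0])
lead_const = 2.0 * np.linalg.norm(J(x_star), 2)  # 2 * ||J(x*)|| (spectral norm)

d = np.array([0.6, 0.8])  # fixed unit direction of approach

for h in [1e-1, 1e-2, 1e-3, 1e-4]:
    x = x_star + h * d
    dist = np.linalg.norm(x - x_star)
    # Remainder of the integral identity: r(x) - J(x)(x - x*).
    rem = np.linalg.norm(r(x) - J(x) @ (x - x_star))
    print(f"h={h:.0e}  ||r(x)||={np.linalg.norm(r(x)):.3e}  "
          f"2||J(x*)|| ||x-x*||={lead_const * dist:.3e}  "
          f"remainder/||x-x*||={rem / dist:.3e}")
```

The last column shrinks linearly in $h$ (since $r$ here is smooth, the $o(\|x-x^*\|)$ term is actually $O(\|x-x^*\|^2)$), while $\|r(x)\|$ stays within the bound.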

  • Hi copper.hat. In the second line of the second paragraph, you wrote $\|x+t(x^*-x) -x\|$, but I think you meant to write $\|x+t(x^*-x) -x^*\|$. Otherwise, $\|x+t(x^*-x) -x\|=t\|x^*-x\|$. Do you agree? (2012-08-31)
  • Yes indeed, good catch; I have fixed it. (2012-08-31)
  • I think I understand it now... The main point of the first line of the second paragraph is to ensure that $x+t(x^*-x)$ stays in the ball, so that continuity lets us bound $\|J(x+t(x^*-x))-J(x)\|$ by $2\epsilon$. It's a subtle detail that I just couldn't see at first. (2012-08-31)
  • Your answer makes a lot of sense now! Just one little question: I'm not quite sure I understand how we can assume that $\epsilon \le \|J(x^*)\|$ "without loss of generality". Could you elaborate on that? What allows us to make such a claim? (2012-08-31)
  • We have $J(x^*) \neq 0$, hence $\|J(x^*)\| > 0$. By continuity of $J$, for **any** $\epsilon>0$ there exists a $\delta>0$ such that the implication above holds. So, in this case we pick any $\epsilon \in (0, \|J(x^*)\|]$. (2012-08-31)
  • We could also have said: pick a $\delta_1>0$ such that if $\|x-x^*\| < \delta_1$, then $\|J(x)\| \leq 2 \|J(x^*)\|$ (remember, $\|J(x^*)\|$ is just a fixed positive number here). Then to continue the proof, we would require $\|x-x^*\| < \min(\delta, \delta_1)$. (2012-08-31)
  • You are very welcome. (2012-08-31)