3
$\begingroup$

Let $f\in C^1(\mathbb R^n\to \mathbb R)$ be a convex function. Suppose the equation

$f(x+\Delta x)-f(x)-\langle f'(x),\Delta x\rangle\leq A|\Delta x|^2$

holds for some constant $A>0$, any $x\in \mathbb R^n$ and any sufficiently small $\Delta x\in \mathbb R^n$. Is it true that the derivative $f'(x)$ must be a (locally) lipschitz function?

Only case $n>1$ is interesting (case $n=1$ can be easily verified).

2 Answers 2

3

Yes. Fix a point $p\in \mathbb R^n$. We want to shows that $|\nabla f(x)-\nabla f(p)|\le C|x-p|$ for some constant $C$, when $|x-p|$ is small. To simplify notation, replace $f$ with $f(x)-f(p)-\langle x,f'(p)\rangle$ -- this only subtracts a constant from the derivative and therefore does not change its continuity. Denoting this new function by $f$ again, we see that $f(p)=0$ and $\nabla f(p)=0$. Convexity implies $f\ge 0$.

Our assumption now says that $f(x)\le A|x-p|^2$ when $|x-p|\le r$. Take any point $q$ such that $|q-p|\le r/2$ and $\nabla f(q)\ne 0$. Let $v=\frac{\nabla f(q)}{|\nabla f(q)|}$. Define $\phi(t)=f(q+tv)$ for $t\in\mathbb R$. This is simply the restriction of $f$ to this line through $q$ in direction $v$. We have $0\le \phi(t)\le Ar^2$ when $|t|\le r/2$. This and the convexity of $\phi$ imply $|\phi'(0)|\le \frac{Ar^2}{r/2}=2Ar$. On the other hand, $\phi'(0)=|\nabla f(q)|$ by construction of $\phi$. Thus, $|\nabla f(q)|\le 2Ar$ whenever $|q-p|\le r/2$, which is the desired Lipschitz property.

  • 2
    Some things I don't like: 1) opening an "unanswered" questions bumped by Community only to find a satisfactory answer within; 2) finding that I posted the answer.2012-06-16
1

Starting from $f(y) - f(x) - \langle \nabla f(x), y-x\rangle \le \frac{A}{2}\lVert y -x \rVert^2$, we plug into $y \gets x + t(\nabla f(y) - \nabla f(x))$, obtaining that $f\left(x+t(\nabla f(y) - \nabla f(x))\right) - f(x) - t\langle \nabla f(x), \nabla f(y) - \nabla f(x)\rangle \le \frac{A}{2}t^2\lVert \nabla f(y) - \nabla f(x)\rVert^2.$ Since $f$ is convex, we have $f(x+t(\nabla f(y) - \nabla f(x))) \ge f(y) + \langle \nabla f(y), x-y+t(\nabla f(y) - \nabla f(x))\rangle$, thus $f(y) - f(x) +\langle \nabla f(y), x- y\rangle + t\langle \nabla f(y) - \nabla f(x), \nabla f(y) - \nabla f(x)\rangle \le \frac{A}{2}t^2\lVert \nabla f(y) - \nabla f(x)\rVert^2.$ After that, adding $f(x) - f(y) - \langle \nabla f(y), x-y\rangle \le \frac{A}{2}\lVert y -x \rVert^2$ to each side, we obtain that $t\lVert \nabla f(y) - \nabla f(x)\rVert^2 \le \frac{A}{2}\lVert y -x \rVert^2 + \frac{A}{2}t^2\lVert \nabla f(y) - \nabla f(x)\rVert^2.$ Finally plug in $t \gets 1/A$ to derive that $\lVert \nabla f(y) - \nabla f(x)\rVert \le A \lVert y -x \rVert.$