3

Let $f:\mathbb{R}^n\rightarrow\mathbb{R}$ be convex with $\nabla f$ Lipschitz, i.e., $\|\nabla f(x)-\nabla f(y)\|\le L\|x-y\|$ for all $x,y$. Show that $\|\nabla f(x)-\nabla f(y)\|^2\le L[\nabla f(x)-\nabla f(y)]\cdot(x-y)$.

If $n=1$ it is easy, since I can just use the definition of the absolute value, so I tried something similar for $n>1$, using the equality $$[\nabla f(x)-\nabla f(y)]\cdot(x-y)=\|\nabla f(x)-\nabla f(y)\|\|x-y\|\cos\theta$$

Then, since $f$ is convex and differentiable I know that $f(y)\ge f(x)+\nabla f(x)\cdot(y-x)$ and $[\nabla f(x)-\nabla f(y)]\cdot (x-y)\ge0$.

But I end up with $$L\cos\theta\ge\frac{\|\nabla f(x)-\nabla f(y)\|}{\|x-y\|}$$ which I don't know to be true.
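As a numerical sanity check (not a proof), one can test the claimed inequality on a convex quadratic $f(x)=\frac{1}{2}x\cdot Ax$ with $A$ symmetric positive semidefinite, for which $\nabla f(x)=Ax$ and the best Lipschitz constant is the largest eigenvalue of $A$. The particular matrix and sampling ranges below are arbitrary choices:

```python
import random

# Sanity check (not a proof) of
#   ||grad f(x) - grad f(y)||^2 <= L * [grad f(x) - grad f(y)] . (x - y)
# for the convex quadratic f(x) = (1/2) x^T A x, where grad f(x) = A x
# and L is the largest eigenvalue of A.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

# A symmetric PSD 2x2 matrix with eigenvalues 1 and 3, so L = 3.
A = [[2.0, 1.0],
     [1.0, 2.0]]
L = 3.0

def grad_f(x):
    return matvec(A, x)

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-5, 5), random.uniform(-5, 5)]
    y = [random.uniform(-5, 5), random.uniform(-5, 5)]
    d = [a - b for a, b in zip(x, y)]                   # x - y
    g = [a - b for a, b in zip(grad_f(x), grad_f(y))]   # gradient difference
    lhs = dot(g, g)       # ||grad f(x) - grad f(y)||^2
    rhs = L * dot(g, d)   # L * (gradient difference) . (x - y)
    assert lhs <= rhs + 1e-6
print("inequality holds on all sampled pairs")
```

Of course this only probes one family of examples; the point of the question is the general statement.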

  • 0
Use $x_0=x, x_1, \ldots, x_n=y$ where each $x_k-x_{k-1}$ is a multiple of the $k$th standard basis vector. And use that $g$ is increasing. (2017-02-27)
  • 0
I've been trying that for a while, but with no results. (2017-02-28)

2 Answers

2

I am sorry, but I don't understand whether KeD's answer is supposed to prove or disprove the inequality. The result is true, but the proof is not easy. The one I know uses the Fenchel-Young inequality, which says that $$ f(x)+f^{\ast}(y)\geq x\cdot y $$ for all $x,y\in\mathbb{R}^{N}$, where $f^{\ast}$ is the conjugate of $f$, that is, $f^{\ast}(y):=\sup\{z\cdot y-f(z):\,z\in\mathbb{R}^{N}\}$. Equality holds in the Fenchel-Young inequality if and only if $y\in\partial f(x)$, where $\partial f(x)$ is the subdifferential. Since $f$ is differentiable, $\partial f(x)$ is the singleton $\{\nabla f(x)\}$, so equality holds if and only if $y=\nabla f(x)$, that is, $f(x)+f^{\ast}(\nabla f(x))=x\cdot\nabla f(x)$.

Note that if $g(y)=\frac{1}{2}L\Vert y\Vert^{2}$, then $g^{\ast}(z)=\sup_{y\in\mathbb{R}^{N}}\{z\cdot y-\frac{1}{2}L\Vert y\Vert^{2}\}=\frac{1}{2L}\Vert z\Vert^{2}$.
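This conjugate formula can be verified numerically (an illustration, not part of the proof): the concave objective $y\mapsto z\cdot y-\frac{1}{2}L\Vert y\Vert^{2}$ is maximized at $y=z/L$, where it takes the value $\frac{1}{2L}\Vert z\Vert^{2}$. The vector $z$ and the value $L=3$ below are arbitrary:

```python
import random

# Check numerically that for g(y) = (L/2)||y||^2 the conjugate
#   g*(z) = sup_y { z.y - g(y) }
# equals ||z||^2 / (2L), with the sup attained at y = z/L.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

L = 3.0
z = [1.5, -2.0, 0.5]

def objective(y):
    return dot(z, y) - 0.5 * L * dot(y, y)

closed_form = dot(z, z) / (2.0 * L)   # ||z||^2 / (2L)
maximizer = [zi / L for zi in z]      # y = z/L

# The maximizer attains the closed-form value...
assert abs(objective(maximizer) - closed_form) < 1e-12

# ...and random y's never beat it.
random.seed(1)
for _ in range(1000):
    y = [random.uniform(-5, 5) for _ in z]
    assert objective(y) <= closed_form + 1e-9
print("g*(z) = ||z||^2 / (2L) confirmed at sampled points")
```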

By the fundamental theorem of calculus, applied to the function $t\mapsto f(x+t(y-x))$, we have \begin{align*} f(y)-f(x)-\nabla f(x)\cdot(y-x) & =\int_{0}^{1}(\nabla f(x+t(y-x))-\nabla f(x))\cdot(y-x)\,dt\\ & \leq L\Vert y-x\Vert^{2}\int_{0}^{1}t\,dt=\frac{1}{2}L\Vert y-x\Vert^{2}. \end{align*} Hence, by the equality case in the Fenchel-Young inequality, \begin{align*} -f(y) & \geq-f(x)+\nabla f(x)\cdot x-\nabla f(x)\cdot y-\frac{1}{2}L\Vert y-x\Vert^{2}\\ & =f^{\ast}(\nabla f(x))-\nabla f(x)\cdot y-\frac{1}{2}L\Vert y-x\Vert^{2}. \end{align*} Using the definition of $f^{\ast}$ we have \begin{align*} f^{\ast}(z) & \geq z\cdot y-f(y)\\ & \geq z\cdot y+f^{\ast}(\nabla f(x))-\nabla f(x)\cdot y-\frac{1}{2}L\Vert y-x\Vert^{2}\\ & =f^{\ast}(\nabla f(x))+(z-\nabla f(x))\cdot x+(z-\nabla f(x))\cdot (y-x)-\frac{1}{2}L\Vert y-x\Vert^{2}. \end{align*} Since this holds for every $y$, we get \begin{align*} f^{\ast}(z) & \geq f^{\ast}(\nabla f(x))+(z-\nabla f(x))\cdot x+\sup_{y\in\mathbb{R}^{N}}\{(z-\nabla f(x))\cdot(y-x)-\frac{1}{2}L\Vert y-x\Vert^{2}\}\\ & =f^{\ast}(\nabla f(x))+(z-\nabla f(x))\cdot x+g^{\ast}(z-\nabla f(x))\\ & =f^{\ast}(\nabla f(x))+(z-\nabla f(x))\cdot x+\frac{1}{2L}\Vert z-\nabla f(x)\Vert^{2}. \end{align*} In particular, taking $z=\nabla f(y)$ we get $$ f^{\ast}(\nabla f(y))\geq f^{\ast}(\nabla f(x))+(\nabla f(y)-\nabla f(x))\cdot x+\frac{1}{2L}\Vert\nabla f(y)-\nabla f(x)\Vert^{2}. $$ By interchanging $x$ and $y$ we get $$ f^{\ast}(\nabla f(x))\geq f^{\ast}(\nabla f(y))+(\nabla f(x)-\nabla f(y))\cdot y+\frac{1}{2L}\Vert\nabla f(y)-\nabla f(x)\Vert^{2}. $$ Adding these two inequalities gives $$ 0\geq(\nabla f(x)-\nabla f(y))\cdot(y-x)+\frac{1}{L}\Vert\nabla f(y)-\nabla f(x)\Vert^{2}, $$ that is, $$ (\nabla f(x)-\nabla f(y))\cdot(x-y)\geq\frac{1}{L}\Vert\nabla f(y)-\nabla f(x)\Vert^{2}. $$
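As an illustration (not part of the proof), the final inequality can be checked numerically on a genuinely non-quadratic convex function. A standard choice is the log-sum-exp function $f(x)=\log\sum_i e^{x_i}$, whose gradient is the softmax map; its Hessian $\operatorname{diag}(p)-pp^{T}$ has operator norm at most $1$, so $L=1$ is a valid Lipschitz constant:

```python
import math
import random

# Numerical check of the co-coercivity inequality
#   (grad f(x) - grad f(y)) . (x - y) >= (1/L) ||grad f(x) - grad f(y)||^2
# for f(x) = log(sum_i exp(x_i)), whose gradient is the softmax map
# and whose Hessian has operator norm at most 1, so L = 1.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def grad_f(x):  # softmax, computed stably
    m = max(x)
    e = [math.exp(xi - m) for xi in x]
    s = sum(e)
    return [ei / s for ei in e]

L = 1.0
random.seed(2)
for _ in range(1000):
    x = [random.uniform(-3, 3) for _ in range(4)]
    y = [random.uniform(-3, 3) for _ in range(4)]
    g = [a - b for a, b in zip(grad_f(x), grad_f(y))]
    d = [a - b for a, b in zip(x, y)]
    assert dot(g, d) >= dot(g, g) / L - 1e-9
print("co-coercivity verified on all sampled pairs")
```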

4

We claim that the assertion above is false. Let $n = 3$, and fix a vector $v$ in $\mathbb{R}^3$. Let $$g(x) = x \times v,$$ where $\times$ denotes the cross product. This $g$ satisfies your hypotheses. (For the "increasing" property, $$[g(x) - g(y)] \cdot (x - y) = 0$$ always.) But this $g$ doesn't satisfy your conclusion, for the left side can be positive while the right side is always zero.
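This counterexample is easy to check numerically (the choice $v = e_3$ below is arbitrary): the pairing $[g(x)-g(y)]\cdot(x-y)$ vanishes identically because $((x-y)\times v)\cdot(x-y)=0$, while $\Vert g(x)-g(y)\Vert^2$ can be strictly positive:

```python
import random

# The map g(x) = x × v in R^3: the monotonicity pairing
# [g(x) - g(y)] . (x - y) is identically zero, while
# ||g(x) - g(y)||^2 can be strictly positive, so the claimed
# inequality fails for this g (which is not a gradient).

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

v = [0.0, 0.0, 1.0]

def g(x):
    return cross(x, v)

random.seed(3)
for _ in range(100):
    x = [random.uniform(-2, 2) for _ in range(3)]
    y = [random.uniform(-2, 2) for _ in range(3)]
    gd = [a - b for a, b in zip(g(x), g(y))]
    d = [a - b for a, b in zip(x, y)]
    assert abs(dot(gd, d)) < 1e-12   # the pairing always vanishes

# ...but the left-hand side of the claimed inequality need not vanish:
x, y = [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]
gd = [a - b for a, b in zip(g(x), g(y))]
assert dot(gd, gd) > 0
print("counterexample confirmed")
```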

When I first saw this, my reaction was "This is false!". Permit me to say a word about why. If it were true, then whenever $$[g(x) - g(y)] \cdot (x - y)$$ vanishes (i.e. whenever the "increasing" condition is saturated), we would need $$g(x) = g(y).$$ But a run-of-the-mill $g$ isn't going to do this for you. (It was forced to satisfy the increasing condition, but nobody made it do so enthusiastically.) Indeed, the increasing condition doesn't really have the flavor of "increasing".

I might also add another, slightly suspicious, thing about the conjecture, i.e. what you wanted to show. (There is nothing wrong with it, but it has a slightly funny smell.) The increasing condition requires that there be given a vector space isomorphism between the vector space in which the $x$'s lie and that in which the $g(x)$'s lie. We normally wouldn't expect to have such a thing. (For example, we might expect a generalization in which these two vector spaces have different dimensions. But, given the formulation of the increasing condition, there is none.)

I was trying to solve another problem and that condition was enough to finish it, but I omitted one condition that I didn't think was needed. It was that $g(x)= \nabla f(x)$, $g$ is the derivative of $f$; that's why $x \times v$ doesn't satisfy the condition, it's not the gradient of a function.

This is a bit of a guess, but let me try out an assertion on you.

Let $f$ be a smooth function on $\mathbb{R}^n$, such that, for every pair of points, $x$ and $y$,$$[\nabla f(x) - \nabla f(y)] \cdot (x - y) \ge 0.\tag*{$(*)$}$$Then $f$ is a constant.

Here is the "proof" (not really a proof, but I think the details are fillable). Fix any curve $\gamma$ in $\mathbb{R}^n$. Then, for $x$ and $y$ "close" and on $\gamma$, $(*)$ says that $f$ is nondecreasing along $\gamma$. Now choose for $\gamma$ a closed curve. We conclude that $f$ is constant on this $\gamma$; and, since $\gamma$ can be any closed curve, that $f$ is constant everywhere.

(To make this into a real proof, you can either use the strategy suggested above, or consider the behavior of $f$ around various rectangles in $\mathbb{R}^n$.)

So, I am agreeing with your assertion.

The assertion you stated can't be true, because the condition $(*)$ is implied by convexity of $f$. Indeed, if $f$ is differentiable, then $f$ is convex if and only if $$f(x + y) \ge f(x) + \nabla f(x) \cdot y \quad\text{for all } x, y,$$ and this implies $[\nabla f(x) - \nabla f(y)] \cdot (x - y) \ge 0$. I don't know whether the converse of the latter implication holds, so maybe the convexity of $f$ is needed as a separate hypothesis.

Indeed, the assertion I gave you is false. A counterexample is $$f(x) = x \cdot x.$$ So, let me try again. Let the two points, $x$ and $y$, be close together. Let $A$ be the symmetric $n \times n$ matrix $\nabla^2 f$. Then the Lipschitz condition implies that $A$ is bounded by $L$, i.e.$$|Av| \le L|v|$$for every vector $v$. The increasing condition implies that $A$ is positive semidefinite, i.e.$$Av \cdot v \ge 0$$for every vector $v$.

Now let $x$ and $y$ not be close together. Then $\nabla f(x) - \nabla f(y)$ is given by the integral of $A\,\text{d}s$ along the straight line segment from $x$ to $y$. Therefore,$$|\nabla f(x) - \nabla f(y)| \le \int |A\,\text{d}s| \le L\int |\text{d}s| = L|x - y|.$$That is, you get what you wanted to show.
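This line-integral identity can be seen in action numerically (an illustration, using log-sum-exp as a convenient smooth convex function whose Hessian $\operatorname{diag}(p)-pp^{T}$ has operator norm at most $1$, so $L=1$; the points $x,y$ below are arbitrary):

```python
import math

# For f(x) = log(sum_i exp(x_i)): the gradient difference equals the
# integral of the Hessian applied to (y - x) along the segment from x
# to y, and its norm is at most L ||x - y|| with L = 1.

def grad_f(x):  # softmax
    m = max(x)
    e = [math.exp(xi - m) for xi in x]
    s = sum(e)
    return [ei / s for ei in e]

def hess_vec(x, v):  # (diag(p) - p p^T) v, with p = softmax(x)
    p = grad_f(x)
    pv = sum(pi * vi for pi, vi in zip(p, v))
    return [pi * (vi - pv) for pi, vi in zip(p, v)]

def norm(u):
    return math.sqrt(sum(ui * ui for ui in u))

L = 1.0
x = [1.0, -0.5, 2.0]
y = [-1.0, 0.5, 0.0]
d = [yi - xi for xi, yi in zip(x, y)]

# Midpoint Riemann sum for  integral_0^1  Hess f(x + t d) d  dt
N = 10000
acc = [0.0] * len(x)
for k in range(N):
    t = (k + 0.5) / N
    pt = [xi + t * di for xi, di in zip(x, d)]
    hv = hess_vec(pt, d)
    acc = [a + h / N for a, h in zip(acc, hv)]

gdiff = [a - b for a, b in zip(grad_f(y), grad_f(x))]
assert all(abs(a - b) < 1e-6 for a, b in zip(acc, gdiff))  # FTC for gradients
assert norm(gdiff) <= L * norm(d) + 1e-12                  # Lipschitz bound
print("gradient difference matches the Hessian line integral")
```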

I hope this is clear. (I've left off the indices$\ldots$) I haven't checked (though I guess it is probably true) that these conditions on $A$ imply the stated conditions on $f$. Also, this seems a little strange (therefore suspicious), because I don't see where we used the increasing condition here.

  • 0
@GuadalupeAnimation I have updated my answer. (2017-03-03)
  • 0
That's true, but I'm curious about the proof without using that $f$ is twice differentiable. (2017-03-03)