
I am going through some slides online https://stanford.edu/class/ee364b/lectures/monotone_slides.pdf

I find it interesting that monotone operators are almost exclusively defined as relations $R$ where for any $x,y \in \mathbb{R}^n$

$$\langle u - v| x -y \rangle \geq 0, \text{ with } (x,u), (y,v) \in R$$

My concern is twofold: first, in introductory optimization you never encounter relations; second, it is obvious that many single-valued functions satisfy the above inequality. For example, take $f(x) = x$; then trivially

$$\langle f(x) - f(y)| x -y \rangle \geq 0$$


What is more surprising is that later in the notes they define the resolvent to be $$R_s = (I + \lambda F)^{-1}$$ An explicit example is $F = \partial f$, the subdifferential. But suppose $f$ is smooth and convex; then $\partial f = \{\nabla f\}$, and $\nabla f$ is a single-valued map. How was $\nabla f$ extended into a relation? How do I make sense of $(I + \lambda \nabla f)^{-1}$?

Can someone who has actually taken a course on this explain how to resolve this ambiguity?


Suppose a single-valued function $f:\mathbb{R}^n \to \mathbb{R}^n$ satisfies $$\langle f(x) - f(y)| x -y \rangle \geq 0$$ for all $x,y\in \mathbb{R}^n$. Is $f$ a monotone operator? If so, why define a monotone operator to be a relation? How do you rigorously turn the single-valued function $f$ into a relation / monotone operator, i.e. do you just squint and identify $f$ with its graph $\{(x,f(x))|x\in \mathbb{R}^n\}$, or is there more to it than this?

Any reference helps!

1 Answer


I prefer to define a monotone operator on $\mathbb R^n$ to be a set-valued function $T:\mathbb R^n \to S$, where $S$ is the collection of all subsets of $\mathbb R^n$, such that if $y_1 \in T(x_1)$ and $y_2 \in T(x_2)$ then $$ \langle y_1 - y_2, x_1 - x_2 \rangle \geq 0. $$ The graph of a set-valued function $T$ is $\{(x,y) \mid y \in T(x) \}$, which is a relation. Some authors define a monotone operator to be the graph itself (which is a relation), but to me it seems clearer to think of a monotone operator as a set-valued function. (Reason: the subdifferential $\partial f$ of a convex function $f$ is supposed to be the prototypical example of a monotone operator, and in my mind $\partial f$ is a set-valued function, not a relation.)

Note that Bauschke and Combettes, one of the standard references for monotone operator theory, define a monotone operator to be a function $A: \mathcal H \to 2^{\mathcal H}$ which satisfies the monotonicity property. (Here $\mathcal H$ is a Hilbert space.)

If $f$ is convex and differentiable on $\mathbb R^n$ then it can be shown that $\partial f(x) = \{\nabla f(x)\}$ for all $x \in \mathbb R^n$. So $\partial f(x)$ is a singleton for all $x$. It can also be shown (add the gradient inequality $f(x_2) \geq f(x_1) + \langle \nabla f(x_1), x_2 - x_1 \rangle$ to the same inequality with $x_1$ and $x_2$ swapped) that $$ \langle \nabla f(x_1) - \nabla f(x_2), x_1 - x_2 \rangle \geq 0 $$ for all $x_1,x_2 \in \mathbb R^n$. However, if we are being very precise, then it would not be technically correct to say that $\nabla f$ is a monotone operator, because $\nabla f$ is not a set-valued function. It is certainly true, though, that $\partial f$ is a monotone operator. And most people will be a little sloppy and say that $\nabla f$ is a monotone operator in this situation.
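If you want to see the gradient inequality numerically, here is a minimal sketch. The choice of $f(x) = \log \sum_i e^{x_i}$ (smooth and convex, with gradient equal to the softmax) and the helper names `grad_logsumexp` and `monotonicity_gap` are my own assumptions for illustration; they do not come from the slides. A random check like this is evidence, not a proof.

```python
import math
import random

def grad_logsumexp(x):
    """Gradient of the convex function f(x) = log(sum_i exp(x_i)), i.e. softmax(x)."""
    m = max(x)  # shift by the max for numerical stability
    e = [math.exp(xi - m) for xi in x]
    s = sum(e)
    return [ei / s for ei in e]

def inner(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def monotonicity_gap(grad, x1, x2):
    """Return <grad(x1) - grad(x2), x1 - x2>."""
    g1, g2 = grad(x1), grad(x2)
    dg = [a - b for a, b in zip(g1, g2)]
    dx = [a - b for a, b in zip(x1, x2)]
    return inner(dg, dx)

random.seed(0)
gaps = []
for _ in range(1000):
    x1 = [random.uniform(-5, 5) for _ in range(4)]
    x2 = [random.uniform(-5, 5) for _ in range(4)]
    gaps.append(monotonicity_gap(grad_logsumexp, x1, x2))

# The smallest gap should be nonnegative up to floating-point rounding.
min_gap = min(gaps)
print(min_gap >= -1e-12)
```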

You also mentioned the function $g:\mathbb R^n \to \mathbb R^n$ defined by $g(x) = x$. As you mentioned, $$ \langle g(x_1) - g(x_2), x_1 - x_2 \rangle \geq 0 $$ for all $x_1,x_2 \in \mathbb R^n$. Again, if we are being very precise, it would not be technically correct to say that $g$ is a monotone operator, because $g$ is not a set-valued function. However, the operator defined by $\tilde g(x) = \{ g(x) \}$ is certainly a monotone operator. Most people will just be a little sloppy and state that $g$ itself is a monotone operator.
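The passage from $g$ to $\tilde g$ really is just "wrap the value in a set". As a rough sketch in the scalar case $n = 1$ (the names `lift` and `is_monotone_on` are hypothetical helpers I made up, and checking finitely many sample points is only a sanity check):

```python
def lift(g):
    """Turn a single-valued map g into the set-valued map x -> {g(x)}."""
    return lambda x: {g(x)}

def is_monotone_on(T, points):
    """Check (y1 - y2) * (x1 - x2) >= 0 for all sampled x1, x2
    and all y1 in T(x1), y2 in T(x2). Scalar case only."""
    return all(
        (y1 - y2) * (x1 - x2) >= 0
        for x1 in points for x2 in points
        for y1 in T(x1) for y2 in T(x2)
    )

points = [-2.0, -0.5, 0.0, 1.0, 3.0]

g_tilde = lift(lambda x: x)    # the lifted identity map from the question
h_tilde = lift(lambda x: -x)   # a lifted map that is not monotone

print(is_monotone_on(g_tilde, points))
print(is_monotone_on(h_tilde, points))
```

Nothing deeper happens in the lifting step: the set-valued definition of monotonicity, applied to singletons, reduces to the inequality you wrote for single-valued functions.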

You also asked about the meaning of $(I + \lambda \nabla f)^{-1}$. It should really be $(I + \lambda \partial f)^{-1}$. Note that $I + \lambda \partial f$ is a set-valued function. And any set-valued function $T$ has an inverse $T^{-1}$ defined by $$ T^{-1}(y) = \{x \mid y \in T(x) \}. $$ So $T^{-1}$ is also a set-valued function. Thus, strictly speaking, $(I + \lambda \partial f)^{-1}$ is a set-valued function. But there is an important fact: if the set-valued operator $T$ is maximal monotone and $\lambda > 0$, then $(I + \lambda T)^{-1}(y)$ is a singleton for all $y$. For this reason, we can view $(I + \lambda T)^{-1}$ as a function that takes a vector as input and returns a single vector as output. This is a bit sloppy and not perfectly correct, but it's very common.
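To make the definition of $T^{-1}$ concrete, here is a toy sketch on a finite set, where a set-valued map is stored as a dictionary from points to sets. The representation and the helper name `inverse` are assumptions chosen for illustration; the example map is not monotone in any interesting sense, it only shows that the inverse always exists, even when $T$ is not injective or $T(x)$ is empty.

```python
def inverse(T):
    """Inverse of a set-valued map given as a dict x -> set of y.

    Implements T^{-1}(y) = {x : y in T(x)}; a y that lies in no T(x)
    simply gets no entry (its preimage is empty).
    """
    inv = {}
    for x, ys in T.items():
        for y in ys:
            inv.setdefault(y, set()).add(x)
    return inv

# T(0) = {0, 1}, T(1) = {1}, T(2) = {} (empty values are allowed).
T = {0: {0, 1}, 1: {1}, 2: set()}
T_inv = inverse(T)
print(T_inv)
```

Here $T^{-1}(1) = \{0, 1\}$ has two elements, so in general the inverse is genuinely set-valued.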

Moreover, if $f$ is a proper closed convex function, then $\partial f$ is a maximal monotone operator, so $(I + \lambda \partial f)^{-1}(y)$ is a singleton for all $y$, and so we can view $(I + \lambda \partial f)^{-1}$ as a function from $\mathbb R^n$ to $\mathbb R^n$. (It is not strictly correct, but it is harmless.)
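A standard example where $\partial f$ is genuinely set-valued but the resolvent is single-valued is $f(x) = |x|$ on $\mathbb R$: here $\partial f(0) = [-1, 1]$, and $(I + \lambda \partial f)^{-1}$ is the soft-thresholding map. The sketch below checks the defining inclusion $y \in x + \lambda \partial f(x)$ on a few points. The function names and the interval representation of $\partial f(x)$ are my own choices for illustration.

```python
def subdiff_abs(x):
    """Subdifferential of f(x) = |x|, returned as a closed interval (lo, hi)."""
    if x > 0:
        return (1.0, 1.0)
    if x < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)

def resolvent_abs(y, lam):
    """The unique x with y in x + lam * subdiff_abs(x): soft-thresholding."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0

def in_resolvent_relation(x, y, lam, tol=1e-12):
    """Check that (y - x) / lam lies in subdiff_abs(x), up to rounding."""
    lo, hi = subdiff_abs(x)
    g = (y - x) / lam
    return lo - tol <= g <= hi + tol

lam = 0.5
ys = [-2.0, -0.5, -0.1, 0.0, 0.3, 0.5, 1.7]
xs = [resolvent_abs(y, lam) for y in ys]
checks = [in_resolvent_relation(x, y, lam) for x, y in zip(xs, ys)]
print(all(checks))
```

Every $y$ in $[-\lambda, \lambda]$ is sent to $x = 0$, which is possible precisely because $\partial f(0)$ is a whole interval rather than a single point.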