1
$\begingroup$

Reading page 4-9 here, I cannot understand the minus and the direction of the inequality. $f$ is convex.

  1. Why it is required that $\nabla f_{0}(x)^{T}(y-x) \geq 0$, why not $\nabla f_{0}(x)^{T}(y-x) \leq 0$?
  2. Could I write the initial in the form of $$-\nabla f_{0}(x)^{T}(y-x) \leq 0$$ yes of course, thinking.

  3. Now I I still need some way to understand the first minus in the picture, thinking...

  4. Is the minus before the nabla because the normal vector of the hyperplane and its tangent must be perpendicular but if so $\left(-\nabla f_{0}(x)\right) \left(\nabla f_{0}(x)^{T}\right) = -1$? Like with 2D case?

The screenshot I am trying to understand

enter image description here

  • 0
    In the effort to make the question self-contained, could you add the statement of convexity of $f$ which is stated in the slides you point to ?2011-10-27
  • 0
    @Sasha: done, done.2011-10-27
  • 0
    ...btw this paper is part of the course [here](http://www.stanford.edu/class/ee364a/lectures.html), there are videos and such things.2011-11-17

1 Answers 1

5

The function $f_0$ is the function that is to be minimized (the objective function). All of the other $f_i$ are constraint functions. That's what makes $f_0$ special.

The point $x$ will be a minimizer if moving in the direction of any other point $y$ in the convex feasible region causes $f_0$ to increase or stay the same. In other words, the directional derivative of $f_0$ at $x$ in the direction of $y$ must be nonnegative. Mathematically, this is $\nabla f_0(x)^T \frac{y - x}{||y-x||} \geq 0$, which can be more simply expressed as $\nabla f_0(x)^T (y - x) \geq 0$ (since $||y-x|| > 0$).

Also, given the usual geometric interpretation of the dot product, the requirement $\nabla f_0(x)^T (y - x) \geq 0$ is equivalent to requiring that the cosine of the angle between $\nabla f_0(x)$ and $y-x$ be nonnegative. So the angle between $\nabla f_0(x)$ and $y-x$ must be between $-\pi/2$ and $\pi/2$, which means that the angle between $-\nabla f_0(x)$ and $y-x$ must be between $\pi/2$ and $3\pi/2$. Hence the requirement $\nabla f_0(x)^T (y - x) \geq 0$ is equivalent to the existence of a supporting hyperplane for $S$ at $x$ such that $-\nabla f_0(x)$ is perpendicular to the hyperplane and points in the opposite direction from $S$.

And hopefully that last paragraph explains the reason for the "$-$" sign in the picture, too.


Added, in response to the new (enumerated) questions:

  1. This is what the second paragraph in my original answer is addressing.
  2. Yes. Just multiply the original equation by $-1$.
  3. I think the third paragraph in my original answer addresses this.
  4. No. First, $\nabla f_{0}(x)^T$ is not the tangent vector to the hyperplane; it's the transpose of the vector $\nabla f_{0}(x)$. (For example, $\begin{bmatrix} 1 \\ 2 \end{bmatrix}^T = \begin{bmatrix} 1 & 2 \end{bmatrix}$.) Second, if you take two orthogonal vectors $z$ and $w$ and calculate the dot product $z^T w$ you get $z^T w = 0$, not $-1$. (Maybe you're thinking of the fact that in 2D the slopes of perpendicular lines multiply to give $-1$? The operation $z^T w$ is not multiplying slopes; it's calculating the dot product.)


Added, in response to the comments below:

  • The vector $y$ is not an arbitrary vector; it's a point in the set $S$. So $y-x$ gives the vector from the point $x$ to the point $y$.
  • We're minimizing $f_0$ because 1) that's the only way the picture makes sense, and 2) it says so on slide 4-6 in the link you give.
  • "Minimizing" here means that we're looking for the smallest value of the function $f_0(y)$ over all points $y$ in the set $S$. For example, if we were trying to find the smallest value of $f_0(y_1, y_2) = y_1^2 + y_2^2$ over the region $S$, where $S$ is the set $\{y_1,y_2\}$ satisfying $y_1^2 + y_2^2 \leq 4$, the minimum would occur at $(y_1,y_2) = (0,0)$. We're not trying to minimize the distance between $x$ and $y$. We're trying to minimize the function $f_0(y)$.
  • The text of Bazarra, et al, is talking about something different. (And maybe this, actually, is the primary source of your confusion?) They're discussing the problem of minimizing the distance between a point and a convex set. The slide from Boyd and Vandenberghe is discussing the problem of minimizing the value of a function $f_0$ over a convex set. So, if your primary goal here is to understand the argument in the Bazarra text better, don't look at the picture you posted. It's addressing a different problem. And if your primary goal here is to understand the picture you posted, don't refer to p. 50 in the Bazarra text. They're talking about something different. The picture on p. 129 of the Bazarra text (a different chapter!) is the one that corresponds to the situation in the picture you posted.
  • 0
    ...now I am getting this perhaps, $f_{0}(x)$ is the function to be maximized with the value x. We have basically here a form $\Delta x$ which we want to make as large as possible until you cannot make it, hence the function has the optimal value with respect to some $\geq_{K}$ where $K$ is some cone. Def. page 2-17 for general inequality, $x \leq_{K} y \Leftrightarrow y-x \in K, x <_{K} Y \Leftrightarrow y-x \in int (K)$.2011-10-27
  • 2
    In the example $f_0$ is being minimized. Otherwise, the signs would be flipped.2011-10-27
  • 0
    Does this answer your (now) enumerated questions? If not, let me know, and I'll update it when I have the time. (Unfortunately, I don't have that time right now.)2011-10-27
  • 0
    ...small thing, I am wondering the vector $y$ (can it be arbitrary vector?) and the point 4, relationship to 2D case. We know that the multiplication of tangents (and inverse of the other) must be -1 if they are perpendicular (at least in 2D, is it here so?)2011-10-27
  • 0
    How did you see that is minimized? Is $x$ now the minimal value or the maximal value in the photo? I find it hard to say anything about maximum and minimum here, I can only look it with boundary point-finding or maximal/minimal-finding. What does the "to be minimized" really mean?2011-10-27
  • 0
    My book states $\bar{x}$ is the minimizing point if and only if $(y -\bar{x})^{t}(x-\bar{x})\leq 0$ for all $x \in S$ (p.50, Non-linear programming Theory and Algorithms, Bazaraa). $x \in S$ and $y \not \in S$, this is apparently same here so got the $y$? Disjoint? In the example of the question, $y \not \in X$?2011-10-27
  • 0
    ...still wondering what the terms "minimizing" and "maximizing" here means, wait -- maximizing means to maximize the distance from the vector $y \not \in X$ to the vector $x \in X$ along the normal vector $-\nabla f_{0}(x)$? While minimized means to minimize the distance from $x$ to $y$ along the $-\nabla f_{0}(x)$ but how they they are really different? Suppose y is on the right side, is it somehow different to the situation with $y$ on the left side, thinking... and even then I am unsure whether "minimum/maximum" really exist, the S must be compact at least, other conditions?2011-10-27
  • 0
    ...even when re-read it, it states `"x is optimal if..."`, I understand this that $x$ can be maximum/minimum/maximal/minimal, any of them with this criteria...2011-10-27