
Suppose I have a function $f : \mathbb R^n \to \mathbb R$ that is a sum of copies of a given strictly convex single-variable function $g : \mathbb R \to \mathbb R$, i.e. $f(x) = g(x_1) + g(x_2) + \dots + g(x_n)$, with the same $g$ applied to each coordinate.

Since $f$ is in turn strictly convex, its unique minimum can be found using any standard method for convex programming. However, because $f$ is separable, the minimum can also be found by minimizing $g$ in each variable independently, applying any single-variable minimization method (e.g. bisection on the derivative $g'$).
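As an illustration of the single-variable step, here is a sketch assuming $g$ is differentiable, with the hypothetical choice $g(t) = t^4 - t$ (not part of the question): since $g'$ is increasing for strictly convex $g$, minimizing $g$ reduces to bisecting on the sign of $g'$.

```python
# Sketch: minimize a strictly convex, differentiable g by bisection on g'.
# g' is increasing, so its sign change brackets the minimizer.
# Hypothetical example: g(t) = t^4 - t, so g'(t) = 4t^3 - 1.

def bisect_min(dg, lo, hi, tol=1e-10):
    """Find the minimizer of g on [lo, hi], given its derivative dg.

    Assumes dg(lo) < 0 < dg(hi), i.e. the minimizer lies inside [lo, hi].
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if dg(mid) > 0:
            hi = mid          # minimizer is to the left of mid
        else:
            lo = mid          # minimizer is to the right of mid
    return (lo + hi) / 2

t = bisect_min(lambda t: 4 * t**3 - 1, -10.0, 10.0)
print(t)  # ≈ 0.62996, i.e. (1/4)^(1/3)
```

By separability, running this once per coordinate (here only once, since the $g$'s are identical) recovers the unconstrained minimizer of $f$.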

Now, I believe it is true that if I want to find the minimum of $f$ constrained to an axis-aligned "hyperrectangle" (or in other words, with given upper and lower bounds $u$ and $l$ on $x$), I can simply find the unconstrained minimum as above, and then project the minimum, component-wise, onto the rectangle (i.e., onto the intervals $[l_i, u_i]$).
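A minimal sketch of this box-constrained case, using the hypothetical choice $g(t) = (t-3)^2$ (whose unconstrained minimizer is $t^\ast = 3$) and made-up bounds:

```python
# Sketch: for separable f(x) = sum_i g(x_i), the minimizer over the box
# [l_1, u_1] x ... x [l_n, u_n] is the component-wise clip of the
# unconstrained minimizer. Illustrative (hypothetical) choice: g(t) = (t-3)^2.

def clip(t, lo, hi):
    """Project t onto the interval [lo, hi]."""
    return max(lo, min(hi, t))

t_star = 3.0                      # unconstrained minimizer of g
l = [0.0, -1.0, 4.0]              # lower bounds on x_1, x_2, x_3
u = [2.0,  5.0, 6.0]              # upper bounds

# Each coordinate is handled independently, since f is separable.
x_box = [clip(t_star, lo, hi) for lo, hi in zip(l, u)]
print(x_box)  # [2.0, 3.0, 4.0]
```

This works because $g$ is decreasing to the left of $t^\ast$ and increasing to the right, so on each interval $[l_i, u_i]$ the best choice is the point nearest $t^\ast$.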

My question is: what happens in the case of more general linear constraints (polytopes)? I believe that for general convex functions $f$, not much can be said. In particular, simply projecting the unconstrained minimum onto the polytope doesn't work in general (see a conjecture on norms and convex functions over polytopes).

However, when $f$ has the special form above, one natural guess is that the $L_1$ projection of the unconstrained minimum onto the polytope yields the constrained minimizer. Is this actually true?

1 Answer


Unfortunately, your conjecture that the $L_1$ projection of the unconstrained minimum of a function $f$ onto a polytope is the minimum of $f$ on the polytope isn't true, even for the restricted class of functions $f$ that you consider. Here is a counterexample.

Let $f(x,y) = x^2+y^2$. So $f(x,y) = g(x) + g(y)$, where $g(x) = x^2$. The function $g$ is strictly convex. The minimizer of $f$ is $(0,0)$.

Let the polytope $P$ be defined by the inequalities $y \leq \frac{x}{2} - 1$, $y \geq -2$, and $0 \leq x \leq 2$. In the image below, the shaded region denotes the polytope $P$.

Since $P$ lies in the fourth quadrant, the $L_1$ distance from $(0,0)$ to a point $(x,y) \in P$ is $x - y$. The point of $P$ minimizing this distance is therefore the point of $P$ on the line $x - y = c$ with the smallest value of $c$. To find it, imagine slowly increasing $c$ from $0$ until the line first touches $P$. That happens at $c = 1$: as we can see from the graph below, the point of $P$ that minimizes the $L_1$ distance is $(0,-1)$.

On the other hand, the point of $P$ with the smallest value of $f$ is the one that minimizes $x^2+y^2$. So imagine slowly increasing $c$ from $0$, this time in the equation $x^2+y^2=c$, until the circle first touches $P$. That's harder to eyeball from the graph, but the touching occurs at $c = 4/5$ (the squared distance from the origin to the line $y = \frac{x}{2} - 1$), and the point of $P$ that minimizes $f(x,y) = x^2+y^2$ is $(2/5,-4/5)$.

Thus the $L_1$ projection of $(0,0)$ onto $P$, namely $(0,-1)$, is not the minimizer of $f$ on $P$, which refutes the conjecture.
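The counterexample is easy to check numerically; the following sketch uses a brute-force grid search over $P$ rather than an exact solver, so the two minimizers are recovered only up to grid resolution:

```python
# Numeric sanity check of the counterexample by grid search over
# P = { (x, y) : y <= x/2 - 1, y >= -2, 0 <= x <= 2 }.

def in_P(x, y):
    # Small tolerance so boundary grid points count as feasible.
    return y <= x / 2 - 1 + 1e-12 and y >= -2 and 0 <= x <= 2

def f(x, y):      # separable objective f = g(x) + g(y) with g(t) = t^2
    return x * x + y * y

def l1(x, y):     # L1 distance from the unconstrained minimizer (0, 0)
    return abs(x) + abs(y)

# Grid over [0, 2] x [-2, 0], step 0.005.
N = 400
pts = [(i * 2 / N, -2 + j * 2 / N) for i in range(N + 1) for j in range(N + 1)]
feasible = [p for p in pts if in_P(*p)]

l1_proj = min(feasible, key=lambda p: l1(*p))   # close to (0, -1)
f_min = min(feasible, key=lambda p: f(*p))      # close to (2/5, -4/5)

print(l1_proj, f_min)
print(f(*l1_proj), f(*f_min))   # f is strictly smaller at f_min
```

The check confirms that $f$ takes the value $1$ at the $L_1$ projection $(0,-1)$ but only $4/5$ at $(2/5,-4/5)$.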

The graph below should help with the visualization. The points where the line ($L_1$ minimal value) and the circle ($f$ minimal value) intersect $P$ are not the same.

[Figure: the polytope $P$ with the line $x - y = 1$ and the circle $x^2 + y^2 = 4/5$, touching $P$ at $(0,-1)$ and $(2/5,-4/5)$ respectively.]

(From one point of view, perhaps this is to be expected: the counterexample plays the $L_1$ and $L_2$ distances against each other, and since these are different norms, their nearest-point maps shouldn't be expected to agree on every polytope.)