I have the following optimization problem:
$$\min_{x} \sum_i a_i x_i^2 \quad \text{s.t.} \quad x_i \ge 0, \quad \sum_i x_i = 1, \quad \|x\|_0 \le T$$ where $a_i > 0$. Obviously, if I remove the constraints, the minimum of the above is attained at $x = 0$. It is also possible to solve the above using constrained quadratic programming if the $\|x\|_0 \le T$ constraint is ignored.

Experimental Answer:

A: I noticed that if I sort $x$ according to the values $a_i$ in ascending order, then the first $T$ choices of $x$ form the optimal support when we require $\mathrm{card}(x) = T$. For example, take $f(x) = 2x_1^2 + x_2^2 + 3x_3^2 + 0.5x_4^2$ and suppose we may choose only 2 elements of $x$. Then the optimal set is $\{x_4, x_2\}$, and we can solve the constrained optimization for $g(x) = x_2^2 + 0.5x_4^2$ with the rest of $x$ equal to zero.

B: I also noticed that the optimal answer is in inverse relation to the ratio of the $a_i$ elements, meaning that $x_i/x_j = a_j/a_i$, which leads to a closed-form solution.

Question:
Do you have any mathematical way to support A and B in the above?
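Claim A can be checked numerically without assuming B. Below is a minimal sketch (the helper `subset_optimum` and the grid search are my own; only the coefficients come from the $f(x)$ above): it solves the 2-element subproblem on every support by brute force and reports the best support.

```python
# Brute-force check of claim A on f(x) = 2*x1^2 + x2^2 + 3*x3^2 + 0.5*x4^2.
# For each size-2 support {i, j}, the subproblem is min a_i*t^2 + a_j*(1-t)^2
# over t in [0, 1] (t = x_i, 1 - t = x_j); we solve it by plain grid search.
from itertools import combinations

a = [2.0, 1.0, 3.0, 0.5]

def subset_optimum(i, j, steps=10_000):
    """Grid-search minimum of a[i]*t^2 + a[j]*(1-t)^2 for t in [0, 1]."""
    return min(a[i] * (k / steps) ** 2 + a[j] * (1 - k / steps) ** 2
               for k in range(steps + 1))

best = min(combinations(range(len(a)), 2), key=lambda s: subset_optimum(*s))
print(best)  # (1, 3): the 0-based indices of x_2 and x_4
```

The winning support is indeed the one with the two smallest coefficients, $\{x_4, x_2\}$, as claim A predicts.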
Sparse L0 solution to constrained quadratic programming?
-
1Such a constraint is sometimes called a cardinality constraint. Usually we handle this by adding binary variables (indicating whether a variable is active or nonzero). Cardinality constrained portfolio optimization problems form a well known application. – 2017-01-30
-
0Regarding my comment in the previous thread, @LinAlg adequately addressed it. If you *require* the cardinality to be 1, then picking the smallest $a_i$, as I claimed, is correct. Your supposed counterexample had a cardinality of 2, so of course it would be "better". – 2017-01-30
-
0@MichaelGrant: Maybe I had a misinterpretation of that comment! :-) – 2017-01-30
-
0All good. Your rewrite is certainly a good one! – 2017-01-30
2 Answers
As for A, I think the statement is trivial once B is in hand: on a fixed support $S$, B gives the optimal value $\sum_{i \in S} a_i x_i^2 = 1/\sum_{i \in S} 1/a_i$, so minimizing over supports of size $T$ amounts to maximizing $\sum_{i \in S} 1/a_i$, which is achieved by the $T$ smallest $a_i$.
Given the answer to $A$, the answer to $B$ is found with the KKT conditions for the problem. Consider $$\min_x \{ \sum a_i x_i^2 : \sum x_i \geq 1, x_i \geq 0 \}$$ where the index $i$ is restricted to the $T$ elements from A. I replaced the equality with an inequality since that simplifies the solution ($\lambda \geq 0$) but clearly does not change the optimal solution. The Lagrangian is: $$L(x) = \sum a_i x_i^2 - \lambda(\sum x_i - 1) - \sum \mu_i x_i$$ The KKT conditions are therefore: $$2a_i x_i - \lambda - \mu_i = 0 \;\forall i, \quad \mu_i x_i = 0 \;\forall i, \quad x \geq 0, \quad \lambda \geq 0, \quad \mu \geq 0, \quad \sum x_i \geq 1, \quad \lambda(\sum x_i - 1) = 0$$ Suppose that for some $i$, $x_i = 0$; then $\lambda + \mu_i = 0$, and since both $\lambda$ and $\mu_i$ are nonnegative, $\lambda = 0$. Then, for all other $i$, $2a_i x_i = \mu_i$, and since $x_i \mu_i = 0$, $x_i = 0$ for all $i$, contradicting $\sum x_i \geq 1$. So $x_i > 0$ for all $i$, and hence $\mu_i = 0$.
The first condition now states that $2a_i x_i$ is constant for all $i$, so $2a_i x_i = 2a_j x_j$, leading to $x_i / x_j = a_j / a_i$. Combined with $\sum x_i = 1$, this gives the closed form $x_i = (1/a_i) / \sum_j (1/a_j)$.
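As a sanity check on the derivation, the closed form can be evaluated directly. A small sketch (the function name `closed_form` is mine), using the two-element support $\{x_2, x_4\}$ from the question's example:

```python
# Closed-form minimizer of sum(a_i * x_i^2) subject to sum(x_i) = 1,
# obtained from 2*a_i*x_i = lambda for all i: x_i = (1/a_i) / sum_j(1/a_j).
def closed_form(a):
    inv = [1.0 / ai for ai in a]
    s = sum(inv)
    return [v / s for v in inv]

x = closed_form([1.0, 0.5])   # a-values of x_2 and x_4 in the example
print(x)                      # [1/3, 2/3]
print(x[0] / x[1])            # 0.5 = a_j / a_i, the ratio from claim B
```

The ratio $x_2 / x_4 = 0.5 = a_4 / a_2$ matches claim B exactly.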
-
1How can we be sure you don't need a Lagrange multiplier for the equality constraint? I'm not quite convinced here. – 2017-01-30
-
1@MichaelGrant the equality constraint has multiplier $\lambda$, so I don't get your comment. Let me add a multiplier for the nonnegativity constraint though, I must have been confused by thinking about dualization. – 2017-01-30
-
0That's what I get for commenting before coffee – 2017-01-30
-
0Actually, $\lambda$ does not have to be nonnegative. It's attached to an equality constraint. So $\lambda + \mu_i = 0$ does not immediately imply that $\lambda = 0$. – 2017-01-30
-
1I think it's possible to prove, however, that the values of $x_i$ must be nonnegative, so the inequalities can be dropped. For instance, suppose one or more $x_i$ is negative; then setting $x_i = |x_i|/\sum_j |x_j|$ will yield a smaller objective, since $\sum_j |x_j| > 1$. – 2017-01-30
-
1@MichaelGrant seems like you just enjoyed your coffee :) I think it's easier to change the constraint to $\sum x_i \geq 1$. – 2017-01-30
I'm going to offer my comment as an answer because, frankly, I'm proud to have thought of it this early in the morning ;-)
It turns out that the nonnegativity constraints on $x$ are unnecessary. Suppose that we have an $x$ with one or more negative values, but $\sum_j x_j=1$. Then, since $|x_i|>x_i$ when $x_i<0$, we have $$\sum_j |x_j| > \sum_j x_j = 1.$$ Now consider the vector $y_i = |x_i| / \sum_j |x_j|$. By construction, $\sum_i y_i = 1$, so it too is a feasible solution to the problem. And $$\sum_i a_i y_i^2 = \sum_i a_i \frac{x_i^2}{(\sum_j |x_j|)^2} = \frac{\sum_i a_i x_i^2}{(\sum_j |x_j|)^2} < \sum_i a_i x_i^2.$$ So $y$ has a smaller objective value than $x$, which means $x$ cannot be optimal. That doesn't mean $y$ is optimal, mind you; it just means that $x$ cannot be.
So now we have established that the problem is equivalent to \begin{array}{ll} \text{minimize} & \sum_i a_i x_i^2 \\ \text{subject to} & \sum_i x_i = 1 \end{array} and you can proceed with LinAlg's answer by setting all $\mu_i$ to zero.
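Combining both answers yields a direct procedure for the original cardinality-constrained problem: keep the $T$ smallest $a_i$, then distribute weight proportionally to $1/a_i$. A sketch of this (the function names `sparse_qp` and `brute_force_value` are mine; the cross-check uses the fact, implied by the closed form, that the optimal value on a support $S$ is $1/\sum_{i \in S} 1/a_i$):

```python
from itertools import combinations

def sparse_qp(a, T):
    """min sum(a_i x_i^2) s.t. x >= 0, sum(x) = 1, ||x||_0 <= T:
    keep the T smallest a_i (claim A), set x_i prop. to 1/a_i (claim B)."""
    support = sorted(range(len(a)), key=lambda i: a[i])[:T]
    s = sum(1.0 / a[i] for i in support)
    x = [0.0] * len(a)
    for i in support:
        x[i] = (1.0 / a[i]) / s
    return x

def value(a, x):
    return sum(ai * xi * xi for ai, xi in zip(a, x))

def brute_force_value(a, T):
    # exhaustive search: the optimal value on support S is 1 / sum_{i in S} 1/a_i
    return min(1.0 / sum(1.0 / a[i] for i in S)
               for S in combinations(range(len(a)), T))

a = [2.0, 1.0, 3.0, 0.5]
x = sparse_qp(a, 2)
print(x, value(a, x), brute_force_value(a, 2))
```

On the question's example both routes agree on the optimal value $1/3$, attained at $x_2 = 1/3$, $x_4 = 2/3$.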
-
0I wish I could accept both as answers, such a dilemma! But anyway, I really liked it! – 2017-01-30
-
0You are kind but it is just fine! – 2017-01-30