3
$\begingroup$

I am trying to minimize convex objective $f(X)$, for matrix $X$ s.t. $X\ge 0$ component-wise, and $X1^T = 1^T$.

I want to use projected gradient descent. However, I only know how to project on feasible set of a single constraint, but not both at the same time (i.e. their intersection).

But projecting on boundary of one feasible set, may bring the solution out of the feasible set for the other one. What is the standard method to overcome this?

1 Answers 1

3

Projecting onto the simplex is straightforward, solutions crop up from time to time. Here is an example: C. Michelot, "A finite algorithm for finding the projection of a point onto the canonical simplex of $\mathbb{R}^n$". JOTA, 50(1):195–200, 1986.

It is also a simple linear programming problem.

This is an elaboration that belongs in the comments but is too long:

Perhaps I am missing something here, but from what you have above, my understanding is that if you write $X^T = \begin{bmatrix}x_1 & \cdots & x_n \end{bmatrix}$, then you want $[x_i]_j \geq 0$ for all $i,j$, and $\sum_j [x_i]_j = 1$, for all $i$. If you let $\Sigma = \{\sigma \in \mathbb{R}^n | \sigma_i \geq 0,\ \sum \sigma_i = 1 \} $, then the space of matrices $X$ satisfying the constraints is basically $\Sigma^n$.

So, if you are at some point $X$ and have a direction $\Delta$ and some step size $\lambda$, then you are trying to project $X+\lambda \Delta$ onto the feasible set. You would do so by projecting each row of $X+\lambda \Delta$ separately onto $\Sigma$ and then recombining to get the projected step.

  • 0
    But here we are not interested in projecting on the whole simplex. Instead, we are interested in projecting to the intersection of a simplex and the non-negative orthant.2012-07-19
  • 0
    Each row of $X$ is a member of a simplex. Project the corresponding members of the gradient onto the corresponding simplex, then combine.2012-07-19
  • 0
    I do not see how the other constraint $X\ge 0$ is satisfied here?2012-07-20
  • 0
    Please see my elaboration above. If I am missing something, please explain.2012-07-20
  • 0
    Let $\Sigma_i = \{\sigma \in \mathbb{R}^n | \sigma_i \geq 0,\ \sum \sigma_i = 1 \}$. My problem is that I do not know how to project on $\Sigma_i$. I know how to project on $ \{\sigma \in \mathbb{R}^n | \sigma_i \geq 0 \}$ and how to project on $ \{\sigma \in \mathbb{R}^n | \sum \sigma_i = 1 \}$; but not on $\Sigma_i$.2012-07-20
  • 0
    Look at the paper I mentioned above. The idea is straightforward, you project onto the affine hull of a set of points, then eliminate components with negative barycentric coordinates. Then repeat on the smaller space.2012-07-20
  • 0
    Thanks very much. Now, I got it.2012-07-20
  • 0
    @copper.hat: I am trying to do the same thing and I would appreciate if you could help. In the paper you mentioned in your answer (5th page/page number 199) the first projection $P_{V_I}(x)$ is computed as $\tilde{x_i} = x_i - \frac{\Big(\sum_i x_i -1\Big)}{n_I}$ where $n_I$ is the number of zero elements in $X$. But what happens when you have no zero elements in $X$? I am confused because it'll be a division by $0$. Plus the author states that $n_I$ often decreases by several units which means it'll be eventually $0$.2016-11-08
  • 0
    @user2987: I haven't seen the paper in decades, and don't have easy access. However, I suspect that $n_I$ is not the number of zero elements, but the dimension of the space containing the non zero elements.2016-11-08
  • 0
    @copper.hat: Thanks a lot for replying.2016-11-08
  • 1
    @user2987: If I recall, the algorithm was very simple, project onto the affine hull of the simplex and then zero the coordinates with negative barycentric coordinates. Then continue with the affine hull of non negative barycentric coordinates.2016-11-08
  • 0
    @user2987: Decades ago I used (my) version of the algorithm for some other work and decided to submit a note about it, but I found Michelot's paper when doing background checks :-(.2016-11-08
  • 0
    @copper.hat: Please forgive my misunderstanding apparently I am not very skilled in math. Could you please explain explicitly how to project onto the affine hull of the simplex? Is it by computing the equation that I wrote in my above comment that contains $n_I$. Then I should zero out negative values. Many thanks again for your reply2016-11-08
  • 0
    @user2987: Sorry, I don't have time to do that at the moment. I will try if I get a chance later.2016-11-08