
I'm trying to get a closed form solution for this matrix equality (it arose as the subgradient of a "group elastic net" objective).

$ Z_1\beta + z_2 + \lambda_1\beta + \lambda_2\frac{\beta}{\|\beta\|_2} = 0$

Here $Z_1$ is a positive semi-definite matrix, $z_2$ and $\beta$ are vectors, and $\lambda_1$ and $\lambda_2$ are positive scalars. The goal is to solve for $\beta$. If $\beta$ is one-dimensional, the problem is easy: $\frac{\beta}{\|\beta\|_2}$ reduces to the sign function, and $Z_1\beta$ is proportional to $\beta$. Neither simplification holds in higher dimensions. I tried projecting $\beta$ and $z_2$ onto the eigenspace of $Z_1$, but that doesn't make much progress, since you end up with a root-sum-square over the eigenvalues.
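For reference, the one-dimensional reduction mentioned above is just the usual soft-thresholding solution. A minimal sketch (the numbers are hypothetical, chosen so that $|z_2| > \lambda_2$ and a nonzero solution exists):

```python
import numpy as np

# One-dimensional case: z1*b + z2 + lam1*b + lam2*sign(b) = 0 for b != 0.
# Solving each sign branch gives the soft-thresholding formula
#   b = -sign(z2) * max(|z2| - lam2, 0) / (z1 + lam1),
# which has a nonzero solution exactly when |z2| > lam2.
z1, z2, lam1, lam2 = 2.0, -1.5, 0.5, 0.3   # hypothetical values, |z2| > lam2

b = -np.sign(z2) * max(abs(z2) - lam2, 0.0) / (z1 + lam1)

# Check that b satisfies the scalar equation.
residual = z1 * b + z2 + lam1 * b + lam2 * np.sign(b)
assert abs(residual) < 1e-12
```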

It's possible that there's no closed-form solution here, but I don't see an argument for that either.

  • IMHO, there is no closed-form solution, due to the normalization operation $\beta \to \frac{\beta}{\|\beta\|_2}$ and also to the positivity constraints. I would advise you to use numerical methods in order to "play" a little with the problem, and see whether, in some cases, there are 1, 0 or infinitely many solutions... 2017-01-25
  • Thanks for the reply. Actually, this arose from trying to do coordinate descent, which would need a fast optimization in the "inner loop". I was hoping that a closed-form solution to this would make the overall optimization faster than my current method, proximal gradient descent. But if I have to use numerical methods for this subproblem, I don't think I will gain any advantage from coordinate descent, so I'll stick with proximal gradient descent for now. 2017-01-25

1 Answer


If $z_2=0$, clearly $\beta=0$ is a solution. Suppose $z_2$ is nonzero. Orthogonally diagonalise $Z_1+\lambda_1I$ as $QDQ^T$, where $D=\operatorname{diag}(d_1,\dots,d_n)$; note that each $d_i\ge\lambda_1>0$. Let $u=\frac{Q^T\beta}{\|\beta\|}=\frac{Q^T\beta}{\|Q^T\beta\|},\ x=\|\beta\|=\|Q^T\beta\|$ and $v=-Q^Tz_2$. We may rewrite the equation as $$ (xD+\lambda_2I)u = v,\quad \|u\|=1,\ x\ge0.\tag{1} $$ Since every $d_i>0$, the matrix $xD+\lambda_2I$ is invertible for all $x\ge0$, so $u$ must equal $(xD+\lambda_2I)^{-1}v$. The function $g(x)=\|(xD+\lambda_2I)^{-1}v\|$ is strictly decreasing in $x$ and tends to $0$ as $x$ tends to infinity, so $(1)$ has a solution if and only if $g(0)\ge1$, i.e. if and only if $\lambda_2\leq\|v\|=\|z_2\|$.
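The change of variables can be sanity-checked numerically. The sketch below (NumPy assumed; the data are randomly generated and hypothetical) verifies that the residual of the original equation equals $Q$ times the residual of $(1)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Hypothetical test data: a random PSD Z1, an arbitrary z2, positive lambdas.
A = rng.standard_normal((n, n))
Z1 = A @ A.T                      # positive semi-definite
z2 = rng.standard_normal(n)
lam1, lam2 = 0.5, 0.3

# Orthogonal diagonalisation Z1 + lam1*I = Q diag(d) Q^T.
d, Q = np.linalg.eigh(Z1 + lam1 * np.eye(n))

# Substitution for an arbitrary nonzero beta.
beta = rng.standard_normal(n)
x = np.linalg.norm(beta)
u = Q.T @ beta / x
v = -Q.T @ z2

# Residual of the original stationarity equation...
r_orig = Z1 @ beta + z2 + lam1 * beta + lam2 * beta / x
# ...equals Q times the residual of (x D + lam2 I) u - v = 0.
r_new = (x * d + lam2) * u - v
assert np.allclose(r_orig, Q @ r_new)
```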

In case $\lambda_2\leq\|z_2\|$, the pair $(x,u)$ solves $(1)$ if and only if $(xD+\lambda_2I)^{-1}v$ has unit norm. Hence the solutions of $(1)$ are given by the nonnegative roots of the equation $$ \sum_{i=1}^n\frac{v_i^2}{(d_ix+\lambda_2)^2}=1,\tag{2} $$ (where $n$ is the dimension of the vector space) which further reduces to the degree-$2n$ polynomial equation $$ \prod_{i=1}^n(d_ix+\lambda_2)^2-\sum_{i=1}^nv_i^2\prod_{j\neq i}(d_jx+\lambda_2)^2=0,\quad x\ge0.\tag{3} $$ Each such root $x$ yields a solution $\beta=xQu=xQ(xD+\lambda_2I)^{-1}v$. In general, high-degree polynomial equations can only be solved numerically.
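Putting the pieces together, here is a minimal numerical sketch (assuming NumPy and SciPy; all data are random and hypothetical) that solves $(2)$ by bracketed scalar root-finding and recovers $\beta$:

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(1)
n = 4

# Hypothetical test data; lam2 is kept small so that lam2 <= ||z2|| holds.
A = rng.standard_normal((n, n))
Z1 = A @ A.T                      # positive semi-definite
z2 = rng.standard_normal(n)
lam1, lam2 = 0.5, 0.1
assert lam2 <= np.linalg.norm(z2), "no solution otherwise"

d, Q = np.linalg.eigh(Z1 + lam1 * np.eye(n))   # Z1 + lam1*I = Q diag(d) Q^T
v = -Q.T @ z2

# g(x) = sum_i v_i^2 / (d_i x + lam2)^2 - 1 is strictly decreasing on x >= 0
# (all d_i >= lam1 > 0), so equation (2) has a unique nonnegative root.
def g(x):
    return np.sum(v**2 / (d * x + lam2) ** 2) - 1.0

# Since sum_i v_i^2/(d_i x + lam2)^2 <= ||v||^2/(min(d) x + lam2)^2,
# g is nonpositive at hi = (||v|| - lam2)/min(d), giving a valid bracket.
hi = (np.linalg.norm(v) - lam2) / d.min() + 1e-12
x = brentq(g, 0.0, hi)

# Recover beta = x * Q u with u = (x D + lam2 I)^{-1} v.
u = v / (d * x + lam2)
beta = x * (Q @ u)

# Check beta satisfies the original stationarity equation.
res = Z1 @ beta + z2 + lam1 * beta + lam2 * beta / np.linalg.norm(beta)
assert np.allclose(res, 0.0, atol=1e-8)
```

Whether this inner solve is fast enough to make coordinate descent competitive with proximal gradient descent is, of course, an empirical question.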