31

A method called "Robust PCA" solves the matrix decomposition problem

$L^*, S^* = \arg \min_{L, S} \|L\|_* + \|S\|_1 \quad \text{s.t. } L + S = X$

as a surrogate for the actual problem

$L^*, S^* = \arg \min_{L, S} \operatorname{rank}(L) + \|S\|_0 \quad \text{s.t. } L + S = X,$

i.e. the actual goal is to decompose the data matrix $X$ into a low-rank signal matrix $L$ and a sparse noise matrix $S$. In this context: why is the nuclear norm a good approximation for the rank of a matrix? I can think of matrices with low nuclear norm but high rank and vice versa. Is there any intuition one can appeal to?

  • 0
    @LeonidKovalev: Oops. Of course I meant $\min$. (2012-06-29)
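The convex program above can be solved with simple proximal steps: singular-value soft-thresholding for the nuclear norm and entrywise soft-thresholding for the $\ell_1$ norm. A minimal NumPy sketch of an augmented-Lagrangian (ADMM-style) scheme, using the common heuristics $\lambda = 1/\sqrt{\max(m,n)}$ and a data-driven $\mu$; the function names here are illustrative, not a standard API:

```python
import numpy as np

def shrink(M, tau):
    # Entrywise soft-thresholding: proximal operator of tau * ||.||_1.
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def svd_shrink(M, tau):
    # Singular-value soft-thresholding: proximal operator of tau * ||.||_*.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def rpca(X, lam=None, mu=None, n_iter=500):
    # min ||L||_* + lam * ||S||_1  s.t.  L + S = X  (augmented Lagrangian sketch).
    m, n = X.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    mu = 0.25 * m * n / np.abs(X).sum() if mu is None else mu
    L = np.zeros_like(X)
    S = np.zeros_like(X)
    Y = np.zeros_like(X)  # dual variable for the constraint L + S = X
    for _ in range(n_iter):
        L = svd_shrink(X - S + Y / mu, 1.0 / mu)
        S = shrink(X - L + Y / mu, lam / mu)
        Y = Y + mu * (X - L - S)
    return L, S
```

On synthetic low-rank-plus-sparse data this fixed-iteration sketch typically recovers $L$ and $S$ well; a production solver would add a convergence test on $\|X - L - S\|$.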

4 Answers

20

Why does compressed sensing work? Because the $\ell_1$ ball in high dimensions is extremely "pointy" -- the extreme values of a linear function on this ball are very likely to be attained on its low-dimensional faces, which consist of sparse vectors. When applied to matrices, sparsity of the vector of singular values means low rank, as @mrig wrote before me.

  • 0
    I think for linear objective functions there is a proof that the optimum is attained at a vertex. But what if the objective is nonlinear? Does minimizing a convex nonlinear loss of a parameter plus the nuclear norm of the parameter have anything beyond a pure wish to be low rank? I think the answer is yes, but I cannot see the reason. (2016-08-11)
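The "pointiness" intuition can be seen concretely: for an underdetermined linear system with a sparse solution, minimizing the $\ell_1$ norm (basis pursuit) typically returns exactly that sparse solution, because the minimum is attained on a low-dimensional face of the $\ell_1$ ball. A small sketch assuming NumPy and SciPy are available (`basis_pursuit` is an illustrative name):

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, b):
    # min ||x||_1  s.t.  A x = b, written as an LP with x = u - v, u, v >= 0.
    m, n = A.shape
    c = np.ones(2 * n)                    # objective sum(u) + sum(v) = ||x||_1
    res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=b,
                  bounds=[(0, None)] * (2 * n))
    u, v = res.x[:n], res.x[n:]
    return u - v

# Underdetermined system (20 equations, 40 unknowns) with a 3-sparse ground
# truth: the l1 minimizer lands on a low-dimensional face, i.e. is sparse.
rng = np.random.default_rng(0)
m, n, k = 20, 40, 3
A = rng.standard_normal((m, n))
x0 = np.zeros(n)
x0[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
x_hat = basis_pursuit(A, A @ x0)
```

Replacing the $\ell_1$ objective with $\ell_2$ (least-norm solution) spreads energy over all coordinates instead, which is the geometric contrast the answer describes.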
12

The nuclear norm can be thought of as a convex relaxation of the number of non-zero singular values (i.e. the rank).

9

The nuclear norm of a matrix is the $\ell_1$-norm of the vector of its singular values (these coincide with the eigenvalues only when the matrix is symmetric positive semidefinite). Thus, penalizing it injects sparsity into the vector of singular values. Essentially, this sparsity means you are reducing the rank of the original matrix.

  • 2
    He means the vector of singular values, not an eigenvector. Fewer non-zero singular values corresponds to lower rank. (2018-05-08)
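Both claims in this answer can be checked numerically: the nuclear norm equals the $\ell_1$ norm of the singular-value vector, and soft-thresholding that vector (the proximal step of the nuclear norm) lowers the rank, the matrix analogue of $\ell_1$ shrinkage producing sparse vectors. A NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((8, 6))

# Nuclear norm = l1 norm of the singular-value vector.
s = np.linalg.svd(M, compute_uv=False)
nuclear = s.sum()

# Soft-thresholding the singular values zeroes the small ones,
# lowering the rank of the reconstructed matrix.
U, s, Vt = np.linalg.svd(M, full_matrices=False)
tau = np.sort(s)[2]                       # threshold at the 3rd-smallest value
M_low = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```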
8

To be precise, it has been shown that the $\ell_1$ norm is the convex envelope of the $\| \cdot \|_0$ pseudo-norm on the unit $\ell_\infty$ ball, while the nuclear norm is the convex envelope of the rank on the unit spectral-norm ball.

As a reminder, the convex envelope of a function is its tightest convex underestimator, i.e. the largest convex function that bounds it from below. An important property is that a function and its convex envelope attain the same global minimum value.
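A numerical spot-check of the envelope statement, under the restriction it is stated for (the unit spectral-norm ball): there the nuclear norm, being a convex lower bound of the rank, never exceeds it, since $\sum_i \sigma_i \le \operatorname{rank}(M) \cdot \sigma_{\max} = \operatorname{rank}(M)$ when $\sigma_{\max} = 1$.

```python
import numpy as np

# On {M : ||M||_2 <= 1} the nuclear norm lower-bounds the rank.
rng = np.random.default_rng(1)
ok = True
for _ in range(100):
    M = rng.standard_normal((5, 7))
    M /= np.linalg.norm(M, 2)             # rescale so the spectral norm is 1
    s = np.linalg.svd(M, compute_uv=False)
    ok &= s.sum() <= np.linalg.matrix_rank(M) + 1e-9
```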