
The function is

$$f(x) = \| x x^T - V \|_*$$

where $\| \cdot \|_*$ denotes the nuclear norm, $V$ is a given matrix, and $x$ is a vector. How do I differentiate $f(x)$? And, if it is possible, how do I compute the second derivative of $f(x)$?

  • 0
    Could you define *nuclear norm*? I have never heard that term before. 2017-01-02
  • 0
    @RodrigodeAzevedo Oh, I forgot: $V$ is positive semidefinite (p.s.d.). 2017-01-03

1 Answer


Define a new matrix variable $$M=xx^T-V$$ Then find the differential of the function in terms of this new variable, assuming $M$ is nonsingular so that $(M^TM)^{-1/2}$ exists: $$\eqalign{ f &= \operatorname{tr}\sqrt{M^TM} \cr \cr df &= \frac{1}{2}(M^TM)^{-1/2}:d(M^TM) \cr &= \frac{1}{2}(M^TM)^{-1/2}:(dM^TM+M^TdM) \cr &= (M^TM)^{-1/2}:M^TdM \cr &= M(M^TM)^{-1/2}:dM \cr &= M(M^TM)^{-1/2}:d(xx^T) \cr &= M(M^TM)^{-1/2}:(dx\,x^T+x\,dx^T) \cr &= \big(M(M^TM)^{-1/2} + (M^TM)^{-1/2}M^T\big)\,x:dx \cr \cr \frac{\partial f}{\partial x} &= \big(M(M^TM)^{-1/2} + (M^TM)^{-1/2}M^T\big)\,x \cr\cr }$$ This is the gradient. To find the Hessian you must differentiate again with respect to $x$, but that is going to be very messy, since each $M$ contains two $x$'s.
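As a sanity check, the gradient formula can be compared against central finite differences. The following is only a minimal NumPy sketch: the helper names (`nuclear_norm`, `grad_f`, `numerical_grad`) and the random test data are illustrative assumptions, and it relies on $M$ being nonsingular, in which case $M(M^TM)^{-1/2} = UW^T$ for the SVD $M = U\Sigma W^T$.

```python
import numpy as np

def nuclear_norm(M):
    # Nuclear norm = sum of the singular values.
    return np.linalg.svd(M, compute_uv=False).sum()

def f(x, V):
    return nuclear_norm(np.outer(x, x) - V)

def grad_f(x, V):
    # For nonsingular M = U S W^T, M (M^T M)^{-1/2} equals U W^T.
    M = np.outer(x, x) - V
    U, _, Wt = np.linalg.svd(M)
    G = U @ Wt
    return (G + G.T) @ x

def numerical_grad(x, V, h=1e-6):
    # Central finite differences, one coordinate at a time.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e, V) - f(x - e, V)) / (2 * h)
    return g

# Hypothetical test data: a generic (nonsymmetric) V and a random x.
rng = np.random.default_rng(0)
n = 4
V = rng.standard_normal((n, n))
x = rng.standard_normal(n)

max_err = np.max(np.abs(grad_f(x, V) - numerical_grad(x, V)))
print("max abs difference:", max_err)
```

If the two gradients disagree noticeably, the most likely cause is that $M$ is close to singular at the chosen point, where $f$ is not differentiable.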

And we can't use the "trace trick" again, $$d\operatorname{tr}(f(X))= f^\prime(X^T):dX$$ since there are no traces left in the gradient.

If there is other information that would simplify the problem (e.g. $V$ is symmetric), then you might be able to find an explicit formula for the Hessian.
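For instance, here is a sketch of how the gradient itself simplifies under the assumption that $V$ is symmetric (which covers the positive semidefinite case mentioned in the comments) and $M$ is nonsingular. Then $M$ is symmetric with an eigendecomposition $M = Q\Lambda Q^T$, and the matrix sign function $\operatorname{sign}(M)$ introduced below is notation added for this illustration:

$$\eqalign{ f &= \operatorname{tr}\sqrt{M^2} = \sum_i |\lambda_i| \cr (M^TM)^{-1/2} &= Q\,|\Lambda|^{-1}Q^T \cr M(M^TM)^{-1/2} &= Q\operatorname{sign}(\Lambda)\,Q^T = \operatorname{sign}(M) \cr \frac{\partial f}{\partial x} &= \big(\operatorname{sign}(M) + \operatorname{sign}(M)^T\big)\,x = 2\operatorname{sign}(M)\,x \cr }$$

The last step uses the fact that $\operatorname{sign}(M)$ is symmetric whenever $M$ is.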

If you want the Hessian in order to use something like Newton's method, I would suggest that you try a gradient-based method instead.
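A gradient-based method could look roughly like the sketch below: plain gradient descent with a backtracking (Armijo) line search, using the gradient derived above. Everything here is an assumption made for illustration, including the function names, the step-size constants, and the randomly generated positive semidefinite $V$. Note also that $f$ is not differentiable where $M$ is singular, so a simple scheme like this may stall near such points.

```python
import numpy as np

def f(x, V):
    # Nuclear norm of x x^T - V (sum of singular values).
    return np.linalg.svd(np.outer(x, x) - V, compute_uv=False).sum()

def grad_f(x, V):
    # Gradient (G + G^T) x with G = U W^T from the SVD of M = x x^T - V.
    U, _, Wt = np.linalg.svd(np.outer(x, x) - V)
    G = U @ Wt
    return (G + G.T) @ x

def gradient_descent(x0, V, steps=200, t0=1.0, beta=0.5, c=1e-4):
    x = x0.copy()
    history = [f(x, V)]
    for _ in range(steps):
        g = grad_f(x, V)
        t = t0
        # Shrink the step until the Armijo sufficient-decrease test passes.
        while f(x - t * g, V) > history[-1] - c * t * (g @ g) and t > 1e-12:
            t *= beta
        if t <= 1e-12:
            break  # no acceptable step found; stop
        x = x - t * g
        history.append(f(x, V))
    return x, history

# Hypothetical data: a random positive semidefinite V and starting point.
rng = np.random.default_rng(1)
B = rng.standard_normal((4, 2))
V = B @ B.T
x0 = rng.standard_normal(4)

x_opt, history = gradient_descent(x0, V)
print("initial f:", history[0])
print("final f:  ", history[-1])
```

Because every accepted step satisfies the sufficient-decrease test, the recorded objective values never increase, although nothing here guarantees convergence to a global minimizer.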

  • 0
    Great, thanks. By the way, what does the colon ':' mean? 2017-01-03
  • 0
    It is the double-dot (aka Frobenius) product. In index notation, with summation over the repeated indices, $$A:B = A_{ij}\,B_{ij} = \operatorname{tr}(A^TB)$$ 2017-01-03
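To make the notation concrete, here is a small NumPy illustration of the Frobenius product, computed both as an elementwise sum and through the standard identity $A:B = \operatorname{tr}(A^TB)$. The function name `frobenius_product` and the example matrices are made up for this sketch.

```python
import numpy as np

def frobenius_product(A, B):
    # A : B = sum over i, j of A_ij * B_ij
    return np.sum(A * B)

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

elementwise = frobenius_product(A, B)   # 1*5 + 2*6 + 3*7 + 4*8 = 70
via_trace = np.trace(A.T @ B)           # same value via tr(A^T B)
print(elementwise, via_trace)
```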