Given a matrix $A \in \mathbb{R}^{n\times m}$ and its pseudoinverse $A^{\dagger} \in \mathbb{R}^{m\times n}$, how can I calculate the following gradient:

$\frac{\partial \Vert {A}^{\dagger} \Vert_1 }{\partial A}$

where $\Vert \cdot \Vert_1$ is the entrywise 1-norm: $\Vert A \Vert_1=\sum_{i=1}^n\sum_{j=1}^m \vert a_{ij}\vert$?

I have tried searching through identities in differential form, such as:

  • $d(|A|) = |A|\operatorname{tr}(A^{-1}\,dA)$ (which seems to be defined only for square $A \in \mathbb{R}^{n\times n}$?)
  • $dy=\operatorname {tr} (B \,dA ) \iff \frac{dy}{dA} = B $

and using them to derive some results, but my expressions did not match the numerical gradient when I checked them.

1 Answer

In the following, the scalar function $$\eqalign{ \operatorname{sign}(x) &= \begin{cases} +1 &\text{if }(x\ge 0) \\ -1 & \text{otherwise}\end{cases} \cr }$$ will be applied element-wise for matrix arguments.

Starting with the well-known result for the differential of the pseudoinverse $$\eqalign{ G &= A^{\dagger} \cr dG &= GG^T\,dA^T\,(I-AG) + (I-GA)\,dA^T\,G^T G - G\,dA\,G \cr }$$
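If you want to convince yourself of this differential before using it, a rough NumPy check along the following lines should do. This is only a sketch: the helper name `pinv_differential`, the random test matrices and the perturbation size are my own choices, and it assumes the rank of $A$ does not change under the perturbation.

```python
import numpy as np

def pinv_differential(A, dA):
    """First-order change in pinv(A) caused by a small perturbation dA.

    Hypothetical helper; assumes the rank of A is unchanged by dA.
    """
    n, m = A.shape
    G = np.linalg.pinv(A)  # G has shape (m, n)
    return (G @ G.T @ dA.T @ (np.eye(n) - A @ G)
            + (np.eye(m) - G @ A) @ dA.T @ G.T @ G
            - G @ dA @ G)

rng = np.random.default_rng(0)
rel_errors = {}
for shape in [(5, 3), (3, 5)]:  # one tall and one wide matrix
    A = rng.standard_normal(shape)
    dA = 1e-6 * rng.standard_normal(shape)
    # Actual change of the pseudoinverse versus the first-order prediction.
    actual = np.linalg.pinv(A + dA) - np.linalg.pinv(A)
    predicted = pinv_differential(A, dA)
    rel_errors[shape] = np.linalg.norm(actual - predicted) / np.linalg.norm(actual)

print(rel_errors)  # both relative errors should be small compared with 1
```

For a tall full-column-rank matrix the term with $(I-GA)$ vanishes, and for a wide full-row-rank matrix the term with $(I-AG)$ vanishes, which is why the sketch tries both shapes.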

First write the L1-norm in terms of the Frobenius product (denoted by a colon) and the sign function. Since $S=\operatorname{sign}(G)$ is piecewise constant, $dS=0$ wherever no entry of $G$ is zero, so finding the differential and gradient is straightforward. $$\eqalign{ L &= \operatorname{sign}(G):G \cr &= S:G \cr\cr dL &= S:dG \cr &= S:(GG^T\,dA^T\,(I-AG) + (I-GA)\,dA^T\,G^TG - G\,dA\,G) \cr &= S:GG^T\,dA^T\,(I-AG) + S:(I-GA)\,dA^T\,G^TG - S:G\,dA\,G \cr &= (I-AG)S^TGG^T:dA + G^TGS^T(I-GA):dA - G^TSG^T:dA \cr &= ((I-AG)S^TGG^T + G^TGS^T(I-GA) - G^TSG^T):dA \cr \cr \frac{\partial L}{\partial A} &= (I-AG)S^TGG^T + G^TGS^T(I-GA) - G^TSG^T \cr\cr }$$ If you are unfamiliar with the Frobenius product, it's an infix operator which is equivalent to the trace $$A:B = \operatorname{tr}(A^TB)$$
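The final formula can be compared against a central finite-difference gradient. The NumPy sketch below is one way to do that; the function names `grad_l1_pinv` and `numerical_grad`, the step size and the random test matrices are assumptions made for illustration, and the comparison is only meaningful when no entry of $G$ is so close to zero that the perturbation flips its sign.

```python
import numpy as np

def l1_of_pinv(A):
    """Entrywise 1-norm of the pseudoinverse of A."""
    return np.abs(np.linalg.pinv(A)).sum()

def grad_l1_pinv(A):
    """Closed-form gradient of l1_of_pinv, following the derivation above."""
    n, m = A.shape
    G = np.linalg.pinv(A)              # shape (m, n)
    S = np.where(G >= 0, 1.0, -1.0)    # sign with sign(0) = +1, as defined above
    return ((np.eye(n) - A @ G) @ S.T @ G @ G.T
            + G.T @ G @ S.T @ (np.eye(m) - G @ A)
            - G.T @ S @ G.T)

def numerical_grad(f, A, eps=1e-6):
    """Central finite differences, one entry of A at a time."""
    g = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        E = np.zeros_like(A)
        E[idx] = eps
        g[idx] = (f(A + E) - f(A - E)) / (2 * eps)
    return g

rng = np.random.default_rng(1)
max_errors = {}
for shape in [(5, 3), (3, 5)]:         # one tall and one wide matrix
    A = rng.standard_normal(shape)
    diff = grad_l1_pinv(A) - numerical_grad(l1_of_pinv, A)
    max_errors[shape] = np.abs(diff).max()

print(max_errors)  # largest entrywise mismatch for each shape
```

The gradient has the same shape as $A$, so it can be used directly in a gradient-descent update on $A$.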

  • Thank you. It seems I would have had more luck if I had not missed: $\operatorname{tr}(X^{\mathrm{T}}Y)=\sum_{i,j}X_{ij}Y_{ij}$ 2017-01-26
  • The numerical gradient checks out as well. 2017-01-26