
I need to calculate $\frac{\partial f(X)}{\partial X}$ for $f(X)=\operatorname{tr}\big[ C(X^\top BX)^{-1}\big]$, where $X$, $B$, and $C$ are matrices.

I tried writing it as $g(U)=\operatorname{tr}(CU^{-1})$ with $U=X^\top BX$ and then using the chain rule entry by entry: $$\frac{\partial f(X)}{\partial X_{ij}}=\operatorname{tr}\Big(\frac{\partial g(U)}{\partial U}\,\frac{\partial U}{\partial X_{ij}} \Big).$$

I think the first part should be $\frac{\partial g(U)}{\partial U}=-U^{-1}CU^{-1}$, but I'm not sure how to derive the second part.

1 Answer


Define the matrices $$\eqalign{ M &= X^TBX \cr P &= M^{-1}CM^{-1} \cr }$$ and write the function in terms of $M$ and the Frobenius inner product (denoted by a colon) $$\eqalign{ f &= C^T:M^{-1} \cr }$$

Now find the differential and gradient, using $d(M^{-1}) = -M^{-1}\,dM\,M^{-1}$ and $dM = dX^TBX+X^TB\,dX$: $$\eqalign{ df &= -C^T:M^{-1}\,dM\,M^{-1} \cr &= -M^{-T}C^TM^{-T}:dM \cr &= -P^T:dM \cr &= -P^T:(dX^TBX+X^TB\,dX) \cr &= -P^TX^TB^T:dX^T \,\,-\,\, B^TXP^T:dX \cr &= -\big(BXP + B^TXP^T\big):dX \cr \frac{\partial f}{\partial X} &= -\big(BXP + B^TXP^T\big) \cr &= -BXM^{-1}CM^{-1} - B^TXM^{-T}C^TM^{-T} \cr }$$

Knowing the trace equivalent of the inner product $$\eqalign{ A:B &= {\rm tr}(A^TB) \cr }$$ can be useful for rearranging terms, e.g. $$\eqalign{ A:BC &= B^TA:C \cr &= AC^T:B \cr }$$ and so on. (In these two identities $A$, $B$, $C$ stand for arbitrary conformable matrices, not the ones in the question.)
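
As a sanity check, the closed-form gradient can be compared against central finite differences. The following is only a minimal NumPy sketch: the function names `f`, `grad_f`, and `numerical_grad`, the dimensions, and the random test matrices are assumptions made for illustration and do not come from the question. $B$ and $C$ are deliberately chosen non-symmetric so that both terms of the gradient are exercised.

```python
import numpy as np

def f(X, B, C):
    """f(X) = tr(C (X^T B X)^{-1})."""
    M = X.T @ B @ X
    return np.trace(C @ np.linalg.inv(M))

def grad_f(X, B, C):
    """Closed-form gradient -(B X P + B^T X P^T), with P = M^{-1} C M^{-1}."""
    M_inv = np.linalg.inv(X.T @ B @ X)
    P = M_inv @ C @ M_inv
    return -(B @ X @ P + B.T @ X @ P.T)

def numerical_grad(X, B, C, eps=1e-6):
    """Central finite differences, perturbing one entry of X at a time."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (f(X + E, B, C) - f(X - E, B, C)) / (2 * eps)
    return G

# Hypothetical test data (not from the question).
rng = np.random.default_rng(0)
n, k = 5, 3
X = rng.standard_normal((n, k))
A = rng.standard_normal((n, n))
K = rng.standard_normal((n, n))
# Positive-definite symmetric part plus a skew-symmetric part:
# B is non-symmetric, yet X^T B X stays invertible for full-rank X.
B = A @ A.T + n * np.eye(n) + (K - K.T)
C = rng.standard_normal((k, k))

max_err = np.max(np.abs(grad_f(X, B, C) - numerical_grad(X, B, C)))
print("max abs difference:", max_err)
```

If $B$ and $C$ are both symmetric, then $M$ and $P$ are symmetric too, and the result collapses to $-2BXP$.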