
How can I prove that

$\frac{\partial \mathrm{Trace}(ABA^T)}{\partial A}= AB+AB^T$

if $A$ and $B$ are square matrices? Can I write the following:

Let $C=BA^T$. Then,

$\frac {\partial \mathrm{Trace}(AC)}{\partial A}+A\frac {\partial \mathrm{Trace}(C)}{\partial A} = C^T + AB = AB^T + AB$

Also, what is $\frac{\partial \mathrm{Trace}(A^TBA)}{\partial A}$?

Thank you~

  • Your notation is highly nonstandard, so you should explain the definition of the object $\partial\varphi(A)/\partial A$ for $\varphi:M_{n\times n}(\mathbb{R})\to\mathbb{R}$ and $A$ in $M_{n\times n}(\mathbb{R})$. Or read this: http://en.wikipedia.org/wiki/Differential_(infinitesimal)#Differentials_as_linear_maps (2011-06-18)

1 Answer

Just write out the indices. In the formulae below, repeated indices are summed over automatically (the Einstein summation convention). Note that ${\rm Tr}\, M = M_{ii}$ and $(MN)_{ij} = M_{ik}N_{kj}$.
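As a small illustration of the two index identities above, here is a sketch using NumPy's `np.einsum`, whose subscript strings follow the same repeated-index convention. The matrices `M` and `N` and the seed are arbitrary choices made for this example, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility
M = rng.standard_normal((3, 3))
N = rng.standard_normal((3, 3))

# Tr M = M_ii: the repeated index i is summed over, leaving a scalar.
trace_einsum = np.einsum("ii->", M)

# (MN)_ij = M_ik N_kj: the repeated index k is summed over, i and j stay free.
prod_einsum = np.einsum("ik,kj->ij", M, N)

trace_matches = np.isclose(trace_einsum, np.trace(M))
prod_matches = np.allclose(prod_einsum, M @ N)
print(trace_matches, prod_matches)
```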

The derivative of the first trace is

$ \frac{\partial A_{ab}B_{bc}A^T_{ca} }{\partial A_{ij}} = \frac{\partial A_{ab}B_{bc}A_{ac} }{\partial A_{ij}} = \delta_{ai}\delta_{bj} B_{bc} A_{ac} + A_{ab} B_{bc} \delta_{ai}\delta_{cj} = B_{jc}A_{ic} + A_{ib}B_{bj} = (AB^T)_{ij} + (AB)_{ij} $

so by removing the indices $ij$ again, you see that the "matrix derivative" is $AB^T+AB$. Similarly,

$ \frac{\partial A^T_{ab}B_{bc}A_{ca} }{\partial A_{ij}} = \frac{\partial A_{ba}B_{bc}A_{ca} }{\partial A_{ij}} = \delta_{bi}\delta_{aj} B_{bc} A_{ca} + A_{ba} B_{bc} \delta_{ci}\delta_{aj} = B_{ic}A_{cj} + A_{bj}B_{bi} = (BA)_{ij} + (B^T A)_{ij} $

so the second result is $BA+B^TA$.
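Both results can be sanity-checked numerically. The sketch below is not part of the original derivation: it assumes the entrywise convention $(\partial \varphi/\partial A)_{ij} = \partial\varphi/\partial A_{ij}$ used above, and compares central finite differences against $AB^T+AB$ and $BA+B^TA$ for random matrices. The helper name `numerical_gradient`, the size `n`, and the step `eps` are illustrative choices.

```python
import numpy as np

def numerical_gradient(f, A, eps=1e-6):
    """Central finite-difference estimate of d f(A) / d A_ij for a scalar-valued f."""
    grad = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            step = np.zeros_like(A)
            step[i, j] = eps  # perturb only the (i, j) entry
            grad[i, j] = (f(A + step) - f(A - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))  # B is deliberately not symmetric

# d Tr(A B A^T) / dA, compared with A B^T + A B
grad_first = numerical_gradient(lambda X: np.trace(X @ B @ X.T), A)
# d Tr(A^T B A) / dA, compared with B A + B^T A
grad_second = numerical_gradient(lambda X: np.trace(X.T @ B @ X), A)

first_ok = np.allclose(grad_first, A @ B.T + A @ B, atol=1e-6)
second_ok = np.allclose(grad_second, B @ A + B.T @ A, atol=1e-6)
print(first_ok, second_ok)
```

Because both traces are quadratic in $A$, the central difference has no truncation error, so any mismatch beyond rounding would point to an error in the closed-form expression.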