13

Could someone explain this equation?

$$ \frac{d \operatorname{tr}(AXB)}{d X} = BA $$

I understand that

$$ d\operatorname{tr}(AXB) = \operatorname{tr}(BA \; dX) $$

but I don't quite understand how to move $dX$ out of the trace.

3 Answers

5

Try expanding to linear order. This usually makes things easier to understand:

$$\operatorname{tr}(A (X+dX)B)=A_{ij} (X_{jk}+dX_{jk})B_{ki}$$

where Einstein's summation convention is used. Subtracting $\operatorname{tr}(AXB)$ you get

$$\begin{align} d\operatorname{tr}(AXB)&=\operatorname{tr}(A(X+dX)B)-\operatorname{tr}(AXB)\\&=A_{ij} dX_{jk}B_{ki}=\underbrace{B_{ki}A_{ij}}_{=(BA)_{kj}} \; dX_{jk} \end{align}$$
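
As a rough numerical sanity check of the identity above, here is a minimal sketch in plain Python. The helper names (`matmul`, `trace`, `rand_matrix`) and the matrix sizes are assumptions made for illustration; nothing here comes from the original post. It compares $\operatorname{tr}(A(X+dX)B)-\operatorname{tr}(AXB)$ with the index sum $(BA)_{kj}\,dX_{jk}$ for random matrices.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def rand_matrix(rows, cols, rng, scale=1.0):
    """Random matrix with entries in [-scale, scale]."""
    return [[scale * rng.uniform(-1.0, 1.0) for _ in range(cols)]
            for _ in range(rows)]

rng = random.Random(0)
p, m, n = 2, 3, 4            # illustrative sizes: A is p x m, X is m x n, B is n x p
A = rand_matrix(p, m, rng)
X = rand_matrix(m, n, rng)
B = rand_matrix(n, p, rng)
dX = rand_matrix(m, n, rng, scale=1e-3)   # a small perturbation of X

X_plus = [[X[j][k] + dX[j][k] for k in range(n)] for j in range(m)]

# Left side: tr(A (X + dX) B) - tr(A X B)
lhs = trace(matmul(matmul(A, X_plus), B)) - trace(matmul(matmul(A, X), B))

# Right side: (BA)_{kj} dX_{jk}, summed over j and k, which is tr(BA dX)
BA = matmul(B, A)            # n x m
rhs = sum(BA[k][j] * dX[j][k] for j in range(m) for k in range(n))

print(abs(lhs - rhs))        # should be at floating-point round-off level
```

Because the map is linear in $X$, the two sides agree for any size of $dX$, not only for small perturbations; the non-square shapes are chosen so that $BA$ ($n \times m$) and $X$ ($m \times n$) visibly have transposed shapes.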

  • 0
    When you say 'expanding to linear order', do you mean writing out the actual matrix element summations? I had to do that manually to prove $$\operatorname{tr}(A (X+dX)B)=A_{ij} (X_{jk}+dX_{jk})B_{ki}$$ to myself. Is that something that's easily derived without explicitly expanding the matrices, or is this something people generally just memorize about traces? 2012-03-11
  • 0
    What I meant was writing $X+dX$ and working with that, keeping only terms that are linear in $dX$. In this case, everything was linear to begin with, so my comment was a bit misleading. But it is the right way to work when deriving more complicated tensorial derivatives like, say, $$\frac{\partial \det (A)}{\partial A}=\det(A) \left(A^{-1}\right)^T$$ 2012-03-12
2

The notation is quite misleading (at least for me).

Hint:

Does it make sense that $$\frac{\partial}{\partial X_{mn}} \mathop{\rm tr} (A X B) = (B A)_{nm}?$$

More information: $$\frac{\partial}{\partial X_{mn}} \mathop{\rm tr} (A X B) = \frac{\partial}{\partial X_{mn}} \sum_{jkl} A_{jk} X_{kl} B_{lj} = \sum_{jkl} A_{jk} \delta_{km} \delta_{nl} B_{lj} = \sum_{j} A_{jm} B_{nj} =(B A)_{nm}. $$
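
The entrywise formula can also be checked numerically. The following is a small sketch, again in plain Python with hypothetical helper names (`matmul`, `trace`, `f`) and arbitrary sizes chosen for illustration: it estimates $\partial \operatorname{tr}(AXB)/\partial X_{mn}$ by a forward difference and compares it with $(BA)_{nm}$.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def f(A, X, B):
    """The scalar function X -> tr(A X B)."""
    return trace(matmul(matmul(A, X), B))

rng = random.Random(1)
p, m, n = 2, 3, 4            # illustrative sizes: A is p x m, X is m x n, B is n x p
A = [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(p)]
X = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
B = [[rng.uniform(-1, 1) for _ in range(p)] for _ in range(n)]

h = 1e-4                     # step size; f is linear in X, so the choice is not critical
grad = [[0.0] * n for _ in range(m)]   # grad[r][c] approximates d f / d X_{rc}
for r in range(m):
    for c in range(n):
        X_shift = [row[:] for row in X]
        X_shift[r][c] += h
        grad[r][c] = (f(A, X_shift, B) - f(A, X, B)) / h

BA = matmul(B, A)            # n x m, so it is indexed as BA[c][r]
max_err = max(abs(grad[r][c] - BA[c][r])
              for r in range(m) for c in range(n))
print(max_err)               # should be small: d f / d X_{rc} matches (BA)_{cr}
```

The gradient laid out in the shape of $X$ is therefore $(BA)^T$, while $BA$ itself is the layout indexed by the transposed position, which is the source of the notational ambiguity.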

  • 0
    That does make sense if I assume the first equation for $\frac{d\operatorname{tr}(AXB)}{dX}$. But I'm not sure how to get to the first equation using the second equation for $d\operatorname{tr}(AXB)$. 2012-03-11
  • 0
    @ChrisD: I added a line explaining how to use my hint. 2012-03-11
  • 0
    I see what you mean by misleading notation. The resulting matrix is indexed by the transpose of the matrix you differentiate by, $X$. Therefore, the full matrix solution $BA$ is the transpose of the element-by-element solution $(BA)_{nm}$. Thanks for the help. 2012-03-11
0

The other answers are correct, but I feel like they missed the point. Arguments that take a basis to prove a result independent of bases should be approached with caution.


First of all, according to the Matrix Cookbook, the formula is $$ \frac{d\,\mathrm{tr}(AXB)}{dX} = (BA)^T,$$ not the one given in your question.

What's confusing about this presentation is that $f(X) = \mathrm{tr}(AXB)$ is a linear map, so its derivative (= linear approximation) is itself.

Taken literally, the statement would therefore read $$ f(X) = \mathrm{tr}(AXB) = (BA)^T,$$ which is clearly wrong: the left side is a scalar and the right side is a matrix.

But consider the Frobenius inner product on $\mathrm{Mat}(m, n)$. For $U, V \in \mathrm{Mat}(m, n)$:

$$\langle U, V \rangle = \mathrm{tr}(U^T V).$$

By the Riesz representation theorem, $f$ can be represented as

$$f(X) = \langle U, X \rangle = \mathrm{tr}(U^TX)$$

for a fixed $U \in \mathrm{Mat}(m, n)$.

Clearly $U = (BA)^T$ does the job, so the more precise statement is

$$\mathrm{tr}(AXB) = \langle (BA)^T, X \rangle,$$

which is a triviality.
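
For completeness, the triviality can be spelled out in one line, assuming only the definition of the Frobenius inner product given above and the cyclic property of the trace:

$$\langle (BA)^T, X \rangle = \mathrm{tr}\!\left(\left((BA)^T\right)^T X\right) = \mathrm{tr}(BAX) = \mathrm{tr}(AXB).$$

Here the last step moves $B$ from the front of the product to the back, which leaves the trace unchanged.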