
Given the following definition:

[image: definition of the matrix gradient $\nabla_A f(A)$, whose $(i,j)$ entry is $\frac{\partial f}{\partial A_{ij}}$]

How can I prove these two equations?

$$\nabla_A \textrm{tr}(ABA^TC) = CAB + C^TAB^T$$

and

$$\nabla_A |A| = |A|(A^{-1})^T$$

PS:

Actually, two other identities precede these two in the source (I have no problem proving them): [image]

Maybe they are hints for proving the latter two.

I encountered this problem here (Section 2.1, around pages 8 to 9).

2 Answers

2

Written out in index notation (a personal preference), the first equation is a simple application of the product rule (see Einstein notation; repeated indices are summed over):

\begin{align}
\nabla_{A_{ij}} \delta_{kl} A_{km}B_{mn} A^T_{np} C_{pl}
&= \nabla_{A_{ij}} A_{km}B_{mn} A_{pn} C_{pk} \\
&= (\nabla_{A_{ij}} A_{km})B_{mn} A_{pn} C_{pk} + (\nabla_{A_{ij}}A_{pn}) A_{km}B_{mn}C_{pk} \\
&= (\delta_{ik}\delta_{jm})B_{mn} A_{pn} C_{pk} + (\delta_{ip}\delta_{jn}) A_{km}B_{mn} C_{pk} \\
&= B_{jn} A_{pn} C_{pi} + A_{km} B_{mj} C_{ik} \\
&= (C^T\cdot A\cdot B^T + C\cdot A \cdot B)_{ij}
\end{align}
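If you want to sanity-check the result numerically, here is a minimal pure-Python sketch that compares $C^TAB^T + CAB$ against central finite differences of $\textrm{tr}(ABA^TC)$. The helper names, the step size `h`, and the test matrices are my own choices for illustration and are not from the question.

```python
# Finite-difference check of  grad_A tr(A B A^T C) = C^T A B^T + C A B.
# Helper names and test matrices are illustrative assumptions.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def f(A, B, C):
    # f(A) = tr(A B A^T C)
    return trace(matmul(matmul(matmul(A, B), transpose(A)), C))

def numeric_grad(A, B, C, h=1e-6):
    # Central differences, perturbing one entry A_ij at a time.
    G = [[0.0] * len(A[0]) for _ in A]
    for i in range(len(A)):
        for j in range(len(A[0])):
            Ap = [row[:] for row in A]
            Am = [row[:] for row in A]
            Ap[i][j] += h
            Am[i][j] -= h
            G[i][j] = (f(Ap, B, C) - f(Am, B, C)) / (2 * h)
    return G

def analytic_grad(A, B, C):
    # The closed form derived above: C^T A B^T + C A B.
    T1 = matmul(matmul(transpose(C), A), transpose(B))
    T2 = matmul(matmul(C, A), B)
    return [[T1[i][j] + T2[i][j] for j in range(len(T1[0]))]
            for i in range(len(T1))]

A = [[1.0, 2.0, 0.5], [0.0, -1.0, 3.0]]                    # 2 x 3
B = [[2.0, 1.0, 0.0], [0.5, -1.0, 4.0], [1.0, 0.0, 3.0]]   # 3 x 3
C = [[1.0, -2.0], [3.0, 0.5]]                              # 2 x 2

num = numeric_grad(A, B, C)
ana = analytic_grad(A, B, C)
max_err = max(abs(n - a) for rn, ra in zip(num, ana) for n, a in zip(rn, ra))
print("largest entrywise discrepancy:", max_err)
```

Since $\textrm{tr}(ABA^TC)$ is quadratic in $A$, central differences should agree with the closed form up to rounding error. Note that $B$ and $C$ are deliberately non-symmetric, so the two terms really are different.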

As for the second equation, I've only seen it derived either by rewriting $|A|$ in terms of its eigenvalues and doing some tricks, or from Jacobi's formula. I don't think the two preceding equations give you much to work with here.
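For what it's worth, here is a sketch of how the Jacobi's formula route would go. Jacobi's formula says $d\det A = \textrm{tr}(\textrm{adj}(A)\,dA)$. The symbol $E_{ij}$, for the matrix with a $1$ in position $(i,j)$ and zeros elsewhere, is my own notation.

\begin{align}
\frac{\partial \det A}{\partial A_{ij}}
&= \textrm{tr}\left(\textrm{adj}(A)\,\frac{\partial A}{\partial A_{ij}}\right) \\
&= \textrm{tr}\left(\textrm{adj}(A)\,E_{ij}\right) \\
&= \textrm{adj}(A)_{ji}
\end{align}

So $\nabla_A \det A = \textrm{adj}(A)^T$, and for invertible $A$ this equals $(\det A)(A^{-1})^T$ because $\textrm{adj}(A) = (\det A)A^{-1}$.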

2

Let $X=Y=A$. By the cyclic property of the trace and its invariance under transposition, we have $(\ast): \textrm{tr} XBY^TC = \textrm{tr} CXBY^T = \textrm{tr} (CXBY^T)^T = \textrm{tr} Y(CXB)^T$. So, using $\nabla_X \textrm{tr} XM = M^T$ in the last step,

$ \begin{eqnarray*} \nabla_A \textrm{tr} ABA^TC &=& \nabla_X \textrm{tr} XBY^TC + \nabla_Y \textrm{tr} XBY^TC \quad\textrm{(by chain rule)}\\ &=& \nabla_X \textrm{tr} X(BY^TC) + \nabla_Y \textrm{tr} Y(CXB)^T\quad(\textrm{by }(\ast))\\ &=& (BY^TC)^T + CXB = C^TAB^T + CAB \end{eqnarray*} $

and we get the first result.

The second result is more straightforward. Recall that for any fixed $i$, by Laplace expansion along row $i$, we have $\det A=\sum_j (-1)^{i+j}A_{ij}M_{ij}(A)$, where $M_{ij}(A)$ denotes the $(i,j)$-minor of $A$. Every minor in this sum is computed with row $i$ deleted, so none of them involves $A_{ij}$, and the only term that depends on $A_{ij}$ is the $j$-th one. Hence $\frac\partial{\partial A_{ij}}\det A=(-1)^{i+j}M_{ij}(A)=C_{ij}(A)$, where $C_{kl}(A)$ denotes the $(k,l)$-cofactor of $A$ (not to be confused with the matrix $C$ in the first result). The gradient is therefore the cofactor matrix, i.e. $\nabla_A\det(A)=\textrm{adj}(A)^T$, which equals $(\det A)(A^{-1})^T$ when $A$ is invertible.
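The determinant result can also be checked numerically. The following is a minimal pure-Python sketch: it builds the cofactor matrix by Laplace expansion, compares it with central finite differences of $\det A$, and confirms that $A\,\textrm{adj}(A) = (\det A) I$. The function names and the $3\times 3$ test matrix are assumptions made for illustration only.

```python
# Check that the gradient of det(A) is the cofactor matrix, i.e. adj(A)^T.
# Function names and the test matrix are illustrative assumptions.

def submatrix(A, i, j):
    # A with row i and column j deleted.
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    # Laplace expansion along the first row (fine for small matrices).
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(submatrix(A, 0, j))
               for j in range(len(A)))

def cofactor_matrix(A):
    # Entry (i, j) is (-1)^(i+j) times the (i, j)-minor.
    n = len(A)
    return [[(-1) ** (i + j) * det(submatrix(A, i, j)) for j in range(n)]
            for i in range(n)]

def numeric_grad_det(A, h=1e-6):
    # Central differences, perturbing one entry A_ij at a time.
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            Ap = [row[:] for row in A]
            Am = [row[:] for row in A]
            Ap[i][j] += h
            Am[i][j] -= h
            G[i][j] = (det(Ap) - det(Am)) / (2 * h)
    return G

A = [[2.0, 1.0, 0.0], [1.0, 3.0, -1.0], [0.5, -2.0, 4.0]]
n = len(A)
cof = cofactor_matrix(A)
num = numeric_grad_det(A)
grad_err = max(abs(cof[i][j] - num[i][j]) for i in range(n) for j in range(n))

# adj(A) = cof^T, so (A adj(A))_ij = sum_k A_ik cof_jk should be det(A) * delta_ij.
d = det(A)
adj_err = max(abs(sum(A[i][k] * cof[j][k] for k in range(n)) - (d if i == j else 0.0))
              for i in range(n) for j in range(n))
print("gradient vs cofactors:", grad_err, " A adj(A) vs det(A) I:", adj_err)
```

Because $\det A$ is linear in each individual entry, the central difference is exact up to rounding. The second check is what justifies rewriting $\textrm{adj}(A)^T$ as $(\det A)(A^{-1})^T$ for invertible $A$.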

  • 1
    Here "chain rule" means that for the function $g(X,Y)=\textrm{tr}XBY^TC$, we have $\frac{\partial g}{\partial a_{ij}} = \frac{\partial g}{\partial x_{ij}}\frac{\partial x_{ij}}{\partial a_{ij}} + \frac{\partial g}{\partial y_{ij}}\frac{\partial y_{ij}}{\partial a_{ij}}$. As $a_{ij}=x_{ij}=y_{ij}$, both inner derivatives equal $1$, so we get $\frac{\partial g}{\partial a_{ij}} = \frac{\partial g}{\partial x_{ij}} + \frac{\partial g}{\partial y_{ij}}$. This holds for each pair $(i,j)$. Hence $\nabla_Ag=\nabla_Xg+\nabla_Yg$.
    2011-09-26