
Let $F(m)=m^T m$ with $m$ an $n\times n$ matrix. I came across the statement $D_I F(m)=m^T+m$, where $D_I F$ means the derivative of $F$ at the identity matrix.

I cannot understand how this emerges. I can imagine taking the directional derivative of $F$ at $I$ by $\lim_{\lambda\rightarrow0} \frac{F(I+\lambda m)-F(I)}{\lambda}=\lim_{\lambda\rightarrow0} \frac{\lambda (m^T+m)+\lambda^2m^T m}{\lambda}=m^T+m.$

Is that the meaning of $D_I F(m)$? Is there a way to define a "general" derivative of a map from the space of matrices to the space of matrices?

1 Answer


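We have $F(m+h)-F(m)=(m+h)^t(m+h)-m^tm=\color{red}{m^tm}+m^th+h^tm+h^th\color{red}{-m^tm}=m^th+h^tm+h^th.$ The map $h\mapsto m^th+h^tm$ is linear, and for any submultiplicative norm on the space of $n\times n$ matrices we have $\frac{\lVert h^th\rVert}{\lVert h\rVert}\leq \lVert h^t\rVert$, which tends to $0$ as $\lVert h\rVert\to 0$ because transposition is continuous. Hence $D_mF(h)=m^th+h^tm$. Indeed, $\lim_{\lVert h\rVert\to 0}\frac{\lVert F(m+h)-F(m)-D_mF(h)\rVert}{\lVert h\rVert}=\lim_{\lVert h\rVert\to 0}\frac{\lVert h^th\rVert}{\lVert h\rVert}=0.$ In particular, at $m=I$ this gives $D_IF(h)=h^t+h$, which is the formula in the question.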
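
If it helps, the formula $D_mF(h)=m^th+h^tm$ can be sanity-checked numerically. The following is only a sketch, assuming NumPy and the Frobenius norm (which is submultiplicative); the names `F`, `DF`, `eps_values` and `ratios` are my own, chosen for illustration. It computes the ratio $\lVert F(m+h)-F(m)-D_mF(h)\rVert/\lVert h\rVert$ for shrinking $h$ and also evaluates the derivative at the identity.

```python
import numpy as np

def F(m):
    # The map under discussion: F(m) = m^T m
    return m.T @ m

def DF(m, h):
    # Candidate derivative of F at m, applied to the direction h
    return m.T @ h + h.T @ m

rng = np.random.default_rng(0)
n = 4
m = rng.standard_normal((n, n))
h = rng.standard_normal((n, n))

# Shrink the perturbation and record ||F(m+s) - F(m) - DF(m, s)|| / ||s||.
# np.linalg.norm on a 2-D array defaults to the Frobenius norm.
eps_values = (1e-1, 1e-2, 1e-3, 1e-4)
ratios = []
for eps in eps_values:
    step = eps * h
    remainder = F(m + step) - F(m) - DF(m, step)
    ratios.append(np.linalg.norm(remainder) / np.linalg.norm(step))

# At m = I the derivative reduces to h^T + h, as in the question.
identity = np.eye(n)
at_identity = DF(identity, h)

for eps, r in zip(eps_values, ratios):
    print(f"eps = {eps:g}, ratio = {r:.3e}")
```

The printed ratios should shrink roughly in proportion to `eps`, consistent with the remainder being exactly $h^th$, and `at_identity` should agree with `h.T + h` up to rounding.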