
I have run into a difficulty in my research involving the derivative of a matrix function, and I would greatly appreciate your expertise on this.

Let $\mathbf{x}=\operatorname{vec}\left( X \right)$, where $X$ is a $T\times r$ matrix of unknown parameters ($T$ and $r$ are constants, $T\gg r$) and $\operatorname{vec}$ denotes vectorization. Let \begin{equation} \mathbf{H}\left( \mathbf{X}\right) =\left( X ,\mathbf{I}_{T}\right) ^{\prime }, \label{def_H_theta} \end{equation} where $\mathbf{I}_T$ is the $T \times T$ identity matrix, and let \begin{equation} \Omega \left( \mathbf{X}\right) _{T\times T}=\left( \mathbf{I}_{T}+XX ^{\prime }\right) . \label{var-cov_TS_multi} \end{equation} What are the derivatives \begin{equation} \frac{\partial \log \left\vert \Omega \left( \mathbf{X}\right) \right\vert }{\partial \mathbf{x}}=?? \end{equation} and \begin{equation} \frac{\partial \operatorname{tr}\left[ \mathbf{H}\left( \mathbf{X}\right) ^{\prime }\Omega \left( \mathbf{X}\right) ^{-1}\mathbf{H}\left( \mathbf{X}\right) \right] }{\partial \mathbf{x}}=?? \end{equation} Here $|A|$ denotes the determinant of $A$ and $\operatorname{tr}$ is the trace operator. Many thanks!

1 Answer

For typing convenience, let me substitute Latin in place of your Greek letters $$\eqalign{ X &= \Xi \cr x &= \vartheta = \operatorname{vec}(X) \cr W &= \Omega = I + XX^T = W^T \cr\cr }$$

Then for your first function, the differential and gradient can be calculated as $$\eqalign{ f &= \log\det W \cr \cr df &= d\log\det W = d\operatorname{tr}\log W \cr &= W^{-T}:dW \cr &= W^{-T}:(dX\,X^T+X\,dX^T) \cr &= \big(W^{-T}X+W^{-1}X\big):dX \cr &= 2\,W^{-1}X:dX \cr &= 2\,\operatorname{vec}(W^{-1}X)\cdot\,dx \cr \cr \frac{\partial f}{\partial x} &= 2\,\operatorname{vec}(W^{-1}X) \cr\cr }$$ In your second function, I don't quite understand the definition of $\,H(\vartheta)\,$, so all I can offer is a partial solution $$\eqalign{ M &= W^{-1} = M^T \cr f &= H:MH \cr \cr df &= dH:MH + H:M\,dH + H:dM\,H \cr &= (M+M^T)H:dH + HH^T:dM \cr &= 2\,MH:dH - HH^T:M\,dW\,M \cr &= 2\,MH:dH - MHH^TM:dW \cr &= 2\,MH:dH - MHH^TM:(dX\,X^T+X\,dX^T) \cr &= 2\,MH:dH - 2\,MHH^TMX:dX \cr \cr }$$ where the minus sign comes from $dM = -M\,dW\,M$. You can finish off the solution by expanding $dH$ in terms of $dX$, then vectorizing.
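As a sanity check on the first gradient, here is a minimal sketch in plain Python that compares $2\,W^{-1}X$ against central finite differences of $\log\det W$. The helper names (`omega`, `solve_and_logdet`, `vec`, and so on), the sizes $T=5$, $r=2$, and the random test matrix are illustrative assumptions, not part of the question. The elimination routine skips pivoting, which is only safe because $W = I + XX^T$ is symmetric positive definite.

```python
import math
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def omega(X):
    """W = I + X X^T for a T x r matrix X."""
    Xt = [list(col) for col in zip(*X)]
    W = matmul(X, Xt)
    for i in range(len(W)):
        W[i][i] += 1.0
    return W

def solve_and_logdet(W, B):
    """Return (W^{-1} B, log det W) by Gauss-Jordan elimination.

    No pivoting: W is assumed symmetric positive definite, so every pivot is positive.
    """
    n = len(W)
    A = [list(W[i]) + list(B[i]) for i in range(n)]  # augmented matrix [W | B]
    logdet = 0.0
    for k in range(n):
        pivot = A[k][k]
        logdet += math.log(pivot)
        A[k] = [v / pivot for v in A[k]]
        for i in range(n):
            if i != k:
                factor = A[i][k]
                A[i] = [vi - factor * vk for vi, vk in zip(A[i], A[k])]
    return [row[n:] for row in A], logdet

def logdet_omega(X):
    """log det(I + X X^T)."""
    W = omega(X)
    _, logdet = solve_and_logdet(W, [[0.0] for _ in W])
    return logdet

def analytic_gradient(X):
    """2 W^{-1} X, the matrix form of the gradient derived above."""
    WinvX, _ = solve_and_logdet(omega(X), X)
    return [[2.0 * v for v in row] for row in WinvX]

def numeric_gradient(X, h=1e-6):
    """Central finite differences of log det W with respect to each entry of X."""
    G = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            old = X[i][j]
            X[i][j] = old + h
            up = logdet_omega(X)
            X[i][j] = old - h
            down = logdet_omega(X)
            X[i][j] = old  # restore the entry
            G[i][j] = (up - down) / (2.0 * h)
    return G

def vec(A):
    """Stack the columns of A into one list (column-major order)."""
    return [v for col in zip(*A) for v in col]

random.seed(0)
T, r = 5, 2  # illustrative sizes only
X = [[random.uniform(-1.0, 1.0) for _ in range(r)] for _ in range(T)]
max_err = max(abs(a - b) for a, b in
              zip(vec(analytic_gradient(X)), vec(numeric_gradient(X))))
print(max_err)  # should be small if the formula is right
```

The printed value is the largest entrywise gap between the analytic and numerical gradients; it should sit at the level of finite-difference noise.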

In the above, a colon denotes the double-dot (aka Frobenius) product, which is merely a product notation for the trace, i.e. $$A:B=\operatorname{tr}(A^TB)$$
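For reference, the manipulations above appear to rely only on a few standard properties of this product together with two standard differentials. They are collected here as a sketch, with $A$, $B$, $C$ standing for arbitrary conformable matrices (these names are not in the question): $$\eqalign{ A:B &= B:A = A^T:B^T \cr A:BC &= B^TA:C = AC^T:B \cr d\log\det W &= W^{-T}:dW \cr d(W^{-1}) &= -W^{-1}\,dW\,W^{-1} \cr }$$ For example, $W^{-T}:X\,dX^T = X^TW^{-T}:dX^T = W^{-1}X:dX$, which is how the second term of the first differential is moved onto $dX$.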


Update

I'm guessing that $H$ is a partitioned matrix: $H=[\,X, I\,]$

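Expanding that term in the differential $$\eqalign{ MH:dH &= [\,MX, M\,]:[\,dX, 0\,] \cr &= MX:dX + M:0 \cr &= MX:dX \cr }$$ The $dW$ term enters with a minus sign because $dM=-M\,dW\,M$, so that $HH^T:dM = -MHH^TM:dW = -2\,MHH^TMX:dX$. Continuing with the full differential $$\eqalign{ df &= 2\,\Big(MX - MHH^TMX\Big):dX \cr &= 2\,\operatorname{vec}(MX - MHH^TMX)\cdot dx \cr \cr \frac{\partial f}{\partial x} &= 2\,\operatorname{vec}(MX - MHH^TMX) \cr }$$ With this choice of $H$ we have $HH^T = XX^T + I = W$, hence $MHH^TM = M$ and the gradient vanishes identically. This agrees with the direct evaluation $f = \operatorname{tr}(MHH^T) = \operatorname{tr}(I_T) = T$, which does not depend on $X$.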
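Under the guess $H=[\,X, I\,]$, one has $HH^T = XX^T + I = W$, so $\operatorname{tr}(H^TW^{-1}H) = \operatorname{tr}(W^{-1}HH^T) = T$ for every $X$ and the gradient of the second function should be zero. The following plain-Python sketch checks this numerically; the helper names, the sizes $T=5$, $r=2$, and the random inputs are assumptions made for illustration, and the inverse routine skips pivoting on the assumption that $W$ is symmetric positive definite.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def inverse_spd(W):
    """Invert a symmetric positive definite matrix by Gauss-Jordan elimination.

    No pivoting is used; this relies on W being positive definite.
    """
    n = len(W)
    eye = identity(n)
    A = [list(W[i]) + eye[i] for i in range(n)]  # augmented matrix [W | I]
    for k in range(n):
        pivot = A[k][k]
        A[k] = [v / pivot for v in A[k]]
        for i in range(n):
            if i != k:
                factor = A[i][k]
                A[i] = [vi - factor * vk for vi, vk in zip(A[i], A[k])]
    return [row[n:] for row in A]

def second_function(X):
    """tr(H^T W^{-1} H) with H = [X, I] (assumed) and W = I + X X^T."""
    T = len(X)
    eye = identity(T)
    H = [list(X[i]) + eye[i] for i in range(T)]  # T x (r + T)
    W = matmul(X, transpose(X))
    for i in range(T):
        W[i][i] += 1.0
    M = inverse_spd(W)
    S = matmul(transpose(H), matmul(M, H))  # (r + T) x (r + T)
    return sum(S[i][i] for i in range(len(S)))

random.seed(1)
T, r = 5, 2  # illustrative sizes only
values = [
    second_function([[random.uniform(-2.0, 2.0) for _ in range(r)] for _ in range(T)])
    for _ in range(3)
]
print(values)  # each entry should be close to T
```

Since the function takes the same value at unrelated random inputs, a finite-difference gradient of it would be zero up to rounding, which matches $MX - MHH^TMX = 0$.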

  • Hi Greg, many thanks for your very detailed solution, which is very helpful. I have revised the question to use your notation. H is basically a matrix containing X and an identity matrix; will this simplify the solution? (2017-02-10)
  • Many thanks, Greg, you are indeed very helpful. Could you please let me know your full name so that I can thank you in my paper? Thanks!! (2017-02-10)