
I am quite unsure about this whole matter of differentiation with respect to a matrix. First, I'd like a good (hopefully online) reference for getting up to speed on the theory - as opposed to a bunch of results. Essentially I'm needing to optimize a function of a symmetric matrix $\Sigma$. A special case that I think would be informative for me to see worked is the case where $\ell$ is the normal log-likelihood $\ell(\mu, \Sigma) = K - \frac n 2 \log|\Sigma| - \frac{\sum_{i = 1} ^ n (y_i - \mu)^T \Sigma^{-1} (y_i - \mu)}{2}$ where $\mu, y_i \in \mathbb R^{t}$ and $y_1, ..., y_n$ can be regarded as fixed. Of course, $\Sigma$ is restricted to be symmetric. The goal is to optimize $\ell$ wrt $\mu, \Sigma$. It's easy to see $\mu$ is optimized by $\bar y = \frac 1 n \sum y_i$ independently of $\Sigma$, so only the calculations involving $\Sigma$ are needed. The answer should end up being the usual (biased) empirical covariance matrix.
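As a sanity check on the claimed answer, here is a small numerical sketch (my own, not part of the question): it evaluates the log-likelihood above (dropping the constant $K$) and verifies that the biased empirical covariance $\hat\Sigma = \frac 1 n \sum_i (y_i - \bar y)(y_i - \bar y)^T$ is not beaten by random symmetric perturbations of it. The data-generating choices (`n`, `t`, the RNG seed) are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 200, 3
Y = rng.normal(size=(n, t))  # play the role of the fixed y_1, ..., y_n

def loglik(mu, Sigma, Y):
    """Normal log-likelihood, up to the additive constant K."""
    n = Y.shape[0]
    diff = Y - mu
    _, logdet = np.linalg.slogdet(Sigma)
    # sum_i (y_i - mu)^T Sigma^{-1} (y_i - mu)
    quad = np.einsum('ij,jk,ik->', diff, np.linalg.inv(Sigma), diff)
    return -0.5 * n * logdet - 0.5 * quad

mu_hat = Y.mean(axis=0)                # = y-bar, independent of Sigma
diff = Y - mu_hat
Sigma_hat = diff.T @ diff / n          # biased empirical covariance

best = loglik(mu_hat, Sigma_hat, Y)
for _ in range(20):
    A = 0.05 * rng.normal(size=(t, t))
    perturbed = Sigma_hat + (A + A.T) / 2   # keep the perturbation symmetric
    if np.all(np.linalg.eigvalsh(perturbed) > 0):  # stay positive definite
        assert loglik(mu_hat, perturbed, Y) <= best + 1e-9
```

Of course this only probes a neighborhood of $\hat\Sigma$; the actual argument needs the matrix calculus the question asks about.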

1 Answer


I have tracked down a couple of papers which both provide a bunch of formulas specific to symmetric and symmetric positive-definite matrices, develop a bit of the theory, and work this precise example. For posterity, they are Dwyer - 1967 and McCulloch - 1982; just google those with "matrix derivative" and they'll pop up.
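For a quick sketch of how the calculation goes (treating $\Sigma$ as unconstrained; the symmetry-aware derivatives in those papers differ off the diagonal but yield the same stationary point): with $\mu = \bar y$ fixed and $S = \sum_{i=1}^n (y_i - \bar y)(y_i - \bar y)^T$, the quadratic term is a trace, so

$$\ell(\bar y, \Sigma) = K - \frac n 2 \log|\Sigma| - \frac 1 2 \operatorname{tr}\!\left(\Sigma^{-1} S\right).$$

Using the standard identities $\frac{\partial}{\partial \Sigma} \log|\Sigma| = \Sigma^{-1}$ and $\frac{\partial}{\partial \Sigma} \operatorname{tr}(\Sigma^{-1} S) = -\Sigma^{-1} S \Sigma^{-1}$,

$$\frac{\partial \ell}{\partial \Sigma} = -\frac n 2 \Sigma^{-1} + \frac 1 2 \Sigma^{-1} S \Sigma^{-1} = 0 \quad\Longrightarrow\quad \hat\Sigma = \frac 1 n S,$$

which is the biased empirical covariance matrix, as expected.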