Let $A\in\mathbb{R}^{n\times m}$, $n\geq m$, be a full column rank matrix, and consider the function \begin{align} f&\colon \mathbb{R}^{n\times n} \to \mathbb{R}^{n\times n}\\ & X\mapsto A (A^\top X A)^{-1} A^\top, \end{align} where $\bullet^\top$ denotes transposition.

Assuming that $(A^\top X A)^{-1}$ exists, I'm interested in the computation of the Jacobian matrix of $f$, i.e. $$\tag{1}\label{a} \mathbf{J}[f] = \left[\frac{\partial f(X)}{\partial X_{ij}}\right]\in\mathbb{R}^{n^2\times n^2}. $$

I know that there exists a closed-form expression for the Jacobian of the inverse, namely $\mathbf{J}[X^{-1}]=-(X^{-\top} \otimes X^{-1})$ (see e.g. here, page 5). Hence, I wonder whether a similar closed-form expression can be derived for \eqref{a}.

Thanks in advance.

1 Answer

Given a full column rank matrix $\mathrm A \in \mathbb R^{n \times m}$, the matrix-valued function $\mathrm F : \mathbb R^{n \times n} \to \mathbb R^{n \times n}$ is defined, wherever $\mathrm A^{\top} \mathrm X \mathrm A$ is invertible, as follows

$$\mathrm F (\mathrm X) := \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top}$$

Hence,

$$\mathrm F (\mathrm X + h \mathrm V) = \mathrm A \left( \mathrm A^{\top} (\mathrm X + h \mathrm V) \mathrm A \right)^{-1} \mathrm A^{\top} = \cdots = \mathrm F (\mathrm X) - h \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \mathrm V \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} + O(h^2)$$
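
The steps hidden behind the $\cdots$ can plausibly be filled in as follows. This is only a sketch: $\mathrm M := \mathrm A^{\top} \mathrm X \mathrm A$ is shorthand introduced here (it is not notation from the answer), and $h$ is assumed small enough for the Neumann series of the inverse to converge.

$$\begin{aligned} \mathrm A^{\top} (\mathrm X + h \mathrm V) \mathrm A &= \mathrm M \left( \mathrm I_m + h \mathrm M^{-1} \mathrm A^{\top} \mathrm V \mathrm A \right) \\ \left( \mathrm A^{\top} (\mathrm X + h \mathrm V) \mathrm A \right)^{-1} &= \left( \mathrm I_m + h \mathrm M^{-1} \mathrm A^{\top} \mathrm V \mathrm A \right)^{-1} \mathrm M^{-1} = \mathrm M^{-1} - h \mathrm M^{-1} \mathrm A^{\top} \mathrm V \mathrm A \mathrm M^{-1} + O(h^2) \end{aligned}$$

Multiplying on the left by $\mathrm A$ and on the right by $\mathrm A^{\top}$ then gives the first-order expansion of $\mathrm F (\mathrm X + h \mathrm V)$ stated above.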

Thus, the directional derivative of $\mathrm F$ in the direction of $\mathrm V$ at $\mathrm X$ is the matrix-valued function

$$- \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \mathrm V \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top}$$

Setting $\mathrm V = \mathrm e_i \mathrm e_j^{\top}$, we obtain

$$\partial_{x_{ij}} \mathrm F (\mathrm X) = - \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \mathrm e_i \mathrm e_j^{\top} \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} = \color{blue}{- \mathrm F (\mathrm X) \, \mathrm e_i \mathrm e_j^{\top} \mathrm F (\mathrm X)}$$

which is the negative of the outer product of the $i$-th column and the $j$-th row of $\mathrm F (\mathrm X)$.
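
As a sanity check, the partial-derivative formula can be compared against a central finite difference. The following is a minimal NumPy sketch, not part of the original answer: the sizes `n`, `m`, the indices `i`, `j`, the step `h`, and the random test matrices are all arbitrary assumptions, and indices are 0-based as usual in Python.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 3                      # assumed sizes with n >= m
A = rng.standard_normal((n, m))  # random A, full column rank with probability 1
# Assumed test point: a nonsymmetric X close to n*I, so that A^T X A is invertible.
X = n * np.eye(n) + 0.5 * rng.standard_normal((n, n))

def F(X):
    """F(X) = A (A^T X A)^{-1} A^T, computed with a linear solve."""
    return A @ np.linalg.solve(A.T @ X @ A, A.T)

i, j = 1, 3                      # arbitrary entry X[i, j] to perturb
E = np.zeros((n, n))
E[i, j] = 1.0                    # E = e_i e_j^T

h = 1e-6
fd = (F(X + h * E) - F(X - h * E)) / (2 * h)   # central difference

FX = F(X)
closed = -np.outer(FX[:, i], FX[j, :])         # -(i-th column)(j-th row)

err = np.max(np.abs(fd - closed))
print("max abs difference:", err)
```

The printed difference should be at the level of finite-difference noise; a large value would indicate a mismatch between the formula and the implementation.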

Vectorizing the directional derivative, we obtain

$$\mbox{vec} \left( - \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \mathrm V \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \right) = \color{blue}{- \left( \mathrm A \left( \mathrm A^{\top} \mathrm X^{\top} \mathrm A \right)^{-1} \mathrm A^{\top} \otimes \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \right)} \mbox{vec} (\mathrm V)$$
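
The vectorized formula can also be checked numerically. The sketch below is an illustration under stated assumptions rather than the answer's own method: it assumes the column-major (column-stacking) convention for $\mbox{vec}$, which is the one under which $\mbox{vec}(\mathrm A \mathrm X \mathrm B) = (\mathrm B^{\top} \otimes \mathrm A)\,\mbox{vec}(\mathrm X)$ holds, and the helper names `jacobian_kron` and `jacobian_fd` are hypothetical. It also checks that $\mathrm A = \mathrm I_n$ recovers $-(\mathrm X^{-\top} \otimes \mathrm X^{-1})$.

```python
import numpy as np

def F(A, X):
    """F(X) = A (A^T X A)^{-1} A^T."""
    return A @ np.linalg.solve(A.T @ X @ A, A.T)

def jacobian_kron(A, X):
    """Closed form -(F(X)^T kron F(X)), assuming column-major vec."""
    FX = F(A, X)
    return -np.kron(FX.T, FX)

def jacobian_fd(A, X, h=1e-6):
    """Central-difference Jacobian; column i + j*n perturbs X[i, j]."""
    n = X.shape[0]
    J = np.zeros((n * n, n * n))
    for j in range(n):
        for i in range(n):
            E = np.zeros((n, n))
            E[i, j] = 1.0
            dF = (F(A, X + h * E) - F(A, X - h * E)) / (2 * h)
            J[:, i + j * n] = dF.flatten(order="F")  # column-major vec
    return J

rng = np.random.default_rng(0)
n, m = 4, 2                                   # assumed sizes
A = rng.standard_normal((n, m))
# Nonsymmetric test point, so the transpose in the formula actually matters.
X = n * np.eye(n) + 0.5 * rng.standard_normal((n, n))

err_general = np.max(np.abs(jacobian_kron(A, X) - jacobian_fd(A, X)))

# Special case A = I_n: the formula should reduce to -(X^{-T} kron X^{-1}).
Xinv = np.linalg.inv(X)
err_identity = np.max(np.abs(jacobian_kron(np.eye(n), X) + np.kron(Xinv.T, Xinv)))

print(err_general, err_identity)
```

With a row-major `vec` (NumPy's default `flatten()`), the two Kronecker factors would have to be swapped, so the ordering convention is worth stating explicitly whenever this Jacobian is used.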

  • 0
    Thank you for your answer! Before accepting it, I would like to ask for a (perhaps silly) clarification. I’m a bit confused about the “right” notion of Jacobian for a matrix-valued function, namely whether it is the classical one (your first blue formula) or the vectorized one (your second blue formula). For instance, if I would like to check whether or not the Jacobian is non-singular for a given positive definite $X$, which formula should I use? 2017-02-11
  • 1
    @Jacquard Take my formula and make $\mathrm A = \mathrm I_n$. You then obtain $- (\mathrm X^{-\top} \otimes \mathrm X^{-1})$, which is the formula in the question and also in Fackler's notes. Hence, I suppose that is the "right" formula. 2017-02-11
  • 1
    @Jacquard If $- \left( \mathrm A \left( \mathrm A^{\top} \mathrm X^{\top} \mathrm A \right)^{-1} \mathrm A^{\top} \otimes \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \right)$ is singular, then there is at least one direction $\mathrm V$ for which the directional derivative of $\mathrm F$ is zero. That would be one critical point. 2017-02-11
  • 1
    @Jacquard The directional derivative is $- \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \mathrm V \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top}$. If the *rows* or *columns* of $\mathrm V$ are in the left null space of $\mathrm A$, then the directional derivative is zero. 2017-02-11
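
The singularity question raised in these comments can be explored numerically. The sketch below is an illustration with assumed sizes and random test data, not something taken from the thread. It relies on two standard facts: $\mathrm F(\mathrm X)$ has rank $m$ when $\mathrm A$ has full column rank, and the rank of a Kronecker product is the product of the ranks. The vectorized Jacobian should therefore have rank $m^2$ and be singular whenever $m < n$. The code also checks the two families of directions mentioned above.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 3                                  # assumed sizes with m < n
A = rng.standard_normal((n, m))
X = n * np.eye(n) + 0.5 * rng.standard_normal((n, n))  # assumed test point

FX = A @ np.linalg.solve(A.T @ X @ A, A.T)   # F(X)
J = -np.kron(FX.T, FX)                       # vectorized Jacobian (column-major vec)
rank_J = np.linalg.matrix_rank(J)

# Orthonormal basis N of the left null space of A, i.e. A^T N = 0.
U, _, _ = np.linalg.svd(A, full_matrices=True)
N = U[:, m:]                                 # shape (n, n - m)

# Columns of V_cols lie in the left null space of A, so A^T V_cols = 0.
V_cols = N @ rng.standard_normal((n - m, n))
# Rows of V_rows lie in the left null space of A, so V_rows A = 0.
V_rows = rng.standard_normal((n, n - m)) @ N.T

d_cols = -FX @ V_cols @ FX                   # directional derivative along V_cols
d_rows = -FX @ V_rows @ FX                   # directional derivative along V_rows

print("rank of J:", rank_J, "out of", n * n)
print("max |dF| along V_cols:", np.max(np.abs(d_cols)))
print("max |dF| along V_rows:", np.max(np.abs(d_rows)))
```

Both directional derivatives should vanish up to rounding error, and the reported rank should fall short of $n^2$, which suggests that non-singularity of this Jacobian can only be expected in the square case $m = n$.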