
I am currently reading Matrix Differential Calculus (Magnus, 2007, p. 107 ff.). In that notation, the derivative of a matrix function $F(A)$ - where $A$ is a matrix - is taken to be (cf. p. 196):

$$ DF(A) = Df(\text{vec} A) = \frac{\partial \text{vec} F(A) }{\partial (\text{vec} A)^T} $$

This assumes that a vector function $f$ exists such that:

$$ f(\text{vec}A) = \text{vec}F(A) $$

I find this hard to understand. The function $F$ can depend on the order (dimensions) of its matrix argument, information which a vector argument doesn't carry. So the order of the matrix must be built into $f$ explicitly? Anyhow, I can live with it.
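To make the definition concrete, here is a minimal numerical sketch (assuming NumPy, and the illustrative choice $F(A)=A^2$, which is not from the book): the finite-difference Jacobian of $\text{vec}\,F$ with respect to $(\text{vec}\,A)^T$ matches the known analytic derivative $A^T\otimes I + I\otimes A$.

```python
import numpy as np

def vec(M):
    # column-stacking vec, as defined in Magnus & Neudecker
    return M.reshape(-1, order='F')

def num_jacobian(F, A, h=1e-6):
    # finite-difference approximation of d vec F(A) / d (vec A)^T
    fa = vec(F(A))
    J = np.zeros((fa.size, A.size))
    for j in range(A.size):
        dA = np.zeros(A.size)
        dA[j] = h
        Ap = A + dA.reshape(A.shape, order='F')
        J[:, j] = (vec(F(Ap)) - fa) / h
    return J

A = np.array([[1., 2.], [3., 4.]])
F = lambda X: X @ X                      # illustrative F(A) = A^2
J_num = num_jacobian(F, A)

# analytic DF(A) = A^T (kron) I + I (kron) A, from d(A^2) = dA.A + A.dA
I = np.eye(2)
J_exact = np.kron(A.T, I) + np.kron(I, A)
print(np.allclose(J_num, J_exact, atol=1e-4))
```

Note that `vec` here uses column-major (`order='F'`) reshaping, matching the book's column-stacking convention.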

Now I want to extend this definition to multivariate matrix functions, for example $G(A,B)$, where $A$ and $B$ are matrices. Am I to map both matrices to a vector $\text{vec}(A,B)$ (?) and then have:

$$ DG(A,B) = Dg(\text{vec}(A,B)) $$

If so, will this $DG(A,B)$ matrix still have meaningful properties - like it has in $DF(A)$-case according to the book? Citation:

In view of Definition 3 [p.107], all calculus properties of matrix functions follow immediately from the corresponding properties of vector functions because, instead of the matrix function $F$, we can consider the vector function $f$.

And (p.197):

Since $DF(X)$ is a straightforward matrix generalization of the traditional definition of the Jacobian matrix $\partial f(x)/\partial x'$, all properties of Jacobian matrices are preserved. In particular, questions relating to functions with non-zero Jacobian determinant at certain points remain meaningful.
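The multivariate extension asked about above can be sketched numerically (assuming NumPy, and the illustrative choice $G(A,B)=AB$, not from the book): stack $\text{vec}\,A$ and $\text{vec}\,B$ into one vector, differentiate the stacked function, and compare with the known block Jacobian $[\,B^T\otimes I \;\big|\; I\otimes A\,]$.

```python
import numpy as np

def vec(M):
    return M.reshape(-1, order='F')  # column-stacking vec

def g(x, shape_A, shape_B):
    # g(vec(A, B)) = vec G(A, B) with G(A, B) = A @ B (illustrative)
    nA = shape_A[0] * shape_A[1]
    A = x[:nA].reshape(shape_A, order='F')
    B = x[nA:].reshape(shape_B, order='F')
    return vec(A @ B)

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[0., 1.], [5., 2.]])
x0 = np.concatenate([vec(A), vec(B)])

# finite-difference Jacobian of g at the stacked point x0
h = 1e-6
g0 = g(x0, A.shape, B.shape)
J_num = np.column_stack([
    (g(x0 + h * e, A.shape, B.shape) - g0) / h
    for e in np.eye(x0.size)
])

# analytic block Jacobian from d(AB) = dA.B + A.dB:
# DG(A,B) = [ B^T (kron) I | I (kron) A ]
I2 = np.eye(2)
J_exact = np.hstack([np.kron(B.T, I2), np.kron(I2, A)])
print(np.allclose(J_num, J_exact, atol=1e-4))
```

Each block of the result is exactly the partial Jacobian with respect to one matrix argument, which suggests the usual Jacobian properties carry over blockwise.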

Sidenote: I am mainly trying to derive the Hessian matrix and see whether it is positive (semi)definite. In my case $G$ is actually the Jacobian matrix of a real-valued function in $A$ and $B$.
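For the sidenote, a small sketch of the definiteness check (assuming NumPy, and an illustrative scalar function $\phi(A,B)=\lVert A-B\rVert_F^2$ that is not the one in the question): with $x = [\text{vec}\,A;\,\text{vec}\,B]$, the gradient is $2[\text{vec}(A-B);\,-\text{vec}(A-B)]$, so the Hessian is the constant block matrix below, and positive semidefiniteness can be read off its eigenvalues.

```python
import numpy as np

n = 4  # length of vec A (e.g. 2x2 matrices)
I = np.eye(n)

# Hessian of ||A - B||_F^2 w.r.t. x = [vec A; vec B] (illustrative):
H = 2.0 * np.block([[I, -I],
                    [-I, I]])

eigs = np.linalg.eigvalsh(H)       # eigvalsh: symmetric eigenvalues, sorted
print(np.all(eigs >= -1e-12))      # PSD check (eigenvalues are 0 and 4)
```

The same eigenvalue test applies to any Hessian obtained by differentiating the stacked Jacobian $Dg$ once more.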

1 Answer


Any matrix function $F$ mapping matrices from $\mathbb R^{n\times n}$ to matrices from $\mathbb R^{m\times m}$ can be represented as

$$F(A)=\begin{bmatrix}F_{1,1}(A) & F_{1,2}(A) & \dots & F_{1,m}(A)\\F_{2,1}(A) & F_{2,2}(A) & \dots & F_{2,m}(A)\\ \vdots & \vdots & \ddots & \vdots\\ F_{m,1}(A) & F_{m,2}(A) & \dots & F_{m,m}(A) \end{bmatrix}$$

This can be rewritten simply by "vectorizing" the same component functions to get a function $f$ from $\mathbb R^{n\times n}$ to $\mathbb R^{m\cdot m}$:

$$f(A)=\begin{bmatrix} F_{1,1}(A)\\ F_{1,2}(A)\\ \vdots\\ F_{1,m}(A)\\ F_{2,1}(A)\\ \vdots\\ F_{2,m}(A)\\ \vdots\\ F_{m,1}(A)\\ \vdots \\ F_{m,m}(A) \end{bmatrix}$$
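A minimal sketch of this construction (assuming NumPy, with the illustrative choice $F(A)=A^2$): reshaping with `order='C'` gives the row-by-row stacking shown above, while `order='F'` gives the column-stacking $\text{vec}$ of Magnus, and the two differ only by a transpose.

```python
import numpy as np

F = lambda A: A @ A  # illustrative 2x2 -> 2x2 matrix function

A = np.array([[1., 2.], [3., 4.]])
row_stacked = F(A).reshape(-1, order='C')  # rows stacked: the f above
col_stacked = F(A).reshape(-1, order='F')  # columns stacked: Magnus' vec

print(row_stacked)  # entries in row order
print(col_stacked)  # entries in column order
# row-stacking a matrix equals column-stacking its transpose:
print(np.array_equal(row_stacked, F(A).T.reshape(-1, order='F')))
```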

  • Ok, thanks. That is how Magnus also sees it, and it clears up one question. Note, though, that the `vec` operator is defined as "stacking the columns of the matrix one underneath the other" in the book - and you are stacking rows. That convention is maybe not universal. How about my second question? – 2017-01-24