
I am studying matrix calculus and I don't quite understand why the gradient only exists for functions that take $m \times n$ matrices as input and that output a scalar, i.e., functions of the type $\mathbb R^{m \times n} \to \mathbb R$.

  • Well, for functions whose outputs are vectors, the gradient has another name: the Jacobian. For which other kinds of functions do you think the gradient should be defined? (2017-01-11)
  • @Crostul Ah, I see. So the term gradient serves solely to distinguish $f:\mathbb R^{m\times n}\to \mathbb R$ from $f:\mathbb R^{m\times n}\to \mathbb R^{a\times b}$. I'm guessing the gradient has special properties that the Jacobian doesn't have. (2017-01-11)

1 Answer


The gradient of a differentiable function $f : \mathbb R^{m \times n} \to \mathbb R$

$$\nabla f : \mathbb R^{m \times n} \to \mathbb R^{m \times n}$$

is the matrix-valued function that produces the directional derivative of $f$ in the direction of $\mathrm V \in \mathbb R^{m \times n}$ at $\mathrm X \in \mathbb R^{m \times n}$ via the following Frobenius inner product

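$$\lim_{h \to 0} \frac{f (\mathrm X + h \mathrm V) - f (\mathrm X)}{h} = \langle \mathrm V, \nabla f (\mathrm X) \rangle = \operatorname{tr} \left( \mathrm V^\top \nabla f (\mathrm X) \right)$$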
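As a rough numerical check of this inner-product characterization, here is a minimal sketch in plain Python. The function $f(\mathrm X) = \sum_{i,j} X_{ij}^2$ (with gradient $2 \mathrm X$), the matrices `X` and `V`, and all helper names are illustrative assumptions, not part of the original answer.

```python
# Illustrative choice (an assumption): f(X) = sum of squared entries,
# whose gradient is the matrix 2 * X.

def f(X):
    """Scalar-valued function of a matrix: sum of squared entries."""
    return sum(x * x for row in X for x in row)

def grad_f(X):
    """Gradient of f at X: a matrix with the same shape as X."""
    return [[2.0 * x for x in row] for row in X]

def frobenius_inner(A, B):
    """Frobenius inner product: sum of entrywise products."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def directional_derivative(func, X, V, h=1e-6):
    """Central finite-difference estimate of d/dt func(X + t V) at t = 0."""
    plus = [[x + h * v for x, v in zip(rx, rv)] for rx, rv in zip(X, V)]
    minus = [[x - h * v for x, v in zip(rx, rv)] for rx, rv in zip(X, V)]
    return (func(plus) - func(minus)) / (2.0 * h)

X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # point in R^{2 x 3}
V = [[0.5, -1.0, 0.0], [2.0, 0.0, 1.0]]  # direction in R^{2 x 3}

numeric = directional_derivative(f, X, V)  # finite-difference estimate
exact = frobenius_inner(V, grad_f(X))      # <V, grad f(X)>
print(numeric, exact)
```

The two printed numbers should agree up to finite-difference error, which is what it means for the gradient to "produce" the directional derivative through the inner product.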

If the output of a function is not a scalar, then its directional derivative is not a scalar either. Hence, no single inner product can produce the directional derivative, since inner products are scalar-valued by definition.
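To make this concrete, here is a small sketch in plain Python, assuming the hypothetical vector-valued function $F(\mathrm X) = \mathrm X \mathrm u$ for a fixed vector $\mathrm u$. Its directional derivative is a vector, so one inner product cannot represent it, although each scalar component $F_i$ still has its own gradient matrix. The vector `u`, the test matrices, and the helper names are assumptions made for illustration.

```python
# Illustrative choice (an assumption): F(X) = X u maps R^{2 x 3} to R^2.
u = [1.0, -2.0, 3.0]  # hypothetical fixed vector

def F(X):
    """Vector-valued function of a matrix: the matrix-vector product X u."""
    return [sum(x * c for x, c in zip(row, u)) for row in X]

def frobenius_inner(A, B):
    """Frobenius inner product: sum of entrywise products."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def directional_derivative(func, X, V, h=1e-6):
    """Central finite-difference estimate; returns one entry per output."""
    plus = [[x + h * v for x, v in zip(rx, rv)] for rx, rv in zip(X, V)]
    minus = [[x - h * v for x, v in zip(rx, rv)] for rx, rv in zip(X, V)]
    return [(p - m) / (2.0 * h) for p, m in zip(func(plus), func(minus))]

def component_gradient(i, m, n):
    """Gradient of the scalar component F_i: row i equals u, other rows are 0."""
    return [[u[j] if r == i else 0.0 for j in range(n)] for r in range(m)]

X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
V = [[0.5, -1.0, 0.0], [2.0, 0.0, 1.0]]

dF = directional_derivative(F, X, V)  # a vector in R^2, not a scalar
per_component = [frobenius_inner(V, component_gradient(i, 2, 3))
                 for i in range(2)]   # one inner product per output entry
print(dF, per_component)
```

Each output entry needs its own inner product with its own gradient matrix. Collecting those per-component gradients is one way to think about the Jacobian mentioned in the comments.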