I am studying matrix calculus and I don't quite understand why the gradient only exists for functions that take $m \times n$ matrices as input and that output a scalar, i.e., functions of the type $\mathbb R^{m \times n} \to \mathbb R$.
Why does the gradient of a function $f$ exist only for functions that output scalars?
derivatives
self-learning
vector-analysis
matrix-calculus
Well, for functions whose outputs are vectors, the gradient has another name: the Jacobian. For which other kinds of functions do you think the gradient should be defined? – 2017-01-11
@Crostul Ah, I see. So the term gradient serves solely to distinguish $f:\mathbb R^{m\times n}\to \mathbb R$ from $f:\mathbb R^{m\times n}\to \mathbb R^{a\times b}$. I'm guessing the gradient has special properties that the Jacobian doesn't have. – 2017-01-11
1 Answer
The gradient of a differentiable function $f : \mathbb R^{m \times n} \to \mathbb R$
$$\nabla f : \mathbb R^{m \times n} \to \mathbb R^{m \times n}$$
is the matrix-valued function that produces the directional derivative of $f$ in the direction of $\mathrm V \in \mathbb R^{m \times n}$ at $\mathrm X \in \mathbb R^{m \times n}$ via the following Frobenius inner product
$$\langle \mathrm V, \nabla f (\mathrm X) \rangle$$
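As a small worked example (the function is chosen here for illustration and does not come from the question), take $f(\mathrm X) = \tfrac12 \|\mathrm X\|_{\text{F}}^2 = \tfrac12 \langle \mathrm X, \mathrm X \rangle$. Expanding along the direction $\mathrm V$ gives

$$f(\mathrm X + t \mathrm V) = \tfrac12 \langle \mathrm X, \mathrm X \rangle + t \, \langle \mathrm V, \mathrm X \rangle + \tfrac{t^2}{2} \langle \mathrm V, \mathrm V \rangle$$

so the directional derivative is

$$\frac{\mathrm d}{\mathrm d t} f(\mathrm X + t \mathrm V) \Big|_{t = 0} = \langle \mathrm V, \mathrm X \rangle$$

Matching this against $\langle \mathrm V, \nabla f (\mathrm X) \rangle$ for every $\mathrm V$ gives $\nabla f (\mathrm X) = \mathrm X$ for this particular $f$.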
If the output of a function is not a scalar, then its directional derivative is not a scalar either. Hence, the directional derivative cannot be produced by an inner product with a single matrix, since inner products produce scalars (by definition).
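The argument above can be checked numerically. The following is a minimal sketch, assuming NumPy is available. The functions `f`, `grad_f`, and `g` are hypothetical examples picked for illustration, and the directional derivative is only approximated by a central finite difference.

```python
import numpy as np

def f(X):
    # Scalar-valued example: half the squared Frobenius norm of X.
    return 0.5 * np.sum(X * X)

def grad_f(X):
    # Gradient of this particular f, derived by hand: it equals X itself.
    return X

def g(X):
    # Vector-valued example: the row sums of X, a map R^{3x4} -> R^3.
    return X.sum(axis=1)

def frobenius_inner(A, B):
    # Frobenius inner product <A, B> = sum_ij A_ij * B_ij.
    return np.sum(A * B)

def directional_derivative(func, X, V, h=1e-6):
    # Central finite-difference approximation of d/dt func(X + t V) at t = 0.
    return (func(X + h * V) - func(X - h * V)) / (2 * h)

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))

# Scalar output: the directional derivative is a single number and
# agrees with the Frobenius inner product <V, grad f(X)>.
numeric_f = directional_derivative(f, X, V)
via_gradient = frobenius_inner(V, grad_f(X))

# Vector output: the directional derivative is itself a vector of
# length 3, so no single inner product can reproduce it.
numeric_g = directional_derivative(g, X, V)
```

Here `numeric_f` and `via_gradient` should agree up to floating-point error, while `numeric_g` has shape `(3,)`. Describing the derivative of `g` takes one gradient per output component, which is the Jacobian mentioned in the comments.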