
Assume the following:

$(1)$ Let $f: R \to R$.

$(2)$ Moreover, let $f = g \circ h$,

$(3)$ where $h:R\to R^d$, and $g: R^d \to R$

then the derivative of $f$ can be written as $f' = h' \cdot (g' \circ h)$, i.e.:

$(4)$ $f'(x)=h'(x) \cdot g'(h(x))$

However, since $h$ and $g$ are not straightforward functions from $R$ to $R$, I am not sure how to interpret this equation correctly.

So my questions:

  1. Is equation $(4)$ correct, and is there a single correct way of interpreting it?

  2. How should I interpret $(4)$ so that $h'(x)$ and $g'(h(x))$ can be explicitly calculated separately?

P.S. I am aware of gradients and related notions from multivariate calculus and have some experience with them, but apparently not enough to know what to do in this situation.

1 Answer


$\newcommand{\vec}[1]{\boldsymbol{#1}}$(4) is a special case of the multivariate chain rule stated below, with $m=p=1$, $n=d$, your $g$ in the role of $\vec{f}$ and your $h$ in the role of $\vec{g}$.

Let $m,n,p \in \mathbb{N}$ and $\vec{f}:\mathbb{R}^n \to \mathbb{R}^m,\vec{g}:\mathbb{R}^p \to \mathbb{R}^n$. If $\vec{g}$ is differentiable at $\vec{x}$ and $\vec{f}$ is differentiable at $\vec{g}(\vec{x})$, then $\vec{f} \circ \vec{g}$ is differentiable at $\vec{x}$ with $$(\vec{f} \circ \vec{g})'(\vec{x}) = \vec{f}'(\vec{g}(\vec{x}))\vec{g}'(\vec{x})$$

Here $\vec{f}' \equiv \mathrm D\vec{f}$ is the local linear approximation of $\vec{f}$ by a linear map, a.k.a. the Fréchet derivative; it is represented by an $m \times n$ matrix. Notice that your formula (4) is not right as written: $\vec h'(x)$ is a $d \times 1$ column vector while $g'(\vec h(x))$ is a $1 \times d$ row vector, so with $\vec h'$ on the left the matrix product is a $d \times d$ matrix rather than a scalar. The correct order is $f'(x) = g'(\vec h(x))\,\vec h'(x)$. Moreover, you need the assumption that $\vec h'(x)$ and $g'(\vec h(x))$ exist.
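The effect of the ordering can be checked numerically. The following is a minimal sketch, assuming NumPy and an illustrative choice of functions that is not from the question: $d=2$, $\vec h(x) = [x^2, \sin x]^\intercal$ and $g(y_0, y_1) = y_0 y_1$. It compares the row-times-column product with a central finite difference and shows the shape of the reversed product.

```python
import numpy as np

# Illustrative functions (assumptions, not from the question), with d = 2.
def h(x):
    return np.array([x**2, np.sin(x)])          # h: R -> R^2

def h_prime(x):
    return np.array([[2 * x], [np.cos(x)]])     # d x 1 column vector

def g(y):
    return y[0] * y[1]                          # g: R^2 -> R

def g_prime(y):
    return np.array([[y[1], y[0]]])             # 1 x d row vector

def f(x):
    return g(h(x))                              # f = g o h

def f_prime_chain(x):
    # Row vector times column vector: a 1 x 1 matrix, i.e. a scalar.
    return (g_prime(h(x)) @ h_prime(x)).item()

def f_prime_numeric(x, eps=1e-6):
    # Central finite difference, used only as an independent check.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

x0 = 0.7
chain = f_prime_chain(x0)
numeric = f_prime_numeric(x0)
reversed_product = h_prime(x0) @ g_prime(h(x0))  # column times row

print(chain, numeric)            # the two values agree up to rounding error
print(reversed_product.shape)    # (2, 2): a matrix, not a scalar
```

The reversed product is still a valid matrix product, but its result is a $d \times d$ matrix and therefore cannot equal the scalar $f'(x)$.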

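In your case, $\vec{h}'$ is the column vector $\left[h_0',\dots,h_{d-1}'\right]^\intercal$ whose entries are the derivatives of the components of $\vec{h}$, and $g'$ is the derivative of $g$ written as the $1 \times d$ matrix $\left[\partial g/\partial y_0,\dots,\partial g/\partial y_{d-1}\right]$ of partial derivatives, where $y_0,\dots,y_{d-1}$ denote the arguments of $g$. The transpose of this matrix is known as the gradient of $g$ and is denoted $\nabla g$.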
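As a worked instance (the functions are chosen purely for illustration and are not from the question), take $d=2$, $\vec h(x) = \left[x^2, \sin x\right]^\intercal$ and $g(y_0,y_1) = y_0 y_1$, so that $f(x) = x^2 \sin x$. Then
$$\vec h'(x) = \begin{bmatrix} 2x \\ \cos x \end{bmatrix}, \qquad g'(y_0,y_1) = \begin{bmatrix} y_1 & y_0 \end{bmatrix}, \qquad g'(\vec h(x)) = \begin{bmatrix} \sin x & x^2 \end{bmatrix},$$
and the chain rule gives
$$f'(x) = g'(\vec h(x))\,\vec h'(x) = 2x\sin x + x^2\cos x,$$
which agrees with differentiating $x^2 \sin x$ directly by the product rule. Each factor is computed separately: $\vec h'$ componentwise, and $g'$ as partial derivatives that are then evaluated at $\vec h(x)$.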