
I have $f(x)=g(ax+b)$, where $a$ and $b$ are constants.

I need to show that $\nabla f(x)=a\nabla g(x)$ and $\nabla^2 f(x)=a^2\nabla^2 g(x)$.

I was thinking that the final answer should have $ax+b$ in it, but apparently it can be shown that the above is true?

  • 0
    The formulas should be $\nabla f(x) = a \nabla g(ax + b)$ and $\nabla^2 f(x) = a^2 \nabla^2 g(ax + b)$. (I just provided a derivation below.) 2012-11-26

1 Answer

2

In this problem $f(x) = g(h(x))$, where $h(x) = ax + b$. I'm going to consider the case where $a$ is a matrix rather than a scalar, because it's useful and no more difficult. You can assume $a$ is a scalar if you'd like.

Let's establish some notation. Recall that if $F:\mathbb R^n \to \mathbb R^m$ is differentiable at $x$, then $F'(x)$ is an $m \times n$ matrix. In the special case where $m = 1$, $F'(x)$ is a $1 \times n$ matrix (a row vector). I'm going to use the convention that $\nabla F(x) = F'(x)^T$, so $\nabla F(x)$ is a column vector rather than a row vector. Then $G(x) = \nabla F(x)$ is a function from $\mathbb R^n$ to $\mathbb R^n$, and $\nabla^2 F(x) = G'(x)$, which is an $n \times n$ matrix.

The chain rule tells us that \begin{align} f'(x) &= g'(h(x))h'(x) \\ &= g'(ax + b) a. \end{align} It follows that \begin{align} \nabla f(x) &= a^T g'(ax+b)^T \\ &= a^T \nabla g(ax + b). \end{align} That is our formula for $\nabla f(x)$.
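
As a quick sanity check of the gradient formula, consider the illustrative choice $g(y) = \tfrac12 \|y\|^2$ (an assumption made only for this example, not part of the question), for which $\nabla g(y) = y$. Expanding $f$ directly and differentiating term by term gives \begin{align} f(x) &= \tfrac12 (ax+b)^T(ax+b) \\ &= \tfrac12 x^T a^T a x + b^T a x + \tfrac12 b^T b, \\ \nabla f(x) &= a^T a x + a^T b \\ &= a^T (ax + b) \\ &= a^T \nabla g(ax+b), \end{align} which agrees with the chain rule result, including the argument $ax+b$.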

Preparing to use the chain rule again, we can express $\nabla f(x)$ as $\nabla f(x) = w(h(x))$, where $w(x) = a^T \nabla g(x)$. Note that $w'(x) = a^T \nabla^2 g(x)$. Applying the chain rule to $\nabla f(x) = w(h(x))$, we see that \begin{align} \nabla^2 f(x) &= w'(h(x))h'(x) \\ &= a^T \nabla^2 g(ax + b) a. \end{align} This is our formula for $\nabla^2 f(x)$. When $a$ is a scalar, the two formulas reduce to $\nabla f(x) = a \nabla g(ax + b)$ and $\nabla^2 f(x) = a^2 \nabla^2 g(ax + b)$, so the argument $ax + b$ does appear in the final answer.
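
If you want to convince yourself numerically, here is a minimal sketch in Python with NumPy that compares both formulas against central finite differences. The test function `g`, the helper `numerical_jacobian`, the dimensions, and the random data are all assumptions chosen for illustration; any smooth $g$ with a known gradient and Hessian would do.

```python
import numpy as np

def g(y):
    # Hypothetical smooth test function g: R^m -> R (chosen for illustration).
    return np.sum(np.sin(y)) + 0.5 * np.dot(y, y)

def grad_g(y):
    # Gradient of the test function above.
    return np.cos(y) + y

def hess_g(y):
    # Hessian of the test function above (diagonal for this choice of g).
    return np.diag(-np.sin(y)) + np.eye(y.size)

def numerical_jacobian(func, x, eps=1e-6):
    # Central-difference approximation of F'(x), returned as an m x n matrix.
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        diff = (func(x + step) - func(x - step)) / (2 * eps)
        cols.append(np.atleast_1d(diff))
    return np.stack(cols, axis=1)

rng = np.random.default_rng(0)
m, n = 3, 4
a = rng.standard_normal((m, n))  # a is an m x n matrix
b = rng.standard_normal(m)
x = rng.standard_normal(n)

def f(x):
    return g(a @ x + b)

def grad_f(x):
    # Formula from the derivation: grad f(x) = a^T grad g(ax + b).
    return a.T @ grad_g(a @ x + b)

# Formula from the derivation: hess f(x) = a^T hess g(ax + b) a.
hess_f = a.T @ hess_g(a @ x + b) @ a

# f'(x) is a 1 x n matrix; its single row is the gradient as a flat array.
grad_error = np.max(np.abs(numerical_jacobian(f, x)[0] - grad_f(x)))
# The Hessian is the Jacobian of the gradient map.
hess_error = np.max(np.abs(numerical_jacobian(grad_f, x) - hess_f))

print("max gradient error:", grad_error)
print("max Hessian error:", hess_error)
```

Both errors should come out small, at the level of finite-difference noise. Dropping the `+ b` or evaluating `grad_g` at `x` instead of `a @ x + b` makes the mismatch obvious.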

  • 2
    Thanks though for the answer. 2012-11-26