
In Boyd's Convex Optimization text, on page 86, there is the following equation:

$f(x) = h(g(x)) = h(g_1(x), \cdots, g_k(x))$

Then, in order to show convexity, we take the second derivative, yielding:

$f''(x) = g'(x)^T \nabla^2 h(g(x))\, g'(x) + \nabla h(g(x))^T g''(x)$, with $h:\mathbb{R}^k \rightarrow \mathbb{R}$, $g_i:\mathbb{R}^n \rightarrow \mathbb{R}$.

There was a similar equation on page 71, but it yielded the gradient of $f$; this time the second derivative is a single number. How do I derive this second-derivative formula? I know it is the derivative of some composition. Here $h$ is scalar-valued: it takes a vector and returns a real number. And $g$ is vector-valued: it takes a real number and returns a vector.
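For what it's worth, the formula does seem to check out numerically. Here is a quick sketch (my own made-up example, not from the text) with $h(t_1, t_2) = e^{t_1} + t_2^2$ and $g(x) = (\sin x,\ x^3)$, comparing the formula against a central finite difference:

```python
import numpy as np

# Made-up concrete example: h: R^2 -> R, g: R -> R^2
def h(t):        # h(t1, t2) = exp(t1) + t2^2
    return np.exp(t[0]) + t[1]**2

def grad_h(t):   # gradient of h
    return np.array([np.exp(t[0]), 2 * t[1]])

def hess_h(t):   # Hessian of h
    return np.diag([np.exp(t[0]), 2.0])

def g(x):        # g(x) = (sin x, x^3)
    return np.array([np.sin(x), x**3])

def gp(x):       # g'(x)
    return np.array([np.cos(x), 3 * x**2])

def gpp(x):      # g''(x)
    return np.array([-np.sin(x), 6 * x])

def f(x):
    return h(g(x))

x = 0.7

# Boyd's formula: f'' = g'^T Hess(h) g' + grad(h)^T g''
analytic = gp(x) @ hess_h(g(x)) @ gp(x) + grad_h(g(x)) @ gpp(x)

# Central finite difference approximation of f''(x)
eps = 1e-4
numeric = (f(x + eps) - 2 * f(x) + f(x - eps)) / eps**2

print(analytic, numeric)  # the two values should agree closely
```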

Thanks

1 Answer


Let's start with the first derivative. Suppose $x$ is increased by a small $\epsilon$. Then $g_i(x)$ increases by approximately $g'_i(x)\,\epsilon$. So $f$ increases by approximately $\sum_{i=1}^k \frac{\partial h}{\partial t_i}(g(x))\, g'_i(x)\,\epsilon$, which gives $f'(x) = \nabla h(g(x))^T g'(x)$. Here $t_i$ is the name of the $i$th argument of $h$, and this uses the first-order (linear) approximation of $h$: if any given coordinate increases by some small amount, then the function increases proportionally to that amount times the corresponding partial derivative; if this happens at several coordinates, these contributions add up approximately "independently".

Now all you have to do is repeat this process once again - the second derivative is, after all, the derivative of the first derivative.
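Carrying out that repetition explicitly in coordinates: starting from

$$f'(x) = \nabla h(g(x))^T g'(x) = \sum_{i=1}^k \frac{\partial h}{\partial t_i}(g(x))\, g_i'(x),$$

differentiate each summand with the product rule, applying the chain rule once more to the factor $\frac{\partial h}{\partial t_i}(g(x))$:

$$f''(x) = \sum_{i=1}^k \sum_{j=1}^k \frac{\partial^2 h}{\partial t_i \partial t_j}(g(x))\, g_j'(x)\, g_i'(x) + \sum_{i=1}^k \frac{\partial h}{\partial t_i}(g(x))\, g_i''(x),$$

and the two sums are exactly the quadratic form $g'(x)^T \nabla^2 h(g(x))\, g'(x)$ and the inner product $\nabla h(g(x))^T g''(x)$, which is Boyd's formula.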