
It's stated that the gradient of:

$$\frac{1}{2}x^TAx - b^Tx +c$$

is

$$\frac{1}{2}A^Tx + \frac{1}{2}Ax - b$$

How do you grind out this equation? Or specifically, how do you get from $x^TAx$ to $A^Tx + Ax$?

  • 1
    Did you understand the answer? 2017-12-04

2 Answers


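You only need two facts. The first is the basic identity (written in the convention where the derivative of a scalar with respect to a column vector $x$ is a column vector, i.e. the gradient)

$$\dfrac{\partial (x^Ty)}{\partial x} = y$$

The second is the chain rule for a function $f(x,y)$ in which $y$ itself depends on $x$:

$$\dfrac{d(f(x,y))}{d x} = \dfrac{\partial (f(x,y))}{\partial x} + \dfrac{d( y(x)^T)}{d x} \dfrac{\partial (f(x,y))}{\partial y}$$

Here $\dfrac{d(y(x)^T)}{d x}$ is the matrix whose $(i,j)$ entry is $\partial y_j / \partial x_i$. For the linear term, the identity gives directly

$$\dfrac{d(b^Tx)}{d x} = \dfrac{d (x^Tb)}{d x} = b$$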
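
As a sanity check on the first identity, here is a component-wise sketch. The indices $i$, $k$ and the dimension $n$ are notation introduced only for this check and are not part of the answer itself; $y$ is held fixed while differentiating.

$$\dfrac{\partial}{\partial x_k}\left(x^Ty\right) = \dfrac{\partial}{\partial x_k}\left(\sum_{i=1}^{n} x_i y_i\right) = y_k, \qquad k = 1, \dots, n$$

Stacking these $n$ partial derivatives into a column vector gives $y$, which is the identity in the column-vector (gradient) convention.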

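For the quadratic term, set $y = Ax$ and apply the chain rule to $f(x,y) = x^Ty$:

$$\dfrac{d (x^TAx)}{d x} = \dfrac{\partial (x^Ty)}{\partial x} + \dfrac{d (y(x)^T)}{d x} \dfrac{\partial (x^Ty)}{\partial y}$$

Since $\dfrac{\partial (x^Ty)}{\partial x} = y$, $\dfrac{\partial (x^Ty)}{\partial y} = x$ and $y(x)^T = x^TA^T$, this becomes

$$\dfrac{d (x^TAx)}{d x} = y + \dfrac{d (x^TA^T)}{d x} x = y + A^Tx = (A+A^T)x$$

Multiplying by $\frac{1}{2}$, subtracting the derivative of $b^Tx$, and noting that the constant $c$ contributes nothing gives the stated gradient $\frac{1}{2}A^Tx + \frac{1}{2}Ax - b$.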
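
If you would like to convince yourself numerically, the following is a minimal sketch that compares the formula $\frac{1}{2}(A + A^T)x - b$ against central finite differences of $\frac{1}{2}x^TAx - b^Tx + c$. The helper names (`matvec`, `analytic_gradient`, `numeric_gradient`), the step size `h`, and the example values of `A`, `b`, `c`, `x` are all assumptions made up for illustration. `A` is deliberately non-symmetric so that the $A^Tx$ term matters.

```python
# Finite-difference check of the gradient of
#   f(x) = 1/2 x^T A x - b^T x + c
# against the closed form 1/2 (A + A^T) x - b.
# Uses plain Python lists only; all names and values are illustrative.

def matvec(M, v):
    """Matrix-vector product M v for a list-of-rows matrix."""
    return [sum(m * v_j for m, v_j in zip(row, v)) for row in M]

def transpose(M):
    """Transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*M)]

def dot(u, v):
    """Inner product u^T v."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def f(A, b, c, x):
    """The quadratic form from the question."""
    return 0.5 * dot(x, matvec(A, x)) - dot(b, x) + c

def analytic_gradient(A, b, x):
    """Closed form: 1/2 A x + 1/2 A^T x - b."""
    Ax = matvec(A, x)
    ATx = matvec(transpose(A), x)
    return [0.5 * (p + q) - b_i for p, q, b_i in zip(Ax, ATx, b)]

def numeric_gradient(A, b, c, x, h=1e-6):
    """Central differences, one coordinate at a time."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((f(A, b, c, x_plus) - f(A, b, c, x_minus)) / (2 * h))
    return grad

# Arbitrary example data (assumed, not from the question).
A = [[2.0, 1.0, 0.0],
     [-3.0, 4.0, 5.0],
     [0.5, -1.0, 6.0]]
b = [1.0, -2.0, 3.0]
c = 7.0
x = [0.3, -1.2, 2.5]

g_exact = analytic_gradient(A, b, x)
g_numeric = numeric_gradient(A, b, c, x)
max_error = max(abs(p - q) for p, q in zip(g_exact, g_numeric))

print("closed form:       ", g_exact)
print("finite differences:", g_numeric)
print("largest difference:", max_error)
```

The two gradients should agree up to floating-point rounding. If `analytic_gradient` is changed to return $Ax - b$ instead, the agreement breaks for this non-symmetric `A`, which shows why both $Ax$ and $A^Tx$ appear in the result.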

  • 7
    To help future generations: the full specification of the chain rule used here is $$ \frac{df(g,h)}{dx} = \frac{d(g(x)^T)}{dx} \frac{\partial f(g,h)}{\partial g} + \frac{d(h(x)^T)}{dx} \frac{\partial f(g,h)}{\partial h} $$ The order of multiplication is very important since we're dealing with vectors! 2014-09-23
  • 0
    The first statement seems wrong to me. Isn't the right statement $\nabla_x(x^Ty) = y$? @NeilTraft 2017-10-21
  • 0
    For example, how does the answerer know whether $\dfrac{\partial y^T}{\partial x}$ goes on the left or on the right, or whether there is a transpose or not? Or maybe I am just unfamiliar with the chain rule using gradients and only know it using partial derivatives. 2017-10-21
  • 1
    Also notice that the derivative with respect to a column vector is a row vector, and vice versa. (I learned this from @copper.hat https://math.stackexchange.com/questions/189434/derivative-of-quadratic-form). However, the _gradient_ is represented as a column vector. 2018-05-20
  • 0
    Where can someone learn about these differentiation rules? In my standard analysis and calculus courses, we didn't see the differentiation of matrices or vectors, only of multivariate functions. 2018-10-15