It's stated that the gradient of:
$$\frac{1}{2}x^TAx - b^Tx +c$$
is
$$\frac{1}{2}A^Tx + \frac{1}{2}Ax - b$$
How do you grind out this equation? Or specifically, how do you get from $x^TAx$ to $A^Tx + Ax$?
You only need two facts. The first is that, for $y$ held fixed,
$$\dfrac{\partial (x^Ty)}{\partial x} = y$$
The second is the chain rule for a function $f(x,y)$ in which $y$ itself depends on $x$:
$$\dfrac{d(f(x,y))}{d x} = \dfrac{\partial (f(x,y))}{\partial x} + \dfrac{d( y^T(x))}{d x} \dfrac{\partial (f(x,y))}{\partial y}$$
The linear term follows directly from the first fact:
$$\dfrac{d(b^Tx)}{d x} = \dfrac{d (x^Tb)}{d x} = b$$
For the quadratic term, set $y = Ax$, so that $x^TAx = x^Ty$, and apply the chain rule:
$$\dfrac{d (x^TAx)}{d x} = \dfrac{\partial (x^Ty)}{\partial x} + \dfrac{d( y(x)^T)}{d x} \dfrac{\partial (x^Ty)}{\partial y} = y + \dfrac{d (x^TA^T)}{d x} x = y + A^Tx = (A+A^T)x$$
Here $\dfrac{\partial (x^Ty)}{\partial y} = x$ and $\dfrac{d (x^TA^T)}{d x} = A^T$. Multiplying by $\frac{1}{2}$ and subtracting the linear term's gradient $b$ gives $\frac{1}{2}A^Tx + \frac{1}{2}Ax - b$; the constant $c$ contributes nothing.
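If you want to convince yourself numerically, here is a minimal sketch in plain Python that compares the formula $\frac{1}{2}A^Tx + \frac{1}{2}Ax - b$ against a central finite-difference gradient. The function names and the particular values of `A`, `b`, `c`, and `x` are made up for illustration; `A` is deliberately non-symmetric so that the $A^Tx$ and $Ax$ terms differ.

```python
def quad(A, b, c, x):
    """f(x) = 1/2 x^T A x - b^T x + c, with A as a list of rows."""
    n = len(x)
    xAx = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    bx = sum(b[i] * x[i] for i in range(n))
    return 0.5 * xAx - bx + c


def grad_formula(A, b, x):
    """Gradient from the derivation: 1/2 A^T x + 1/2 A x - b."""
    n = len(x)
    return [
        0.5 * sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) - b[i]
        for i in range(n)
    ]


def grad_numeric(A, b, c, x, h=1e-6):
    """Central finite differences, one coordinate at a time."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((quad(A, b, c, xp) - quad(A, b, c, xm)) / (2 * h))
    return g


# Illustrative values only (assumptions, not from the question).
A = [[1.0, 2.0], [0.0, 3.0]]  # non-symmetric on purpose
b = [1.0, -1.0]
c = 5.0
x = [1.0, 2.0]

analytic = grad_formula(A, b, x)
numeric = grad_numeric(A, b, c, x)
print(analytic)
print(numeric)
```

The two printed vectors should agree up to floating-point rounding. If you replace `A` by a symmetric matrix, the formula collapses to $Ax - b$, which is the form you often see quoted.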