16

For the quadratic form $X^TAX$ with $X\in\mathbb{R}^n$ and $A\in\mathbb{R}^{n \times n}$ (which expands to $\sum_{i=1}^n\sum_{j=1}^nA_{ij}x_ix_j$), I tried to take the derivative with respect to $X$ ($\nabla_X X^TAX$) and ended up with the following:

The $k^{\text{th}}$ element of the derivative is

$\nabla_{X_k}X^TAX=\left[\sum_{i=1}^n(A_{ik}x_k+A_{ki})x_i\right] + A_{kk}x_k(1-x_k)$

Does this result look right? Is there an alternative form?

I'm trying to get to the $\mu_0$ of Gaussian Discriminant Analysis by maximizing the log-likelihood, and I need to take the derivative of a quadratic form. Either the result I mentioned above is wrong (it shouldn't be, because I went over my arithmetic several times) or the form I arrived at is not terribly useful for my problem (because I'm unable to proceed).

I can give more details about the problem or the steps I took to arrive at the above result, but I didn't want to clutter the question to start with. Please let me know if more details are necessary.

Any link to related material is also much appreciated.

4 Answers

39

Let $Q(x) = x^T A x$. Expanding gives $Q(x+h)-Q(x) = x^TAh+h^TAx+h^TAh$. Dropping the higher-order term $h^TAh$, we get $DQ(x)(h) = x^TAh+h^TAx = x^TAh+x^TA^Th = x^T(A+A^T)h$ (the scalar $h^TAx$ equals its own transpose $x^TA^Th$), or more typically, $\frac{\partial Q(x)}{\partial x} = x^T(A+A^T)$.
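A quick numerical sanity check of this formula can be sketched as follows. This is only an illustration: the matrix `A`, the point `x`, and the helper names are made up for the example, and the comparison uses central finite differences rather than anything from the derivation itself.

```python
def quad_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def analytic_derivative(A, x):
    """Return x^T (A + A^T); entry k is sum_i (A[i][k] + A[k][i]) * x[i]."""
    n = len(x)
    return [sum((A[i][k] + A[k][i]) * x[i] for i in range(n)) for k in range(n)]

def numeric_derivative(A, x, eps=1e-6):
    """Approximate each partial derivative of x^T A x by a central difference."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += eps
        x_minus[k] -= eps
        grad.append((quad_form(A, x_plus) - quad_form(A, x_minus)) / (2 * eps))
    return grad

# Hypothetical, deliberately non-symmetric test data.
A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 4.0],
     [0.5, 0.0, 2.0]]
x = [1.0, -2.0, 0.5]

exact = analytic_derivative(A, x)
approx = numeric_derivative(A, x)
max_error = max(abs(e - a) for e, a in zip(exact, approx))
print(exact, approx, max_error)
```

Using a non-symmetric `A` matters here: for a symmetric matrix the check could not distinguish $x^T(A+A^T)$ from $2x^TA$.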

Notice that the derivative with respect to a column vector is a row vector!
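For the Gaussian Discriminant Analysis problem mentioned in the question, here is a sketch of how this result is typically applied. It assumes the usual setup, which the question does not spell out: class-0 samples $x^{(i)}$ (those with label $y^{(i)}=0$, $n_0$ of them), a shared covariance matrix $\Sigma$ that is symmetric and invertible, and a log-likelihood $\ell$ in which $\mu_0$ appears only through the quadratic terms below. Writing $u_i = x^{(i)}-\mu_0$, so that $\partial u_i/\partial\mu_0 = -I$, and using $\Sigma^{-1}+\Sigma^{-T}=2\Sigma^{-1}$:

$$\begin{aligned}
\frac{\partial \ell}{\partial \mu_0}
&= -\frac{1}{2}\sum_{i:\,y^{(i)}=0}\frac{\partial}{\partial \mu_0}\left(u_i^T\Sigma^{-1}u_i\right) \\
&= -\frac{1}{2}\sum_{i:\,y^{(i)}=0} u_i^T\left(\Sigma^{-1}+\Sigma^{-T}\right)(-I) \\
&= \sum_{i:\,y^{(i)}=0}\left(x^{(i)}-\mu_0\right)^T\Sigma^{-1}.
\end{aligned}$$

Setting this to zero and multiplying on the right by $\Sigma$ gives $\mu_0 = \frac{1}{n_0}\sum_{i:\,y^{(i)}=0}x^{(i)}$, the mean of the class-0 samples, under the assumptions stated above.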

1

It is easier to use index notation with the Einstein summation convention (repeated dummy indices are summed over). That is, we can write the $i$th component of $A{\bf x}$ as $a_{ij} x_j$, and ${\bf x}^T A {\bf x}=x_i a_{ij} x_j = a_{ij} x_i x_j$. Now take the derivative of $f({\bf x}) = {\bf x}^T A {\bf x}$ with respect to a component $x_k$, writing $x_{i,k}$ for $\partial x_i/\partial x_k = \delta_{ik}$. We find \begin{eqnarray} \partial f/\partial x_k = f_{,k} = a_{ij} x_{i,k} x_j + a_{ij} x_i x_{j,k} = a_{ij} \delta_{ik} x_j + a_{ij} x_i \delta_{jk} = a_{kj} x_j + a_{ik} x_i, \end{eqnarray} which in matrix notation is the $k$th component of ${\bf x}^T A + {\bf x}^T A^T$.
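As a small worked instance of the index computation (the case $n=2$ is chosen here purely for illustration), expand the form explicitly and differentiate with respect to $x_1$:

$$\begin{aligned}
f({\bf x}) &= a_{11}x_1^2 + (a_{12}+a_{21})\,x_1x_2 + a_{22}x_2^2, \\
\frac{\partial f}{\partial x_1} &= 2a_{11}x_1 + (a_{12}+a_{21})\,x_2 \\
&= \underbrace{(a_{11}x_1 + a_{12}x_2)}_{a_{1j}x_j} + \underbrace{(a_{11}x_1 + a_{21}x_2)}_{a_{i1}x_i}.
\end{aligned}$$

This agrees with the general expression $a_{kj}x_j + a_{ik}x_i$ for $k=1$. Note that the diagonal entry $a_{11}$ is counted twice, once in each sum.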

-1

$f(x) = 0.5x^\top Ax \Rightarrow Df(x) = Ax $

  • This is not right. @copper.hat's answer is correct. (2018-06-19)
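To make the objection concrete, here is a small counterexample (the matrix is chosen only for illustration). Take

$$A = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}, \qquad f(x) = \tfrac{1}{2}x^\top A x = x_1x_2 .$$

Then the partial derivatives are $\partial f/\partial x_1 = x_2$ and $\partial f/\partial x_2 = x_1$, whereas $Ax = (2x_2,\ 0)^\top$. The general formula gives $\tfrac{1}{2}x^\top(A+A^\top) = (x_2,\ x_1)$, which matches the direct computation. The claim $Df(x) = Ax$ does hold, up to the row versus column convention, in the special case where $A$ is symmetric.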
-2

I just learned a new trick for when your independent variable appears in more than one place in your formula: introduce a new (fake) variable for one of the occurrences, which disappears again at the end:

$\frac{\partial}{\partial x} y^TAx = [Ax]^T\frac{\partial y}{\partial x}+y^TA $ The transpose was to make the vector a row vector. Nothing deep there!

Now, if $y=x$, then $\frac{\partial y}{\partial x} = I$ and $\frac{d}{dx} x^TAx = x^TA^T+x^TA = x^T(A+A^T)$.
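The trick can be illustrated with a short sketch that computes the two contributions separately and then sets $y=x$: the row vector $y^TA$ from differentiating in $x$ with $y$ held fixed, and $(Ax)^T$ from differentiating in $y$ with $x$ held fixed. The matrix, the point, and the function names below are hypothetical, chosen only for this example.

```python
def partial_wrt_x(A, y):
    """Derivative of y^T A x in x with y fixed: the row vector y^T A."""
    n = len(y)
    return [sum(y[i] * A[i][k] for i in range(n)) for k in range(n)]

def partial_wrt_y(A, x):
    """Derivative of y^T A x in y with x fixed: the row vector (A x)^T."""
    n = len(x)
    return [sum(A[k][j] * x[j] for j in range(n)) for k in range(n)]

def combined_formula(A, x):
    """The closed form x^T (A + A^T), for comparison."""
    n = len(x)
    return [sum(x[i] * (A[i][k] + A[k][i]) for i in range(n)) for k in range(n)]

# Hypothetical non-symmetric example.
A = [[1.0, 2.0],
     [0.0, 3.0]]
x = [2.0, -1.0]

# Set y = x and add the two partial contributions.
total = [px + py for px, py in zip(partial_wrt_x(A, x), partial_wrt_y(A, x))]
direct = combined_formula(A, x)
print(total, direct)
```

Here the two contributions are $(2, 1)$ and $(0, -3)$, and their sum $(2, -2)$ agrees with $x^T(A+A^T)$.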