
I would like to derive the derivative of the quadratic form, $\frac{d}{dx} x^T Ax$, using the directional derivative. I have seen other derivations (for example, this one, which expands everything), but I thought the directional derivative was more natural. So I tried the following:

Let $t$ be a real number, and $v$ be a vector. \begin{align} \frac{d}{dx} x^T A x &= \lim_{t\to 0} \frac{(x+tv)^T A (x+tv)-x^T A x}{t}\\ &=\lim_{t\to 0} \frac{(x^T + tv^T)(Ax + tAv) - x^T A x}{t}\\ &= \lim_{t\to 0} \frac{x^T Ax + tv^T Ax + tx^T Av + t^2 v^T Av - x^T Ax}{t}\\ & =v^T Ax + x^T Av \end{align}

Notice that since $x^T Av \in \mathbb{R}$, it equals its own transpose, so $x^T Av = v^T A^T x$. As a result, we have $\frac{d}{dx} x^T Ax = v^T(A + A^T) x$.

But how do I get rid of the $v^T$ in the final result?

  • 0
    Why do you think you *can* get rid of it? 2017-02-11
  • 0
    @DonAntonio I was _hoping_ that I could, since I know the derivative should be $(A+A^T) x$. I'm just a little bit confused here :( 2017-02-11
  • 0
    @Mi But *you decided*, for some reason not so clear to me, to take **the directional** derivative... in the direction of $\;v\;$, so I'd say it **must** appear in the final expression. 2017-02-11
  • 0
    @DonAntonio Right. My plan was that I would change $v$ (some arbitrary direction) to $x$ in my last step, so that I could get $\frac{df}{dx}$. But even then, I would still have $x^T (A + A^T) x$? 2017-02-11
  • 0
    A comment on notation: do not use $d/dx$ for directional derivatives of multivariate functions. Instead, let $f \, : \, x \in \mathbb{R}^n \, \mapsto \, x^{\top}Ax$ and write $$ df(x)(v) \quad \text{or} \quad \mathrm{D}_{x}f \cdot v $$ for the directional derivative of $f$ at $x$ in the direction of $v$. 2017-02-11
  • 0
    @Misakov I think jibounet has a point: what you really want, I think, is the *differential* (or the derivative, just like that) of the quadratic form. 2017-02-11
  • 0
    @jibounet Oh I see. So I actually have what I want already. I just forgot that the directional derivative is the product of the derivative _and_ the direction :| Awkward 2017-02-11
  • 0
    @Misakov In general, the directional derivative is not the product of "the derivative" and the direction. Also, the notion of "derivative" does not exist for multivariate functions. It is replaced with either the *directional derivative* or the *differential*. 2017-02-11
  • 0
    @jibounet I meant to say the product of the gradient (the $\nabla$ thingy) and the direction. I hope that is correct? People doing machine learning seem to call everything a derivative, and that confused me a little bit. 2017-02-12

2 Answers

1

What you have is all correct, but it is better to write, say, $$ d_xFv =v^T Ax + x^T Av=x^T(A^T+A)v, $$ where $F(x)=x^T A x$. So the derivative is $d_xF=x^T(A^T+A)$ (not quite what you say it is, but almost: it is just a matter of column versus row).
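The identity $d_xFv = x^T(A^T+A)v$ can be checked numerically. The following is only a minimal sketch in plain Python: the helper names (`quad_form`, `directional_derivative_formula`, `directional_derivative_fd`) and the random test data are assumptions made for illustration, not part of the derivation above. It compares the closed form with a central finite difference of $F$ along $v$.

```python
import random

def mat_vec(A, x):
    """Return the matrix-vector product A x (A is a list of rows)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    """Return the Euclidean inner product u^T v."""
    return sum(ui * vi for ui, vi in zip(u, v))

def quad_form(A, x):
    """Evaluate F(x) = x^T A x."""
    return dot(x, mat_vec(A, x))

def directional_derivative_formula(A, x, v):
    """Closed form x^T (A^T + A) v (hypothetical helper name)."""
    At = [list(col) for col in zip(*A)]  # transpose of A
    sym = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, At)]
    return dot(x, mat_vec(sym, v))

def directional_derivative_fd(A, x, v, t=1e-6):
    """Central difference (F(x + t v) - F(x - t v)) / (2 t)."""
    x_plus = [xi + t * vi for xi, vi in zip(x, v)]
    x_minus = [xi - t * vi for xi, vi in zip(x, v)]
    return (quad_form(A, x_plus) - quad_form(A, x_minus)) / (2 * t)

# Arbitrary (not necessarily symmetric) test data.
random.seed(0)
n = 4
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
x = [random.uniform(-1, 1) for _ in range(n)]
v = [random.uniform(-1, 1) for _ in range(n)]

exact = directional_derivative_formula(A, x, v)
approx = directional_derivative_fd(A, x, v)
print("closed form:", exact)
print("finite diff:", approx)
print("gap:", abs(exact - approx))
```

For a quadratic form the second-order terms cancel in the central difference, so the two numbers should agree up to floating-point rounding. Taking $v$ to be each standard basis vector in turn reads off the entries of the row vector $x^T(A^T+A)$, whose transpose is the column vector $(A+A^T)x$.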

2

Let me try from the beginning as I see it, putting $\;Q(X):=X^TAX\;$ and without directional derivatives and stuff:

$$Q(X+h)-Q(X)=(X+h)^TA(X+h)-X^TAX=$$

$$=\require{cancel}\cancel{X^TAX}+X^TAh+h^TAX+h^TAh-\cancel{X^TAX}=X^TAh+h^TAX+h^TAh\implies$$

$$\frac{Q(X+h)-Q(X)}h=\frac{X^TAh+\overbrace{h^TAX}^{(X^TA^Th)^T}+h^TAh}h=$$

$$=\frac{X^T(A+A^T)h}h+\overbrace{h^TA}^{=hA\rightarrow0}\xrightarrow[\left\|h\right\|\to0]{}X^T(A+A^T)$$
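
Dividing by the vector $h$ is only a heuristic. One way to make the last step precise is to bound the quadratic remainder, assuming $\|A\|$ denotes the operator norm induced by the Euclidean norm (a notation not used above). Since $h^TAX=X^TA^Th$ is a scalar,

$$Q(X+h)-Q(X)=X^T(A+A^T)h+h^TAh,\qquad \left|h^TAh\right|\le\|A\|\,\|h\|^2,$$

so that

$$\frac{\left|Q(X+h)-Q(X)-X^T(A+A^T)h\right|}{\|h\|}\le\|A\|\,\|h\|\xrightarrow[\left\|h\right\|\to0]{}0.$$

This is the definition of differentiability at $X$, with the linear map $h\mapsto X^T(A+A^T)h$ as the differential.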