3
$\begingroup$

Short question:

How can I calculate $\dfrac{\partial A}{\partial L}$ where $A = \|Lx\|^2_2= x^TL^TLx$?

Is it $\dfrac{\partial A}{\partial L}=2Lx^tx$?

Long question:

I want to calculate the gradient of a Mahalanobis distance. More specifically, I like to calculate the gradient of $A$ in terms of $L$, ($\frac{\partial A}{\partial L}$).

$A = \|Lx\|^2_2= x^TL^TLx$

I expand the equation and calculate the gradient element by element and it seems it should be something like $\frac{\partial A}{\partial L}=2Lx^tx$. But, it's very slow! Would you please confirm the correctness of answer? and help me find a faster approach?

Thanks

  • 0
    Hint to check your solution: what happens when L is a scalar?2012-03-22

4 Answers 4

5

It's inded tedious to do it by hand.

See first that $\frac{\partial {\bf L x }}{\partial L_{ij}}= {\bf P^{ij} x}$

where ${\bf P^{i,j}}$ is the "singleentry matrix": it is zero everywhere except in the entry $i,j$ where it's 1. (This can be checked, an also deduced from the product rule, knowing that $\frac{\partial {\bf L }}{\partial L_{ij}}= {\bf P^{ij}}$)

And similarly: $\frac{\partial {\bf x^t L^t }}{\partial L_{ij}}= {\bf x^t P^{ji} }$

Then, applying the product rule:

$\frac{\partial { \bf x^t L^t L x }}{\partial L_{ij}}= { \bf x^t L^t} {\bf P^{ij} x} + {\bf x^t P^{ji} L x } = 2 \; { \bf x^t L^t} {\bf P^{ij} x} $

Now the Matrix Cookbook comes to the rescue ("The single-entry matrix is very useful when working with derivatives of expressions involving matrices" - page 52), by using formula 431 (page 53) we get the result:

\frac{\partial { \bf x^t L^t L x }}{\partial L_{ij}}=  2 ({\bf L x x^t })_{ij} \Rightarrow \frac{\partial { \bf x^t L^t L x }}{\partial {\bf L}}= 2 {\bf L x x^t }

BTW, this is formula 69 in the same Cookbook.

Edited: I had messed some indexes, I think it's ok now.

2

Let $f(L) = Lx $ and then from the Wikipedia article on matrix calculus:

$ \frac{\partial ||Lx||^{2}}{\partial L} = 2\cdot{} ||Lx||\cdot{}\frac{\partial}{\partial L}(||f(L)||)$ $ = 2\cdot{}||Lx||\cdot{}\biggl[\frac{\partial}{\partial f}(||f(L)||)\biggr]^{T}\cdot{}\frac{\partial f}{\partial L}$ $ = 2\cdot{} ||Lx|| \cdot{} \biggl[\frac{(Lx)^{T}}{||Lx||}\biggr]^{T}\cdot{}x^{T} $ $ = 2\cdot{}Lxx^{T}.$

Edit -- I updated to include missing transpositions.

1

Another way is to go back to the definition of the derivative. Let us fix $x$, and let $H$ be any $n \times n$ matrix. Then:

$A (L+H,x) = ((L+H)x,(L+H)x) = (Lx,Lx) + 2(Lx,Hx)+(Hx,Hx)$

$A (L+H,x) = A (L,x) +2(Lx,Hx)+o (\|H\|).$

Hence, $\frac{\partial A}{\partial L} (L,x)$ is a linear form which maps a matrix $H$ to $2(Lx,Hx)$. Moreover, you can check that for any vectors $x$ and $y$ in $\mathbb{R}^n$ and any $n \times n$ matrix $M$,

$ (x,My)_{\mathbb{R}^n} = (x y^T,M)_{\mathcal{M}_{n,n} (\mathbb{R})},$

where the former scalar product is taken over $\mathbb{R}^n$ and the later over $\mathcal{M}_{n,n} (\mathbb{R})$. Hence:

$2(Lx,Hx)_{\mathbb{R}^n} = (2Lx x^T,H)_{\mathcal{M}_{n,n} (\mathbb{R})} = \left( \frac{\partial A}{\partial L},H \right)_{\mathcal{M}_{n,n} (\mathbb{R})}.$

0

Define the vector $v=Lx \implies dv=dL\,x$ Write the distance in terms of this new variable.
Then find its differential and gradient. $\eqalign{ A &= \|v\|^2 = v^Tv \cr dA &= 2v^Tdv = 2v^TdL\,x= 2\,{\rm Tr}(xv^TdL) \cr \frac{\partial A}{\partial L} &= 2vx^T = 2Lxx^T \cr }$ Or perhaps the transpose of this, depending on your choice of Layout Convention.