
I was wondering: if I take the derivative of $\boldsymbol{x}^{T}\boldsymbol{A}$ with respect to $\boldsymbol{x}$, where $\boldsymbol{A}$ is a matrix that does not depend on $\boldsymbol{x}$ and $\boldsymbol{x}$ is a vector, I get:

$$\frac{\partial}{\partial \boldsymbol{x}}{\boldsymbol{x}^{T}\boldsymbol{A}} = \boldsymbol{A}^{T}$$

assuming we use the "numerator layout", which I believe is the more standard convention: https://en.wikipedia.org/wiki/Matrix_calculus#Layout_conventions
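For reference, this identity can be checked numerically. Below is a minimal sketch (assuming numpy; the shapes and random seed are arbitrary choices) that builds the numerator-layout Jacobian $J_{jl} = \partial f_j / \partial x_l$ of $f(\boldsymbol{x}) = \boldsymbol{x}^{T}\boldsymbol{A}$ by central finite differences and compares it with $\boldsymbol{A}^{T}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 4
A = rng.standard_normal((n, m))
x = rng.standard_normal(n)

# f(x) = x^T A is a row vector with m components
f = lambda v: v @ A

# Numerator-layout Jacobian: J[j, l] = d f_j / d x_l,
# estimated by central finite differences
eps = 1e-6
J = np.empty((m, n))
for l in range(n):
    e = np.zeros(n)
    e[l] = eps
    J[:, l] = (f(x + e) - f(x - e)) / (2 * eps)

print(np.allclose(J, A.T))  # the Jacobian matches A^T
```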

But if I write the same thing in tensor notation (following the Einstein convention of summation over repeated indices), I get:

$$\frac{\partial}{\partial x_{l}} x_{i}A_{ij} = A_{lj}$$

and $A_{lj}$ is $\boldsymbol{A}$, not $\boldsymbol{A}^{T}$, because the $j$ index remains in its original position. What am I doing wrong here?

ADDED: I think it's the same issue when I have $$\frac{\partial}{\partial \boldsymbol{x}}{\boldsymbol{x}^{T}\boldsymbol{A}\boldsymbol{x}} = \boldsymbol{x}^{T}(\boldsymbol{A}^{T} + \boldsymbol{A})$$

but with tensorial index notation:

$$\frac{\partial}{\partial x_{l}} x_{i}A_{ij}x_{j} = A_{lj}x_{j} + A_{il}x_{i} = A_{lj}x_{j} + x_{i}A_{il} = \boldsymbol{A}\boldsymbol{x} + \boldsymbol{x^{T}}\boldsymbol{A}$$

but the correct result is the previous one. I'm confused. Thank you in advance.
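(For what it's worth, the two index terms do sum to the right numbers; only the matrix interpretation is in question. A small sketch assuming numpy, with an arbitrary seed, comparing a finite-difference gradient of $f(\boldsymbol{x}) = \boldsymbol{x}^{T}\boldsymbol{A}\boldsymbol{x}$ against the components of $\boldsymbol{x}^{T}(\boldsymbol{A}^{T} + \boldsymbol{A})$:)

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

# f(x) = x^T A x is a scalar; its numerator-layout derivative
# is the row vector with components x_i (A^T + A)_{il}
f = lambda v: v @ A @ v

# grad[l] = d f / d x_l by central finite differences
eps = 1e-6
grad = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                 for e in np.eye(n)])

print(np.allclose(grad, x @ (A.T + A)))  # components of x^T (A^T + A)
```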

1 Answer


I have finally solved it. Posting the solution here in case it helps someone. The key is noticing that under the common "numerator layout" convention a vector is a column, so the denominator index runs along columns; hence the $l$ in

$$\frac{\partial}{\partial x_{l}} x_{i}A_{ij} = A_{lj}$$

is a column index, so $A_{lj}$ is in fact the transposed matrix $\boldsymbol{A}^{T}$. The same applies to the added example, where $A_{lj}x_{j}$ is $\boldsymbol{x}^{T}\boldsymbol{A}^{T}$ and $x_{i}A_{il}$ is $\boldsymbol{x}^{T}\boldsymbol{A}$.
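The index bookkeeping itself can be made concrete with `np.einsum` (a small check assuming numpy, with arbitrary random data): with $l$ as the free index, the summed expressions match the transposed matrix products.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

# A_{lj} x_j with free index l: the components of x^T A^T (equivalently A x)
row1 = np.einsum('lj,j->l', A, x)
print(np.allclose(row1, x @ A.T))

# x_i A_{il} with free index l: the components of x^T A
row2 = np.einsum('i,il->l', x, A)
print(np.allclose(row2, x @ A))
```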