
I need to calculate the Jacobian $\frac{df}{dx}$ of $f=A^{-1}b$, where $A$ and $b$ are functions of $x$, the variable with respect to which I differentiate.

I thought

$$\frac{df}{dx} = \frac{dA^{-1}}{dx} b + A^{-1}\frac{db}{dx}$$

by the product rule, and since $A^{-1}A=I$,

$$\frac{dA^{-1}}{dx} = A^{-1} \frac{dA}{dx} A^{-1}.$$

Now the last thing I thought is

$$\frac{dA}{dx} = \frac{dA}{dx_1} + \frac{dA}{dx_2} + \frac{dA}{dx_3} + \cdots $$

The last step is to calculate the Jacobian of a matrix. However, if I try this for a simple example, I get a wrong answer. Can anyone see where I make the mistake? How can I calculate the Jacobian of $f=A^{-1}b$ correctly if I cannot invert $A$ analytically (I can only do that numerically)?

  • 0
    The trouble is that $dA^{-1}/dx$ needs to be a kind of rank-3 object that, when multiplied by the vector $b$, gives a matrix. You would be better off doing this in some form of [index notation](https://en.wikipedia.org/wiki/Einstein_notation). (2012-07-27)

1 Answer


To elaborate on my comment, the problem is that if you want to stay in the usual linear algebra notation, it's not really clear what kind of object the derivative of a matrix $A$ with respect to a vector $x$ really is, or how to multiply it with another vector $b$. You can either redo everything using index notation, or give up a little bit of abstraction and work componentwise, as follows.

Recall that the Jacobian of $f(x)$ is a matrix whose $j$th column is $\partial f/\partial x_j$. Expanding this out, we have

$$\frac{\partial f}{\partial x_j} = \frac\partial{\partial x_j}(A^{-1}b) = A^{-1}\frac{\partial b}{\partial x_j} + \frac{\partial A^{-1}}{\partial x_j}b.$$

Here $\partial A^{-1}/\partial x_j$ is the derivative of a matrix with respect to a scalar, so it's still a matrix, and we know how to work with those. Using the fact that

$$\frac{\partial A^{-1}}{\partial x_j} = -A^{-1}\frac{\partial A}{\partial x_j}A^{-1}$$

(you have the wrong sign in the equation in your question), we get

$$\frac{\partial f}{\partial x_j} = A^{-1}\frac{\partial b}{\partial x_j} - A^{-1}\frac{\partial A}{\partial x_j}A^{-1}b.$$

We can pull this partially back into matrix form, because $\partial b/\partial x_j$ are just the columns of the Jacobian of $b(x)$. However, there's not much we can do about the second term without bringing tensors into the picture. So what we get is

$$J_f = A^{-1}J_b - \begin{bmatrix}A^{-1}\frac{\partial A}{\partial x_1}A^{-1}b & A^{-1}\frac{\partial A}{\partial x_2}A^{-1}b & \cdots & A^{-1}\frac{\partial A}{\partial x_n}A^{-1}b\end{bmatrix}.$$
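Since the question mentions that $A$ can only be inverted numerically, here is a minimal NumPy sketch of one way to evaluate the formula without forming $A^{-1}$ explicitly. It relies on the observation that, with $f = A^{-1}b$, the $j$th column satisfies $A\,\partial f/\partial x_j = \partial b/\partial x_j - (\partial A/\partial x_j) f$, so all columns can be obtained from linear solves with the same matrix $A$. The example matrices are the $M$ and $Q$ from the asker's comment below; the function names (`A_of`, `b_of`, `dA_of`, `Jb_of`, `jacobian_of_solution`) and the evaluation point `x0` are made up for illustration, and the partial derivatives are written out by hand as an assumption of this sketch.

```python
import numpy as np

# Example problem taken from the asker's comment: A = M(x), b = Q(x).
def A_of(x):
    x1, x2 = x
    return np.array([[x1**2, x2**2],
                     [np.sin(x2), np.cos(x1)]])

def b_of(x):
    x1, x2 = x
    return np.array([x1 + x2, x1**3 + x2**3])

# Hand-written partial derivatives dA/dx_j (one matrix per variable).
def dA_of(x):
    x1, x2 = x
    dA_dx1 = np.array([[2 * x1, 0.0], [0.0, -np.sin(x1)]])
    dA_dx2 = np.array([[0.0, 2 * x2], [np.cos(x2), 0.0]])
    return [dA_dx1, dA_dx2]

# Jacobian of b: column j is db/dx_j.
def Jb_of(x):
    x1, x2 = x
    return np.array([[1.0, 1.0],
                     [3 * x1**2, 3 * x2**2]])

def jacobian_of_solution(x):
    """J_f for f = A^{-1} b, using linear solves instead of inv(A)."""
    A = A_of(x)
    f = np.linalg.solve(A, b_of(x))  # f = A^{-1} b
    # Column j of the right-hand side is db/dx_j - (dA/dx_j) f.
    rhs = Jb_of(x) - np.column_stack([dA @ f for dA in dA_of(x)])
    return np.linalg.solve(A, rhs)   # solves for all columns at once

def finite_difference_jacobian(x, h=1e-6):
    """Central-difference check, used only to validate the formula."""
    f = lambda y: np.linalg.solve(A_of(y), b_of(y))
    cols = []
    for j in range(len(x)):
        e = np.zeros(len(x))
        e[j] = h
        cols.append((f(x + e) - f(x - e)) / (2 * h))
    return np.column_stack(cols)

x0 = np.array([0.7, 0.4])  # arbitrary point where A is nonsingular
J = jacobian_of_solution(x0)
J_fd = finite_difference_jacobian(x0)
print("max abs difference:", np.max(np.abs(J - J_fd)))
```

The printed difference should be small, limited by the finite-difference error. If $\partial A/\partial x_j$ is not available in closed form either, it would have to be approximated as well, which this sketch does not cover.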

  • 0
    Thanks a lot, it is correct when I use my example: `syms x1 x2; M=[x1^2 x2^2;sin(x2) cos(x1)]; Q=[x1+x2;x1^3+x2^3]; Minv=inv(M); dfdx1=jacobian(M\Q,[x1;x2]); dfdx2=Minv*jacobian(Q,[x1;x2])-[Minv*diff(M,x1)*Minv*Q Minv*diff(M,x2)*Minv*Q]; a=(dfdx1-dfdx2); x1=rand; x2=rand; eval(a)` (somehow I cannot use the enter key to go to the next line) (2012-07-27)