2
$\begingroup$

Suppose i have a square matrix $A_{ij}$ and a test matrix $M_{ij}$. I want to find the integral curve that joins $M_{ij}$ to $A^{-1}_{ij}$ Following the gradient descent path of the following cost function:

$ F_{A}(M) := \sum_{ij} [( M A - I )_{ij}]^2 $

Of course, the curve does not always end in to $A^{-1}$ because sometimes $M$ lies in a local-minima basin (i.e. not connected to the global minima).

Locally, the gradient descent path is described by the gradient of $F_{A}$

$ \nabla_{M} F_{A} = 2 (M A - I ) A^{T} $

or in component notation (trying very hard to not use Einstein summation notation), looks like:

$ \frac{\partial F_{A}^{(ij)} }{ \partial M_{xy}} = 2 \sum_{k} ( \sum_{p} (M_{ip} A_{pk}) - \delta_{ik} ) A_{jk} $

now, it is pretty clear from the above expression that its third derivative is zero.

So if i try to compute the gradient descent path to higher orders, it seems that i'm left with a quadratic curve with no further corrections.

Question: what is wrong with this approach to obtain a matrix inverse? It seems strange to me that the gradient descent curve is a simple polynomial form. Maybe i'm doing something wrong? I feel like i'm missing something very obvious, but i can't think right now what is it.

1 Answers 1

2

Your objective function is convex, so any local minimum is a global minimum. If $A$ is nonsingular then there is only the local minimum at $A^{-1}$.

Given the singular value decomposition $A=U\Sigma V^T$, let $M=V X U^T$. So then the objective becomes:

$ \begin{eqnarray} \| MA - I \|_{\mbox{fro}}^2 &=& \mbox{Trace}((MA - I)^T(MA - I)) \\ &=& \mbox{Trace}((X \Sigma - I)^T( X \Sigma - I))\\ &=& \sum_{ij} (\sigma_j x_{ij} - \delta_{ij})^2 \end{eqnarray} $

  • 0
    The point of introducing the [SVD](http://en.wikipedia.org/wiki/Singular_value_decomposition) is that if you rotate the coordinate system, your objective is a function of the form $f(x)= \sum_i (s_i x_i - d_i)^2$. You can explicitly compute the descent curve of such a function. $dx/dt=-\nabla f \Rightarrow x_i(t) = d_i/s_i + e^{-2s_i^2 t} (x_i(0) - d_i/s_i)$2011-11-07