8

I asked this on SO first, but decided to move the math part of my question here.

Consider a $p \times p$ symmetric and positive definite matrix $\bf A$ ($p = 70000$, i.e. $\bf A$ takes roughly 40 GB stored as 8-byte doubles). We want to calculate the first three diagonal elements of its inverse, $({\bf A^{-1}})_{11}$, $({\bf A^{-1}})_{22}$ and $({\bf A^{-1}})_{33}$.

I have found this paper by James R. Bunch, who seems to solve exactly this problem without calculating the full inverse $\bf A^{-1}$. If I understand it correctly, he first calculates the Cholesky decomposition, i.e. the upper triangular matrix $\bf R$ which satisfies $\bf A=R^T R$, which needs $\frac16p^3+\frac12p^2-\frac23p$ floating point operations (multiplications/divisions) using the LINPACK routine SPOFA. He then calculates individual diagonal elements of the inverse $({\bf A^{-1}})_{ii}$ using an expression which exploits the sparsity of the right-hand side ${\bf e}_i$ in ${\bf R}^T{\bf y}={\bf e}_i$ and which requires $\frac12(p-i)^2+\frac52(p-i)+2$ floating point operations. (I don't understand the full details, so I can't currently sum this up correctly.)
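For intuition, here is a minimal NumPy/SciPy sketch of that idea (not Bunch's LINPACK routine, and it assumes the dense matrix fits in memory; the function name `inv_diag_entries` is just for illustration): factor once, then for each wanted index $i$ solve ${\bf R}^T{\bf y}={\bf e}_i$ and use $({\bf A^{-1}})_{ii}=\|{\bf y}\|^2$, since ${\bf A}^{-1}={\bf R}^{-1}{\bf R}^{-T}$.

    import numpy as np
    from scipy.linalg import cholesky, solve_triangular

    def inv_diag_entries(A, indices):
        """Selected diagonal entries of A^{-1} for a dense SPD matrix A."""
        R = cholesky(A)                            # upper triangular, A = R^T R
        out = []
        for i in indices:
            e = np.zeros(A.shape[0])
            e[i] = 1.0
            y = solve_triangular(R, e, trans='T')  # solve R^T y = e_i
            out.append(float(y @ y))               # (A^{-1})_{ii} = ||y||^2
        return out

    # small sanity check against the full inverse
    rng = np.random.default_rng(0)
    M = rng.standard_normal((6, 6))
    A = M @ M.T + 6 * np.eye(6)
    print(inv_diag_entries(A, [0, 1, 2]))
    print(np.diag(np.linalg.inv(A))[:3])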

The paper is based on LINPACK and doesn't seem to be cited by anyone, so apparently nobody has cared about it for the last 23 years. After reading it, I'm wondering whether this is still the best way of doing things, or whether a modern LAPACK-based approach could avoid the Cholesky decomposition.

In short, is there a quicker way to calculate those diagonal elements of the inverse?

  • 0
    I don't have time to think it through right now, but [this blog comment](http://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/#comment-56856) and [the](http://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/#comment-31019) [comments](http://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/#comment-31197) [it](http://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/#comment-31202) [refers](http://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/#comment-33080) to might be relevant here. (2012-02-02)

5 Answers

0

Here's a page and package solving this exact problem. It does use the Cholesky decomposition, but note that the Cholesky decomposition of a large sparse matrix can be computed quickly by a package such as CHOLMOD.

The papers it cites are:

  • Takahashi K, Fagan J, and Chen M-S (1973). Formation of a sparse bus impedance matrix and its application to short circuit study. In Power Industry Computer Application Conference Proceedings. IEEE Power Engineering Society.

  • Vanhatalo J and Vehtari A (2008). Modelling local and global phenomena with sparse Gaussian processes. In Proceedings of the 24th Conference in Uncertainty in Artificial Intelligence. AU AI Press.
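For the sparse case this answer has in mind, a rough sketch (assuming the scikit-sparse CHOLMOD bindings; note this just does a sparse solve per requested entry rather than the Takahashi-style recursions the references describe, which pay off when many entries of the inverse are needed) could look like:

    import numpy as np
    import scipy.sparse as sp
    from sksparse.cholmod import cholesky   # scikit-sparse wrapper around CHOLMOD

    def sparse_inv_diag(A_sparse, indices):
        """Selected diagonal entries of A^{-1} using a sparse CHOLMOD factorization."""
        factor = cholesky(sp.csc_matrix(A_sparse))   # fill-reducing sparse Cholesky
        p = A_sparse.shape[0]
        out = []
        for i in indices:
            e = np.zeros(p)
            e[i] = 1.0
            out.append(float(factor.solve_A(e)[i]))  # i-th entry of A^{-1} e_i
        return out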

5

I suggest you look at the book Matrices, Moments, and Quadrature with Applications. The basic idea is that you have an $n \times n$ symmetric positive definite matrix $A$, and you wish to compute the quadratic form $u^T f(A)u$, where $u \in \mathbb{R}^n$. In your case you wish to compute $ A^{-1}_{ii} = e_i^T f(A)e_i, $ where $e_i \in \mathbb{R}^n$ is the vector of all zeros with a 1 in the $i$th position, and the function $f$ is defined as $f(\lambda) = \lambda^{-1}$.

You can turn the quadratic form into a Stieltjes integral, since $ u^T f(A)u = u^T Q f(\Lambda) Q^T u = \alpha^T f(\Lambda) \alpha = \sum_{i=1}^n f(\lambda_i) \alpha_i^2 = \int_a^b f(\lambda) \, d\alpha(\lambda), $ where the eigenvalue decomposition of $A$ is given by $A = Q\Lambda Q^T$ with $\Lambda$ a diagonal matrix containing the eigenvalues of $A$, and $\alpha = Q^T u$. The integral can be estimated using Gaussian quadrature via $ I[f] = \int_a^b f(\lambda) \, d \alpha(\lambda) = \sum_{i=1}^N \omega_i f(t_i) + R[f], $ where $\{\omega_i\}_{i=1}^N$ are a set of weights, $\{t_i\}_{i=1}^N$ are a set of nodes at which to evaluate the function $f$, and $R[f]$ is a remainder or error term. The $\omega$'s and $t$'s are not known in advance and must be computed. Both can be obtained efficiently: an iterative algorithm similar to the Lanczos algorithm for computing eigenvalues builds a small tridiagonal matrix from $A$ and $u$; the nodes are the eigenvalues of that matrix, and the weights come from the first components of its normalized eigenvectors. For more details see the book, as well as this lecture on the topic by James Lambers.

The underlying mathematics and linear algebra may seem a little scary at first, but I assure you this leads to a fairly simple and efficient algorithm. I wrote Matlab code to calculate the weights and nodes for a class in graduate school. It wasn't very difficult. Take a look at the book. Good luck.
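As a rough illustration (a minimal sketch, not the code from the book or from that class; the function name is made up, and no reorthogonalization or error bounds are included): $k$ Lanczos steps started at $e_i$ produce a small tridiagonal matrix $T_k$, and the Gauss-rule estimate of $(A^{-1})_{ii}$ is simply the $(1,1)$ entry of $T_k^{-1}$.

    import numpy as np

    def lanczos_inv_diag(A, i, k=30):
        """Gauss-quadrature estimate of (A^{-1})_{ii} from k Lanczos steps started at e_i."""
        n = A.shape[0]
        q = np.zeros(n)
        q[i] = 1.0                       # unit starting vector e_i
        q_prev = np.zeros(n)
        alpha, beta = [], []
        b = 0.0
        for _ in range(k):
            w = A @ q - b * q_prev       # only matrix-vector products with A are needed
            a = float(q @ w)
            alpha.append(a)
            w -= a * q
            b = float(np.linalg.norm(w))
            if b < 1e-14:
                break                    # invariant subspace reached; estimate is exact
            beta.append(b)
            q_prev, q = q, w / b
        m = len(alpha)
        T = np.diag(alpha) + np.diag(beta[:m - 1], 1) + np.diag(beta[:m - 1], -1)
        e1 = np.zeros(m)
        e1[0] = 1.0
        return float(np.linalg.solve(T, e1)[0])   # e_1^T T_k^{-1} e_1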

  • 0
    The iterative algorithm works by computing matrix-vector products $Ax$. You can compute lower and upper bounds on $A^{-1}_{ii}$ that converge to the correct solution. They often converge in very few iterations. See [this paper](http://www.stanford.edu/class/cme335/sccm93-07.pdf) (from the class I mentioned) for more detail and numerical examples. This method is probably best if your matrix is large and sparse. (2011-09-15)
3

I suggested that a partial Cholesky decomposition should be possible here, but based on a comment I received, I started thinking about my answer again and realised that my suggestion was incorrect. Apologies if I led anyone astray.


You will need a full Cholesky decomposition, but the problem of extracting the inverse entries can be reduced in scale to avoid redundant computation.

If your SPD matrix $\mathbf{A}$ is written in block form as

$\mathbf{A}=\begin{bmatrix} \mathbf{A}_{00} & \mathbf{A}_{10}^T \\ \mathbf{A}_{10} & \mathbf{A}_{11} \end{bmatrix}$

with $\mathbf{A}_{00}$ being a $3\times3$ SPD block, its inverse will be given by

$\mathbf{A}^{-1}=\begin{bmatrix} \mathbf{B}_{00} & \mathbf{B}_{10}^T \\ \mathbf{B}_{10} & \mathbf{B}_{11} \end{bmatrix}$

The corresponding Cholesky factor of $\mathbf{A}$, with $\mathbf{A}=\mathbf{R}\mathbf{R}^T$ and $\mathbf{R}$ lower triangular, is then

$\mathbf{R}=\begin{bmatrix} \mathbf{R}_{00} & \\ \mathbf{R}_{10} & \mathbf{R}_{11} \end{bmatrix}$

The resulting matrix equation to solve for the inverse block matrix $\mathbf{B}_{00}$ can then be reduced to

$\begin{bmatrix} \mathbf{R}_{00} & \\ \mathbf{R}_{10} & \mathbf{R}_{11} \end{bmatrix} \begin{bmatrix} \mathbf{R}_{00}^T & \mathbf{R}_{10}^T\\ & \mathbf{R}_{11}^T \end{bmatrix} \begin{bmatrix} \mathbf{B}_{00} \\ \mathbf{B}_{10} \end{bmatrix}=\begin{bmatrix}\mathbf{I}_{00} \\ \mathbf{0} \end{bmatrix}$

and solved via

$\begin{bmatrix} \mathbf{R}_{00} & \\ \mathbf{R}_{10} & \mathbf{R}_{11} \end{bmatrix} \begin{bmatrix} \mathbf{X}_{0} \\ \mathbf{X}_{1} \end{bmatrix} =\begin{bmatrix}\mathbf{I}_{00} \\ \mathbf{0} \end{bmatrix}$

and

$\begin{bmatrix} \mathbf{R}_{00}^T & \mathbf{R}_{10}^T\\ & \mathbf{R}_{11}^T \end{bmatrix} \begin{bmatrix} \mathbf{B}_{00} \\ \mathbf{B}_{10} \end{bmatrix}=\begin{bmatrix}\mathbf{X}_{0} \\ \mathbf{X}_{1} \end{bmatrix}$

The solution you are looking for is the diagonal of $\mathbf{B}_{00}$. The right-hand side of this expression has only three columns, and this, combined with the lower triangular structure of $\mathbf{R}$, should be about the most efficient way to get the entries you require: one forward and one backward substitution, each with three right-hand sides.
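In SciPy terms this is roughly the following sketch (assuming the dense factorization of the 40 GB matrix is feasible on the machine; `cho_factor`/`cho_solve` carry out exactly the forward and backward substitutions above, and the function name is illustrative):

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    def leading_inv_diag(A, m=3):
        """Diagonal of the leading m x m block of A^{-1} via one Cholesky factorization."""
        p = A.shape[0]
        c = cho_factor(A, lower=True)    # A = R R^T with R lower triangular
        E = np.zeros((p, m))
        E[:m, :m] = np.eye(m)            # the block right-hand side [I; 0]
        B_cols = cho_solve(c, E)         # first m columns of A^{-1}
        return np.diag(B_cols[:m, :m])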

  • 0
    @talonmies: hi, thanks as well. No apologies required... (2011-09-15)
1

The first part below is not an "answer" but an extended comment/follow-up question to talonmies' answer and comment.

I've extended your notation a bit to get a clearer exposition. First let us extend the matrix $\small \mathbf X$ to be the inverse of the full matrix $\small \mathbf R$, such that $\small \mathbf A^{-1}= \mathbf B =\mathbf X^\tau \mathbf X $, and let us also extend the block indexes to $\small \mathbf X$; then
$\qquad \small \mathbf X_{00} =\mathbf R_{00}^{-1} $ and $\small \mathbf X_{11} =\mathbf R_{11}^{-1} $, but unfortunately $\small \mathbf X_{10} \ne \mathbf R_{10}^{-1} $.

By the inverse relation $\small \mathbf X =\mathbf R^{-1} $ we have $\small \mathbf X_{10} \mathbf R_{00}+ \mathbf X_{11} \mathbf R_{10} =\mathbf 0 $ and thus $\small \mathbf X_{10} = - \mathbf X_{11} \mathbf R_{10} \mathbf R_{00}^{-1} $ (from which we want to obtain the top-left diagonal elements of $\small \mathbf B$ via $\small \mathbf X^\tau_{00} \mathbf X_{00} + \mathbf X^\tau_{10} \mathbf X_{10} = \mathbf B_{00} $).

Here $\small \mathbf X_{11} $ represents the part of the problem whose computational effort we want to reduce, because it is the inverse of the (huge) remaining part of the Cholesky factor. I thought of introducing the pseudoinverse $\operatorname{pinv}(\cdot)$, with which we could proceed:

$\small \begin{array} {rcl} \mathbf Y &=& - ( \mathbf X_{11} \mathbf R_{10} \mathbf R_{00}^{-1} )^{-1} \\ &=& - \mathbf R_{00} \operatorname{pinv} (\mathbf R_{10}) \mathbf R_{11} \\ \mathbf X_{10} &=& \operatorname{pinv} ( \mathbf Y) \end{array}$

This would require inverting only small matrices, but although the pseudoinverse works well in many cases, I could not make it work correctly for the $\operatorname{pinv}(\mathbf R_{10})$ part; in every variant of my computation I still needed the full $\mathbf R_{11}$ in one form or another. Do you see any general reason why $\operatorname{pinv}(\mathbf R_{10})$ is not sufficient here?


A practical computation
Hmm, once we accept that we need the full Cholesky decomposition anyway, the process can be described even more briefly:

a) consider the SPD matrix $\small \mathbf A$; we look for the diagonal elements of the 3x3 submatrix $\small \mathbf B_{00} $ of $ \small \mathbf B = \mathbf A^{-1} $

b) perform the Cholesky decomposition bottom-up into the matrix $\small \mathbf R$; denote by $\small \mathbf R_{00} $ its top-left 3x3 upper triangular submatrix (thinking of $\small \mathbf A$ as a correlation matrix, this represents the partial or "unexplained" variance/covariance)

c) invert $\small \mathbf R_{00} $ to get $\small \mathbf X_{00} $ (this is also cheap because it's already triangular)

d) in $ \small \mathbf B_{00} = \mathbf X_{00}^\tau \mathbf X_{00} $ we find the required diagonal elements. (This last operation can even be replaced by a simple summing of squares along the columns in X )


Here is a complete example for Pari/GP. To keep it as clear as possible, no optimizations, error checks, etc. are included:

    Crop(M,dim) = matrix(dim,dim,r,c,M[r,c])  \\ reduces size of a matrix

    \\ procedure to show the first dim diagonal-entries of A^-1
    {invdiag(A,dim=3) = local(rs=rows(A),lV,R,X);
       \\ reduce A by a cholesky-process to a dim x dim residual matrix A_00
       \\ the cholesky-process goes bottom-up
       forstep( d = rs, dim+1, -1,
            lV = A[,d]/sqrt(A[d,d]);
            A  = Crop( A - lV*lV~ , d-1 );
          );
       \\ after this A is the dim x dim residual-matrix

       \\ compute the cholesky-factor of that A-residual into matrix R
       \\ we just proceed working bottom up, but A is no more cropped now
       R = matrix(dim,dim);
       forstep( d = dim, 1, -1,
            R[,d] = A[,d]/sqrt(A[d,d]);
            A     = A - R[,d] * R[,d]~;
          );

       X = R^-1;  \\ inverse of R of size dim x dim
                  \\ to extract values of diag(A^-1) = diag( R^-1~ * R^-1 )
                  \\ it suffices to sum the squares along the columns
       lv = vector(dim, c, sum(r=1,c, X[r,c]^2) );
       return(lv); }

    \\ create a sample matrix A
    A = matrix(8,8,r,c,binomial(r-1,c-1)/2^(c-1))
    A = A * A~   \\ make it a symmetric positive definite one

    print(invdiag(A,3))  \\ show the first 3 results by the partial cholesky-method
    print(diag(A^-1))    \\ show the true results by inversion of the full matrix

    \\ output:
    [21845.0000000, 980612.000000, 8259152.00000]
    [21845, 980612, 8259152, 21815360, 21017856, 7373824, 806912, 16384]~
1

An alternative could be the (preconditioned) conjugate gradient method for solving $Ax = b$ with $b = (1,0,0,\dots)^T$. Then $(A^{-1})_{11}$ is the first entry of $x$.
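A minimal sketch of this approach, using SciPy's `cg` (the function name below is illustrative; for a $70000\times 70000$ matrix one would pass $A$ as a sparse matrix or `LinearOperator` and supply a preconditioner via the `M` argument):

    import numpy as np
    from scipy.sparse.linalg import cg

    def inv_diag_entry_cg(A, i):
        """Estimate (A^{-1})_{ii} by solving A x = e_i with conjugate gradients."""
        p = A.shape[0]
        e = np.zeros(p)
        e[i] = 1.0
        x, info = cg(A, e)      # info == 0 means the iteration converged
        if info != 0:
            raise RuntimeError("CG did not converge")
        return x[i]             # x = A^{-1} e_i, so x[i] = (A^{-1})_{ii}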