
How can we mathematically prove that, for a general matrix, the Conjugate Gradient method always converges within $n$ steps in exact arithmetic, where $n$ is the size of the matrix?

  • That paper Inquest posted is really good. – 2012-11-18

2 Answers


The CG method, like other Krylov methods, finds a "best" solution to $Ax = b$ on successively larger subspaces. In particular, the $n$th subspace is $\text{span}(x_0, Ax_0, A^2 x_0, \ldots, A^{n-1} x_0)$, and "best" means the point whose residual is orthogonal to the subspace.

Since you keep adding vectors to your subspace and the full space is finite-dimensional, eventually there are only two possibilities. Either

1) the subspace fills up the whole space, in which case the "best" solution corresponds to the true solution, or

2) at some point $A^{k+1} x_0 \in \text{span}(x_0, Ax_0, A^2 x_0, \ldots, A^k x_0)$. This might seem bad, but it's actually a "happy coincidence", since it means $\text{span}(x_0, Ax_0, A^2 x_0, \ldots, A^k x_0)$ is an invariant subspace of $A$, and hence a direct sum of eigenspaces: the restriction of $A$ to an invariant subspace is still symmetric, so it is diagonalizable within that subspace. Everything that starts in there will stay in there under application of $A$, and everything orthogonal to it will stay orthogonal. (Note that $A$ is symmetric, so its eigenvectors are orthogonal.)

If the true solution is in this invariant subspace we have found it, and if it isn't, we can just pick a point orthogonal to that subspace and start again, having reduced the problem by $k$ dimensions.
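To see the finite-termination property concretely, here is a minimal sketch (my own addition, not from the original answer) using a plain textbook CG recurrence in NumPy. Floating point stands in for exact arithmetic, so the residual after $n$ steps is only zero up to roundoff; the matrix construction and the `cg` helper are illustrative assumptions.

```python
import numpy as np

def cg(A, b, x0, iters):
    """Plain conjugate gradient; returns the iterate after at most `iters` steps."""
    x = x0.copy()
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    for _ in range(iters):
        if np.linalg.norm(r) < 1e-12 * np.linalg.norm(b):
            break          # already converged (avoids dividing by ~0)
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return x

rng = np.random.default_rng(0)
n = 10
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)         # symmetric positive definite
b = rng.standard_normal(n)

x_n = cg(A, b, np.zeros(n), n)      # n steps suffice
print(np.linalg.norm(b - A @ x_n))  # ~ machine precision
```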

  • Why would that Krylov space be a direct sum of $k$ eigenspaces? Could you explain it in detail? Thanks! – 2018-12-16

When you use CG (or just about any of the common Krylov subspace methods), what you are really doing is solving a certain polynomial approximation problem. Suppose you are trying to solve the problem $Ax = b$ using CG with starting guess $x_0$. Let $x_k$ be the $k^\textrm{th}$ CG iterate, and let $e_k = x - x_k$ be the error at step $k$, where $x$ is the exact solution. Let $\lambda_1, \ldots, \lambda_n$ be the eigenvalues of $A$. Then it can be shown that

$ ||e_k||_A = \inf_{p \in P_k} || p(A)e_0 ||_A \leq ||e_0||_A \inf_{p \in P_k} \max_{1 \leq j \leq n} |p(\lambda_j)|, $

where $P_k$ is the space of polynomials $p$ of degree at most $k$ satisfying $p(0) = 1$ (without this normalization the infimum would trivially be zero), and $||\cdot||_A$ is the vector norm defined by

$ ||v||_A = \sqrt{v^*Av}. $

The error in the CG iteration is thus determined by one's ability to generate normalized polynomials that are "small" on the spectrum of $A$. At the $n^\textrm{th}$ step, we are allowed to draw the optimal polynomial from $P_n$. Since $A$ has at most $n$ distinct eigenvalues, all of them positive because $A$ is symmetric positive definite, you can choose $p(z) = \prod_{j} (1 - z/\lambda_j)$, which satisfies $p(0) = 1$ and has roots at all of the eigenvalues of $A$, reducing the error to zero.
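As a quick numerical check of this root-placement argument (my own sketch, not part of the original answer), one can form $p(A) = \prod_j (I - A/\lambda_j)$ for a small SPD matrix and observe that it annihilates an arbitrary error vector up to roundoff:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)           # SPD, so all eigenvalues are positive
lam = np.linalg.eigvalsh(A)

p_of_A = np.eye(n)
for l in lam:                          # p(A) = prod_j (I - A / lambda_j)
    p_of_A = p_of_A @ (np.eye(n) - A / l)

e0 = rng.standard_normal(n)            # an arbitrary initial error
print(np.linalg.norm(p_of_A @ e0))     # ~ 0 up to roundoff
```

Since $A$ is symmetric we can write $A = Q \Lambda Q^T$, so $p(A) = Q\, p(\Lambda)\, Q^T$ vanishes exactly when every eigenvalue is a root of $p$.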

As a corollary, if $A$ has only $m < n$ distinct eigenvalues, the iteration will actually converge in at most $m$ steps!
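A sketch of this corollary (the matrix and tolerance below are illustrative choices of mine): running the same textbook CG recurrence on a $50 \times 50$ diagonal matrix with only three distinct eigenvalues, the residual collapses to roundoff level at step 3.

```python
import numpy as np

n = 50
rng = np.random.default_rng(2)
A = np.diag(rng.choice([1.0, 4.0, 9.0], size=n))  # only 3 distinct eigenvalues
b = rng.standard_normal(n)

x, r, p = np.zeros(n), b.copy(), b.copy()          # x0 = 0, so r0 = b
for k in range(1, n + 1):
    Ap = A @ p
    alpha = (r @ r) / (p @ Ap)
    x += alpha * p
    r_new = r - alpha * Ap
    print(k, np.linalg.norm(r_new))                # drops to ~0 at k = 3
    if np.linalg.norm(r_new) < 1e-10:
        break
    p = r_new + ((r_new @ r_new) / (r @ r)) * p
    r = r_new
```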

See Lecture 38 in Trefethen and Bau's Numerical Linear Algebra for an excellent discussion of this. The document in the link posted by Inquest is very good as well.