
I am trying to understand some theorems whose standard proofs seem to involve extensive matrix manipulations. I'm finding it a bit difficult to see the big picture behind all the matrix algebra, and I'd like to ask two questions that will hopefully help me understand.

Question 1: What geometric interpretations can be given to the Schur complement of a matrix?

To be precise, I need not just any geometric interpretation, but one which will help me make sense of the proofs I am looking at. So here is something more concrete: a theorem about some properties of the Schur complement.

Theorem: Suppose $EA=AF,$ where $E$ is lower triangular and $F$ is upper triangular: $ E = \begin{pmatrix} e & 0 \\ * & E' \end{pmatrix}, ~~~ F = \begin{pmatrix} f & * \\ 0 & F' \end{pmatrix}$ Further, suppose that the $(1,1)$ entry of $A$ is nonzero, so that we can take the Schur complement of $A$ with respect to it, which we will denote by $S$. Then $E'S = S F'.$

This is a special case of Theorem 6 from these lecture notes.
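(For reference, and with block names chosen here just for this statement: writing $A = \begin{pmatrix} a & r \\ c & A' \end{pmatrix}$ with the scalar $a = a_{11} \neq 0$, the Schur complement of $A$ with respect to the $(1,1)$ entry is $S = A' - a^{-1} c r$.)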

Question 2: It feels like this theorem should have both a geometric interpretation and geometric proof. Can someone provide these?

Thank you!

1 Answer


I hope I've got the right idea of what you would consider a "geometric" interpretation.

You can think of Schur complementation as splitting up a linear transform according to how it couples two complementary subspaces. If you represent a vector as the sum of two components in the subspaces, the result of acting on it with the linear transform has four parts: two corresponding to the diagonal blocks of the matrix, where the transform acts on each component separately, and two corresponding to the off-diagonal blocks of the matrix, where the transform mixes the two components. The Schur complement results from splitting the transform up into steps corresponding to this decomposition.

Imagine you're performing these operations on a computer, and you'd like to do everything "in place" -- you want to allow multiplying a vector by a matrix (x *= A) and adding a matrix multiple of a vector to another vector (x += B y), but you don't want to add two matrix multiples of vectors (z = A x + B y). (Here a op= b is short for a = a op b.) Directly applying a transform

$Ax=\begin{pmatrix}P&K\\M&R\end{pmatrix}\begin{pmatrix}y\\z\end{pmatrix}$

would require you to calculate $Py+Kz$ and $My+Rz$. To avoid that, you could first mix the lower component into the upper component (y += K z), then apply the diagonal transforms on the subspaces (y *= P and z *= R), and then mix the upper component into the lower component (z += M y), but the result wouldn't be quite right because you'd be mixing the lower component back into itself and multiplying the mixed components by the diagonal blocks: this naive sequence produces $P(y+Kz)$ in the upper component and $Rz+MP(y+Kz)$ in the lower, instead of $Py+Kz$ and $My+Rz$. Correcting for this yields:

$\begin{pmatrix}P&K\\M&R\end{pmatrix} = \begin{pmatrix}I&\\MP^{-1}&I\end{pmatrix} \begin{pmatrix}P&\\&R-MP^{-1}K\end{pmatrix} \begin{pmatrix}I&P^{-1}K\\&I\end{pmatrix} =: LDU \;, $

which decomposes the linear transform into a part $U$ that mixes the lower component into the upper component, a block-diagonal part $D$ that acts on the subspaces separately, and a part $L$ that mixes the upper component into the lower component; the block-diagonal part contains the Schur complement $S=R-MP^{-1}K$.
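Here is a quick numerical sanity check of this decomposition (a minimal numpy sketch; the block sizes, the random blocks, and the variable names are all chosen arbitrarily for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    n1, n2 = 2, 3                      # sizes of the two complementary subspaces
    P = rng.standard_normal((n1, n1))  # upper-left block (assumed invertible)
    K = rng.standard_normal((n1, n2))  # couples the lower component into the upper
    M = rng.standard_normal((n2, n1))  # couples the upper component into the lower
    R = rng.standard_normal((n2, n2))  # lower-right block
    A = np.block([[P, K], [M, R]])

    Pinv = np.linalg.inv(P)
    S = R - M @ Pinv @ K               # Schur complement of P in A

    # the three factors of A = L D U
    I1, I2 = np.eye(n1), np.eye(n2)
    Z12, Z21 = np.zeros((n1, n2)), np.zeros((n2, n1))
    L = np.block([[I1, Z12], [M @ Pinv, I2]])
    D = np.block([[P, Z12], [Z21, S]])
    U = np.block([[I1, Pinv @ K], [Z21, I2]])
    assert np.allclose(L @ D @ U, A)

    # the corrected in-place sequence applied to x = (y0, z0)
    y0 = rng.standard_normal(n1)
    z0 = rng.standard_normal(n2)
    y = y0 + Pinv @ K @ z0             # mix lower into upper (with P^{-1}K rather than K)
    y = P @ y                          # y is now P y0 + K z0
    z = S @ z0                         # Schur complement acts on the lower component
    z = z + M @ Pinv @ y               # mix upper into lower; z is now M y0 + R z0
    assert np.allclose(np.concatenate([y, z]), A @ np.concatenate([y0, z0]))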

Now if you have

$EA=AF$

with $E$ lower triangular and $F$ upper triangular, that says that it doesn't matter whether you first mix the lower component into the upper component using $F$ and then apply $A$, or first apply $A$ and then mix the upper component into the lower component using $E$. With our decomposition of $A$, this becomes

$ELDU=LDUF\;.$

Now $E$ and $L$ are both lower triangular and mix the upper component into the lower component, and $U$ and $F$ are both upper triangular and mix the lower component into the upper component. Since the inverse of a lower/upper triangular matrix is again lower/upper triangular (the inverse of mixing one component into the other is simply subtracting what was added), multiplying by $L^{-1}$ on the left and by $U^{-1}$ on the right collects all the lower triangular factors on the left-hand side and all the upper triangular factors on the right-hand side:

$\left(L^{-1}EL\right)D=D\left(UFU^{-1}\right)\;.$

The key here is that now there's only mixing in one direction on each side, upper into lower on the left and lower into upper on the right, and mixing in only one direction doesn't affect the diagonal blocks, which describe how the transform acts separately on the components. So we can simply equate the lower right diagonal blocks, and since $L$ and $U$ are pure mixing and their diagonal blocks are identities, this yields

$E'S=SF'\;.$
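To spell this last step out explicitly (writing $m$ for the unspecified lower-left block of $E$; the computation for $UFU^{-1}$ is analogous):

$L^{-1}EL=\begin{pmatrix}I&0\\-MP^{-1}&I\end{pmatrix}\begin{pmatrix}e&0\\m&E'\end{pmatrix}\begin{pmatrix}I&0\\MP^{-1}&I\end{pmatrix}=\begin{pmatrix}e&0\\m+E'MP^{-1}-eMP^{-1}&E'\end{pmatrix}\;,$

so $\left(L^{-1}EL\right)D$ is block lower triangular with lower right block $E'S$, while $D\left(UFU^{-1}\right)$ is block upper triangular with lower right block $SF'$, and equating these blocks gives the claim.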

  • Thanks for the great answer. Can we also conclude that $(L^{-1}EL)D$ is actually block diagonal? It would follow from the equation $(L^{-1} E L) D = D (U F U^{-1})$: the matrix on the left-hand side is block lower triangular while the matrix on the right-hand side is block upper triangular, so both must be block diagonal. (2011-08-22)