A friend asked me about a non-proof in a linear algebra textbook.

The author claims that the following proof of the theorem that every square matrix over $\Bbb{C}$ is triangularizable has a gap and is therefore invalid:


First, we have two preliminary lemmas.

Lemma 1. Let $I_m$ be the identity matrix of size $m$, $A\in \mathfrak{M}_{m\times m}(\Bbb{C})$, and $U, B \in \mathfrak{M}_{n\times n}(\Bbb{C})$ such that $U$ is invertible. Then $ \bigg(\begin{matrix} I_m & 0 \\ 0 & U^{-1} \end{matrix}\bigg) \bigg( \begin{matrix} A & * \\ 0 & B \end{matrix} \bigg) \bigg( \begin{matrix} I_m & 0 \\ 0 & U \end{matrix} \bigg) = \bigg( \begin{matrix} A & * \\ 0 & U^{-1}BU \end{matrix} \bigg). $
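(For what it's worth, the identity in Lemma 1 is mechanical to verify numerically. Here is a quick numpy sanity check; the sizes $m, n$ and the random blocks are my own arbitrary choices, not from the book:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 3

# Random complex blocks of the sizes in Lemma 1.
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
C = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))  # the "*" block
U = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))  # invertible with probability 1
Uinv = np.linalg.inv(U)

Zmn, Znm = np.zeros((m, n)), np.zeros((n, m))
M = np.block([[A, C], [Znm, B]])
P = np.block([[np.eye(m), Zmn], [Znm, U]])
Pinv = np.block([[np.eye(m), Zmn], [Znm, Uinv]])

# Left side of Lemma 1 vs. the claimed block form; the "*" works out to C @ U.
lhs = Pinv @ M @ P
rhs = np.block([[A, C @ U], [Znm, Uinv @ B @ U]])
assert np.allclose(lhs, rhs)
```)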

Lemma 2. Let $A \in \mathfrak{M}_{n\times n}(\Bbb{C})$. Then there exists an eigenvalue $\lambda \in \Bbb{C}$ and an invertible matrix $U \in \mathfrak{M}_{n\times n}(\Bbb{C})$ such that $ U^{-1}AU = \bigg( \begin{matrix} \lambda & * \\ 0 & * \end{matrix} \bigg).$

Proof. Since $\Bbb{C}$ is algebraically closed, $A$ has an eigenvalue $\lambda \in \Bbb{C}$. Let $v_1 \in \Bbb{C}^n$ be a nonzero eigenvector corresponding to $\lambda$, and extend it to a basis $\{v_1, \cdots, v_n\}$ of $\Bbb{C}^n$. Let $U$ be the matrix whose $k$-th column is $v_k$. Then $[AU]_{*1} = A[U]_{*1} = Av_1 = \lambda v_1 = \lambda [U]_{*1}$, and hence, summing over the repeated index $j$, $ [U^{-1}AU]_{i1} = [U^{-1}]_{ij}[AU]_{j1} = \lambda [U^{-1}]_{ij}[U]_{j1} = \lambda \delta_{i1},$ which proves Lemma 2.
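(This construction is explicit enough to run. Below is a small numpy sketch of it; the helper name `lemma2_conjugate` and the QR-based completion of $v_1$ to a basis are my choices, since the proof permits any basis containing $v_1$:

```python
import numpy as np

def lemma2_conjugate(A, rng=None):
    """Return (lam, U) with U^{-1} A U having first column (lam, 0, ..., 0)^T.

    Sketch of the proof of Lemma 2: pick any eigenpair (lam, v1), then
    extend v1 to a basis of C^n. Here the extension is done via QR,
    which keeps the first basis vector proportional to v1.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    lam, v1 = eigvals[0], eigvecs[:, 0]
    # Columns: v1 followed by random vectors; QR orthonormalizes them.
    M = np.column_stack([v1, rng.standard_normal((n, n - 1))])
    U, _ = np.linalg.qr(M)
    return lam, U

A = np.array([[2.0, 1.0], [0.0, 3.0]])
lam, U = lemma2_conjugate(A)
T = np.linalg.inv(U) @ A @ U
assert np.allclose(T[1:, 0], 0)   # zeros below the (1,1) entry
assert np.isclose(T[0, 0], lam)   # the eigenvalue sits in the corner
```)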

Now we claim that for each $1 \leq k \leq n$ there exist

  • an upper-triangular matrix $T_k$ of size $k$,
  • a square matrix $A_{n-k}$ of size $n-k$, and
  • an invertible matrix $U_k \in \mathfrak{M}_{n\times n}(\Bbb{C})$

such that

$ U_k^{-1} A U_k = \bigg( \begin{matrix} T_k & * \\ 0 & A_{n-k} \end{matrix} \bigg).$

In particular, for $k = n$ this implies that $A$ is triangularizable.

We prove this claim by induction on $k$. For $k = 1$, this is a trivial consequence of Lemma 2. Now assume the claim holds for some $k < n$. Applying Lemma 2 to $A_{n-k}$, we obtain $\tilde{\lambda} \in \Bbb{C}$ and an invertible matrix $\tilde{U}$ of size $n-k$ such that

$ \tilde{U}^{-1}A_{n-k}\tilde{U} = \bigg( \begin{matrix} \tilde{\lambda} & * \\ 0 & A_{n-k-1} \end{matrix} \bigg).$

Now define $U_{k+1}$ by

$U_{k+1} = U_k \bigg( \begin{matrix} I_k & 0 \\ 0 & \tilde{U} \end{matrix} \bigg). $

Then

$\begin{align*} U_{k+1}^{-1} A U_{k+1} &= \bigg( \begin{matrix} I_k & 0 \\ 0 & \tilde{U}^{-1} \end{matrix} \bigg) U_k^{-1} A U_k \bigg( \begin{matrix} I_k & 0 \\ 0 & \tilde{U} \end{matrix} \bigg) = \bigg( \begin{matrix} I_k & 0 \\ 0 & \tilde{U}^{-1} \end{matrix} \bigg) \bigg( \begin{matrix} T_k & * \\ 0 & A_{n-k} \end{matrix} \bigg) \bigg( \begin{matrix} I_k & 0 \\ 0 & \tilde{U} \end{matrix} \bigg) \\ &= \begin{pmatrix} T_k & * \\ 0 & \tilde{U}^{-1}A_{n-k}\tilde{U} \end{pmatrix} = \begin{pmatrix} T_k & * & * \\ 0 & \tilde{\lambda} & * \\ 0 & 0 & A_{n-k-1} \end{pmatrix} =: \begin{pmatrix} T_{k+1} & * \\ 0 & A_{n-k-1} \end{pmatrix}. \end{align*}$

Therefore the claim follows by induction and the proof is complete. Q.E.D.
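In fact, this induction is an algorithm one can run. Here is a numpy transcription I tried (the function name `triangularize` is mine): each pass applies the Lemma 2 step to the trailing block $A_{n-k}$ and enlarges the conjugating matrix by $I_k \oplus \tilde{U}$, exactly as in the definition of $U_{k+1}$:

```python
import numpy as np

def triangularize(A, rng=None):
    """Carry out the induction in the proof above numerically.

    Returns (U, T) with T = U^{-1} A U upper triangular. Each pass
    applies the Lemma 2 step to the trailing block and conjugates by
    I_k (+) U~, as in the construction of U_{k+1}.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    U = np.eye(n, dtype=complex)
    T = A.copy()
    for k in range(n - 1):
        B = T[k:, k:]                        # trailing block A_{n-k}
        w, vecs = np.linalg.eig(B)
        v1 = vecs[:, 0]                      # eigenvector for w[0], the lambda~ of the proof
        # Extend v1 to a basis of the trailing space via QR (Lemma 2 step).
        M = np.column_stack([v1, rng.standard_normal((n - k, n - k - 1))])
        Ut, _ = np.linalg.qr(M)
        P = np.eye(n, dtype=complex)         # P = I_k (+) U~
        P[k:, k:] = Ut
        U = U @ P                            # U_{k+1} = U_k (I_k (+) U~)
        T = np.linalg.inv(P) @ T @ P         # by Lemma 1, only the trailing block changes
    return U, T

A = np.array([[0.0, -1.0], [1.0, 0.0]])      # rotation; eigenvalues are +-i
U, T = triangularize(A)
assert np.allclose(np.tril(T, -1), 0)        # T is upper triangular
assert np.allclose(np.linalg.inv(U) @ A @ U, T)
```

It runs without trouble on every matrix I fed it.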


Actually, I found this argument convincing and was unable to locate any gap in the proof.

The book says that this non-proof overlooks the importance of $L$-invariant subspaces. So I tested the argument on examples of $A$ in which the algebraic multiplicity of an eigenvalue $\lambda$ exceeds the dimension of the kernel of $A - \lambda I$, hoping that these trials would reveal the fault in the argument above. But I only found that the algorithm above works in those cases as well. I want to know what I am missing.

  • Yes, okay, but what is $L$? I don't see you refer to an operator named $L$ in this proof. (I would have just asked "what is $L$?" but I wasn't sure if "$L$-invariance" had some other definition unfamiliar to me.) – 2012-08-17

1 Answer


I can't find a gap in the proof either, but let me just comment that the proof works far too hard. As Emil Artin once said,

It is my experience that proofs involving matrices can be shortened by 50% if one throws the matrices out.

I think this proof gets shortened even more than that. Let $A : V \to V$ be a linear operator on an $n$-dimensional complex vector space $V$. We will define inductively a basis $v_1, \dots, v_n$ of $V$ such that $A$ preserves $V_k = \text{span}(v_1, \dots, v_k)$. This is equivalent to $A$ being upper-triangular with respect to this basis. Define $v_1$ to be an eigenvector of $A$ (say with eigenvalue $\lambda_1$). If $v_1, \dots, v_k$ have been defined, then $A$ preserves $V_k$, hence acts on $V/V_k$, hence has some eigenvector there (this is the crucial step!). Let $v_{k+1}$ be any element of $V$ mapping to such an eigenvector (say with eigenvalue $\lambda_{k+1}$). Then $A v_{k+1} \equiv \lambda_{k+1} v_{k+1} \bmod V_k$, hence $A$ preserves $V_{k+1}$. The conclusion follows.

(The subspaces $V_k$ form a complete flag, and "preserving a complete flag" is the basis-invariant way to say "upper-triangular.")
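For comparison, this quotient-space construction can also be carried out numerically if one models $V/V_k$ by the orthogonal complement of $V_k$ (a convenient choice, not forced by the argument): under the identification $w \mapsto [w]$, the induced operator becomes the compression $Q^* A Q$, where the columns of $Q$ form an orthonormal basis of the complement, and an eigenvector of the compression lifts to the desired $v_{k+1}$. A sketch, with names of my choosing:

```python
import numpy as np

def flag_basis(A):
    """Build a basis v_1, ..., v_n with A V_k contained in V_k, following
    the quotient argument: V/V_k is modeled by the orthogonal complement
    of V_k, on which A induces the compression Q^* A Q; an eigenvector
    of the compression lifts to v_{k+1}.
    """
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    vs = []
    for k in range(n):
        if k == 0:
            Q = np.eye(n, dtype=complex)     # V_0 = 0, so the quotient is V itself
        else:
            V = np.column_stack(vs)
            Qf, _ = np.linalg.qr(V, mode='complete')
            Q = Qf[:, k:]                    # orthonormal basis of the complement of V_k
        comp = Q.conj().T @ A @ Q            # the induced operator on V/V_k
        _, vecs = np.linalg.eig(comp)
        vs.append(Q @ vecs[:, 0])            # lift the eigenvector back to V
    return np.column_stack(vs)

A = np.array([[0.0, -1.0], [1.0, 0.0]])
U = flag_basis(A)
T = np.linalg.inv(U) @ A @ U
assert np.allclose(np.tril(T, -1), 0)        # upper triangular w.r.t. the flag
```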

  • @QiaochuYuan, I'm relieved that the logical system embedded in my brain is still working fine. Thank you both for your confirmation and for a nice concise proof! – 2012-08-18