3
$\begingroup$

Theorem: There is a unitary matrix $M=U$ such that $U^{-1}AU=T$ is triangular. The eigenvalues of $A$ appear along the diagonal of this similar matrix $T$.

Proof)
Every matrix, for example 4 by 4, has at least one eigenvalue $\lambda_1$. Therefore $A$ has at least one unit eigenvector $x_1$, which we place in the first column of $U$. At this stage the other three columns are impossible to determine, so we complete the matrix in any way that leaves it unitary and call it $U_1$. (The gram-schmidt process guarantees that this can be done) $Ax_1=\lambda_1x_1$ in column 1 means that the product $U_1^{-1}AU_1$ starts in the right form.

$AU_1=U_1\begin{pmatrix} \lambda_1 & * & * & * \\ 0 & * & * & * \\ 0 & * & * & * \\ 0 & * & * & * \end{pmatrix}$ leads to $U_1^{-1}AU_1=\begin{pmatrix} \lambda_1 & * & * & * \\ 0 & * & * & * \\ 0 & * & * & * \\ 0 & * & * & * \end{pmatrix}$

Now work with the 3 by 3 submatrix which has a unit eigenvector $x_2$ and it becomes the first column of a unitary matrix $M_2$.

If $U_2=\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & & & \\ 0 & & M_2 & \\ 0 & & & \end{pmatrix}$ then $U_2^{-1}(U_1^{-1}AU_1)U_2=\begin{pmatrix} 1 & * & * & * \\ 0 & \lambda_2 & * & * \\ 0 & 0 & * & * \\ 0 & 0 & * & * \end{pmatrix}$

In here, I can't follow the logic.
Why $AU_1=U\begin{pmatrix} \lambda_1 & * & * & * \\ 0 & * & * & * \\ 0 & * & * & * \\ 0 & * & * & * \end{pmatrix}$ ? It used the formula $Ax_1=\lambda_1x_1$?
Also, why $U_2^{-1}(U_1^{-1}AU_1)U_2=\begin{pmatrix} 1 & * & * & * \\ 0 & \lambda_2 & * & * \\ 0 & 0 & * & * \\ 0 & 0 & * & * \end{pmatrix}$ ?
I even don't know what's going on here...
Can you explain the process through which we can get the final matrix T?

  • 0
    You're right. The theorem is exactly "schur's lemma", but it's a bit hard to understand for me.2012-11-24

2 Answers 2

1

For the first equality, you have $U_1=[x_1\ x_2\ x_3\ x_4]$, so $ U_1^{-1}AU_1=\begin{bmatrix}x_1^T\\ \vdots \\ x_4^T\end{bmatrix}[Ax_1\ Ax_2\ Ax_3\ Ax_4]=\begin{bmatrix}x_1^T\\ \vdots \\ x_4^T\end{bmatrix}[\lambda_1x_1\ Ax_2\ Ax_3\ Ax_4]=\begin{bmatrix}\lambda_1x_1^Tx_1&*&*&*\\ \lambda_1x_2^Tx_1&*&*&* \\ \lambda_1x_3^Tx_1&*&*&*\\ \lambda_1 x_4^Tx_1&*&*&*\end{bmatrix}=\begin{bmatrix}\lambda_1&*&*&*\\ 0&*&*&* \\ 0&*&*&*\\ 0&*&*&*\end{bmatrix}, $ using that $x_k^Tx_j=\delta_{k,j}$. Next you start with the new matrix you got at the end, and look at the $3\times3$ submatrix $A_2$ you obtain by removing the first row and column. It has an eigenvalue, so you can repeat the process we did above to get a $3\times3$ unitary $V_2$ with $V_2^{-1}A_2V_2=\begin{bmatrix}\lambda_2&*&*\\0&*&*\\0&*&*\end{bmatrix}.$ Now you put $U_2=\begin{bmatrix}1\\ & V_2\end{bmatrix}$, and so $ (U_1U_2)^{-1}A(U_1U_2)=U_2^{-1}(U_1^{-1}AU_1)U_2=\begin{bmatrix}\lambda_1&*&*&*\\ 0&\lambda_2&*&*\\ 0&0&*&*\\0&0&*&*\end{bmatrix}. $ After one more step working now on the $2\times2$ submatrix obtained from the last two rows and columns, you get the desired form.

1

IMHO you need only some facts on block matrix multiplication to get this straight.

Trick One

\begin{align} \begin{bmatrix} \mathbf{x}^{T}_{1\times N}& \\ \mathbf{U}_{N-1\times N}^{T} \end{bmatrix}\mathbf{A}_{N\times N}\begin{bmatrix} \mathbf{x}^{}_{N\times 1}& \mathbf{U}_{N\times N-1}^{} \end{bmatrix} &= \begin{bmatrix} \mathbf{x}^{T}_{}& \\ \mathbf{U}_{}^{T} \end{bmatrix}\begin{bmatrix} \mathbf{A}\mathbf{x}^{}_{}& \mathbf{A}\mathbf{U}_{}^{} \end{bmatrix} \\ &= \begin{bmatrix}\mathbf{x^{T}Ax}&\mathbf{x^{T}AU}\\\mathbf{U^{T}Ax}& \mathbf{U^{T}AU}\end{bmatrix} \end{align}

Trick Two \begin{align} &\begin{bmatrix}{1}&\mathbf{0^{T}_{}}\\\mathbf{0}_{}& \mathbf{M}_{}^{T}\end{bmatrix}\begin{bmatrix}{\lambda_1}&\mathbf{b^{T}_{}}\\\mathbf{0}_{}& \mathbf{A_1}_{}\end{bmatrix}\begin{bmatrix}{1}&\mathbf{0^{T}_{}}\\\mathbf{0}_{}& \mathbf{M}_{}^{}\end{bmatrix} \\ \\ &= \begin{bmatrix}{\lambda_1}&\mathbf{b^{T}_{}}\\\mathbf{0}_{}& \mathbf{M^{T}A_1}_{}\end{bmatrix}\begin{bmatrix}{1}&\mathbf{0^{T}_{}}\\\mathbf{0}_{}& \mathbf{M}_{}^{}\end{bmatrix} \\ &= \begin{bmatrix}{\lambda_1}&\mathbf{b^{T}_{}M}\\\mathbf{0}_{}& \mathbf{M^{T}A_1M}_{}\end{bmatrix} \end{align}

Trick One helps you to start schur decomposition. Then combining Trick one and Trick two in consequtive steps helps you to continue in inductive manner (in fact, it is induction) to reduce the matrix further and further till the point it becomes triangular. For eg, lets say, we performed trick one first and then trick two which is exactly your situation now. Lets say $\mathbf{A_2=M^{T}A_1M}$ and $\mathbf{b_2=b^TM}$. So your new matrix becomes \begin{align} \begin{bmatrix}\lambda_1 & \mathbf{b_2} \\ \mathbf{0}& \mathbf{A_2}\end{bmatrix} \end{align} Now you combine Trick One and Trick two. Let us say know a matrix $\mathbf{U_2}$ (is trick one) such that \begin{align} \mathbf{U_2^{T}}\mathbf{A_2U_2}= \begin{bmatrix}\lambda_2 & \mathbf{b_3} \\ \mathbf{0}& \mathbf{A_2}\end{bmatrix} \end{align} Now you use trick two (appropriate dimensions) matrix. In that replace $\mathbf{M}$ with $\mathbf{U_2}$ and continue further.