
Two $n\times n$ matrices $A, B$ are said to be simultaneously diagonalizable if there is a nonsingular matrix $S$ such that both $S^{-1}AS$ and $S^{-1}BS$ are diagonal matrices.

a.) Show that simultaneously diagonalizable matrices commute: $AB = BA$.

b.) Prove that the converse is valid, provided that one of the matrices has no multiple eigenvalues. Is every pair of commuting matrices simultaneously diagonalizable?

My attempt:

a.) Let $M=S^{-1}AS$ and $P=S^{-1}BS$, then it follows that $A= S^{-1}MS$ and $B=S^{-1}PS$. Thus $AB =S^{-1}MSS^{-1}PS = S^{-1}MPS = S^{-1}PMS$ (is it wrong that I switched MP to PM?) $=S^{-1}PSS^{-1}MS=BA$.
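
As a sanity check on the claim in part (a), here is a small numerical illustration (a sketch assuming NumPy; the random matrices below are made up for the example and are not part of the original problem):

```python
import numpy as np

# Sanity check of part (a): construct A and B that are simultaneously
# diagonalizable by the same S, then verify that they commute.
rng = np.random.default_rng(0)

S = rng.standard_normal((4, 4))        # a random (almost surely invertible) S
D1 = np.diag(rng.standard_normal(4))   # diagonal form of A
D2 = np.diag(rng.standard_normal(4))   # diagonal form of B

A = S @ D1 @ np.linalg.inv(S)          # so that S^{-1} A S = D1
B = S @ D2 @ np.linalg.inv(S)          # so that S^{-1} B S = D2

print(np.allclose(A @ B, B @ A))       # expected: True
```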

b.) How can I do this?

  • 3
    "is it wrong that I switched MP to PM?" Do diagonal matrices commute? Careful, $M=S^{-1}AS$ implies $A=SMS^{-1}$, not $S^{-1}MS$. You should say that $M$ and $P$ are diagonal. For the second part of (b), I would first ask: Is every matrix diagonalizable? The first part of (b) can be deduced from http://math.stackexchange.com/questions/46544/why-does-a-diagonalization-of-a-matrix-b-with-the-basis-of-a-commuting-matrix-a2012-11-13
  • 0
    All diagonal matrices do commute. And that was a mistake on my part; I should have $M=SAS^{-1}$, thank you for catching that! And no, not all matrices are diagonalizable. And thank you very much for that link! I am wondering how you were able to obtain that link so easily? Did you just type in matrix diagonalizable in the search box?2012-11-13
  • 0
    I remembered the question (I was one of the answerers), so I knew roughly what to look for, and actually found it through a related question. You'll find other closely related questions using the links under "Linked" in the column on the right of the page.2012-11-13
  • 0
    @JonasMeyer oh wow, I did not know of that feature. That is very convenient. Thank you very much!2012-11-13
  • 0
    Regarding the second part of (b), since it can't be true as asked, we could modify it to a more interesting question: If two *diagonalizable* matrices commute, are they simultaneously diagonalizable? That is (subjectively) more difficult than the case where one is assumed to have $n$ distinct eigenvalues.2012-11-13
  • 0
    @JonasMeyer Hm, are you asking whether two matrices, A and B, commute iff they are simultaneously diagonalizable? Then yes. But I must admit I cheated on this because I looked it up prior to your question. Can you elaborate more on the logic behind that?2012-11-13
  • 0
    diimension: No: If $A$ and $B$ are not diagonalizable, then they are not simultaneously diagonalizable. If you add the hypothesis that $A$ and $B$ are diagonalizable, it is a new question that basically extends the first part of (b). I do not know what logic you want elaborated.2012-11-13
  • 0
    @JonasMeyer I misread your comment, sorry about that. Thank you again for clearing this up for me!2012-11-13
  • 0
    For part b, I found [this](http://www.math.lsa.umich.edu/~tfylam/Math217/proofs10-sol.pdf) to be helpful.2016-04-12
  • 0
    Do we really need "one of the matrices has no multiple eigenvalues"? According to Horn & Johnson 1985, pp. 51–53, we don't need it.2017-06-24

1 Answer


This has undoubtedly been answered (likely multiple times) here before, so I post this at the risk of beating a dead (and decaying) horse.

Let me first link you to this page, which contains two excellent answers (I particularly recommend Keith Conrad's expository paper linked in Pierre-Yves Gaillard's answer). However, let me provide a perhaps more elementary viewpoint since, from experience, many people beginning this topic are not quite comfortable with minimal polynomial based arguments yet.

You seem to have covered part a quite adequately so let me focus on part b. I apologize in advance for the length, but I feel that this is a topic which requires thorough understanding.

The main thing to remember about commuting matrices is that they respect each other's eigenspaces. What does this mean? To talk about that, we first have to introduce the notion of an invariant subspace.

Consider a matrix mapping $A:\ V \rightarrow V$ for a vector space $V$. If there is some subspace $U$ of $V$ such that the restriction of $A$ to $U$ remains an operator in the sense that $A:\ U\rightarrow U$, then we say that $U$ is an invariant subspace of $A$. The term stable is also sometimes used. The significance of this is that $A(U) \subseteq U$, the image of $U$ is entirely contained within $U$. This way, it makes sense to talk about a restriction of the mapping to the smaller vector space $U$.
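
As a tiny concrete example (the specific matrix below is made up purely for illustration, checked here with NumPy):

```python
import numpy as np

# For this upper-triangular A, the line span{e1} is invariant (A e1 = 2 e1),
# while span{e2} is not (A e2 = e1 + 3 e2 leaves the line through e2).
A = np.array([[2., 1.],
              [0., 3.]])
e1, e2 = np.eye(2)
print(A @ e1)   # [2. 0.]  -> still a multiple of e1, so span{e1} is A-invariant
print(A @ e2)   # [1. 3.]  -> not a multiple of e2, so span{e2} is not invariant
```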

This is desirable for several reasons, the main one being that linear mappings on smaller vector spaces are easier to analyze. We can look at the action of the mapping on each invariant subspace and then piece them together to get an overall picture. This is what diagonalization does; we break down the vector space into smaller invariant subspaces, the eigenspaces, and then piece together the facts to get a simpler picture of how the mapping works. Many of the simpler, canonical representations are dependent on this fact (for example, the Jordan canonical form looks at the invariant generalized eigenspaces).

Now, if we have two commuting, diagonalizable matrices, then each eigenspace of $B$ is not only invariant under $B$ itself, but also under $A$. This is what we mean by preserving each other's eigenspaces. To see this, let $\mathbf{v}$ be an eigenvector of $B$ under eigenvalue $\lambda$. Then $$B(A\mathbf{v}) = A(B\mathbf{v}) = \lambda A\mathbf{v}$$ so that $A\mathbf{v}$ again lies in the $\lambda$-eigenspace of $B$ (it is either zero or another eigenvector with eigenvalue $\lambda$). In our new language, this means that the eigenspace $E_\lambda$ of $B$ is invariant under $A$. It therefore makes sense to look at the restriction of $A$ to $E_\lambda$.
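
To make this concrete, here is a small numerical illustration of that invariance computation (a sketch assuming NumPy; the particular $S$, $A$, $B$ below are made up for the example):

```python
import numpy as np

# If AB = BA and v is an eigenvector of B with eigenvalue lam, then A v stays
# in the same eigenspace of B, i.e. B (A v) = lam (A v).
S = np.array([[1., 2., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])
A = S @ np.diag([2., 3., 5.]) @ np.linalg.inv(S)
B = S @ np.diag([1., 1., 4.]) @ np.linalg.inv(S)
assert np.allclose(A @ B, B @ A)                 # A and B commute by construction

lam = 4.
v = S[:, 2]                                      # an eigenvector of B with eigenvalue 4
print(np.allclose(B @ (A @ v), lam * (A @ v)))   # True: A v stays in E_lam
```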

Now consider the restriction of $A$ to $E_\lambda$. If all the eigenvalues of $B$ are simple (multiplicity one), then each eigenspace of $B$ is one-dimensional. We have therefore restricted $A:\ E_\lambda \rightarrow E_\lambda$ to a mapping on a one-dimensional vector space. But this means that $A$ must take each vector of $E_\lambda$ to a scalar multiple of itself. You can check that this necessarily implies that $E_\lambda$ is also an eigenspace of $A$. Therefore, for any eigenbasis of $B$ that we take, the corresponding vectors also form an eigenbasis of $A$. This means that the two matrices are simultaneously diagonalizable; they share a common eigenbasis.
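
Here is that statement checked numerically for one concrete pair (again just a sketch assuming NumPy; $B$ below has three distinct eigenvalues, and np.linalg.eig is told nothing about the common basis):

```python
import numpy as np

# B has distinct eigenvalues, A commutes with B, and any eigenbasis of B
# turns out to diagonalize A as well.
S = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])
A = S @ np.diag([2., 3., 5.]) @ np.linalg.inv(S)
B = S @ np.diag([1., 4., 6.]) @ np.linalg.inv(S)   # three distinct eigenvalues
assert np.allclose(A @ B, B @ A)

# Diagonalize B from scratch ...
_, V = np.linalg.eig(B)
# ... and check that the same eigenvector matrix also diagonalizes A.
print(np.round(np.linalg.inv(V) @ A @ V, 10))      # diagonal up to rounding
```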

The general case is a bit more involved in that the restrictions to the invariant subspaces are more complex (they're no longer one-dimensional), but the ideas are identical.
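
If it helps, here is a rough computational sketch of that general argument (hedged: it assumes NumPy, real diagonalizable matrices with real eigenvalues, and the helper name simultaneously_diagonalize is invented for this illustration, not a library function):

```python
import numpy as np

def simultaneously_diagonalize(A, B, tol=1e-6):
    """Find V with V^{-1} A V and V^{-1} B V (numerically) diagonal.

    Sketch only: assumes A and B are real, diagonalizable with real
    eigenvalues, and commute.
    """
    eigvals, V_B = np.linalg.eig(B)
    blocks = []
    for lam in np.unique(np.round(eigvals, 6)):
        # Columns spanning the eigenspace E_lam of B.
        U = V_B[:, np.abs(eigvals - lam) < tol]
        # Restriction of A to E_lam, written in the basis U: solve U C = A U.
        C = np.linalg.lstsq(U, A @ U, rcond=None)[0]
        # Diagonalize the restriction and map its eigenvectors back into R^n.
        _, W = np.linalg.eig(C)
        blocks.append(U @ W)
    return np.hstack(blocks)

S = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])
A = S @ np.diag([2., 3., 5.]) @ np.linalg.inv(S)
B = S @ np.diag([1., 1., 4.]) @ np.linalg.inv(S)   # eigenvalue 1 of B is repeated

V = simultaneously_diagonalize(A, B)
print(np.round(np.linalg.inv(V) @ A @ V, 8))   # diagonal
print(np.round(np.linalg.inv(V) @ B @ V, 8))   # diagonal
```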

P.S. Since you seem to be interested in physics, let me mention a crucial application of commuting operators. In quantum mechanics, you have quantities called observables, each of which is, roughly speaking, represented by a Hermitian matrix. Unlike in classical physics, different observables need not be simultaneously measurable (by measuring position, for example, you cannot simultaneously measure momentum, and vice versa), which is ultimately due to the fact that the position operator and the momentum operator do not commute (this is the underlying reason behind the uncertainty principle). They do not have a shared basis which can represent the states of a system. Commuting operators therefore form a key element of quantum physics in that they define quantities which are compatible, i.e. simultaneously measurable.

  • 0
    Wow, thank you so very much for this!! I am committed to understanding this. I am going to have to reread it at least four more times to thoroughly understand it. This is a great help. Thank you again!!2012-11-14
  • 1
    @diimension You're very welcome. If anything is unclear, feel free to ask.2012-11-14
  • 5
    Really excellent answer that has gotten way too few upvotes. This is how math should be explained: few formulas and great descriptions. Would give +2 for quantum physics ;)2013-05-02
  • 0
    @Dario Thank you for the comment. I'm just glad that someone found it likable :)2013-05-02
  • 0
    When you say `But this means that A must take each vector of E(lambda) to a scalar multiple of itself.` but I do not see anywhere where you showed that v is an eigenvector of A. You showed that (A*v) is an eigenvector of B. Should one not now show that v is an eigenvector of A? How do you know that v is an eigenvector of A without showing that A*v=eigenvalue_of_A*v ? This is the part I am missing from your proof2014-10-25
  • 0
    @nasser Indeed I have not. I explicitly stated before that we are considering the case where the eigenspace is one dimensional. My statement follows from that assumption. This is only meant to be a rough sketch and not a rigorous proof.2014-10-25
  • 0
    Ok, I see now. I just noticed a small other thing: `Now, if we have two commuting, diagonalizable matrices` Then the proof follows. But I do not see anywhere where the `diagonalizable` part comes into play? The part `To see this, let v be an eigenvector of B under eigenvalue λ. Then B(Av)=A(Bv)=λ Av`. This I understand, but this could be said for any 2 matrices A,B that commute regardless if they are `diagonalizable` or not, right? It will help if you can make it clear where the `diagonalizable` property was actually used in your proof, as I do not see it yet. thanks.2014-10-25
  • 0
    @Nasser Well, we're looking at the simultaneous diagonalizability of two commuting matrices. If one of the matrices is not diagonalizable, then certainly it cannot be simultaneously diagonalizable with another matrix. The assumption that $A$ and $B$ are diagonalizable is within the statement "Therefore, for any eigenbasis of $B$ that we take, the corresponding vectors also form an eigenbasis of $A$." The existence of the eigenbasis is equivalent to $A$ and $B$ being diagonalizable.2014-10-25
  • 0
    idk, it is sort of trivial if all the eigenspaces have dimension 1...2017-10-27