
I can follow the definition of the transpose algebraically, i.e. as a reflection of a matrix across its diagonal, or in terms of dual spaces, but I lack any sort of geometric understanding of the transpose, or even symmetric matrices.

For example, if I have a linear transformation, say on the plane, my intuition is to visualize it as some linear distortion of the plane via scaling and rotation. I do not know how this distortion compares to the distortion that results from applying the transpose, or what one can say if the linear transformation is symmetric. Geometrically, why might we expect orthogonal matrices to be combinations of rotations and reflections?

  • See [this answer](http://math.stackexchange.com/questions/598258/determinant-of-transpose/636198#636198) for a geometric description of the transpose. (2014-01-12)

4 Answers


To answer your second question first: an orthogonal matrix $O$ satisfies $O^TO=I$, so $\det(O^TO)=(\det O)^2=1$, and hence $\det O = \pm 1$. The determinant of a matrix tells you by what factor the (signed) volume of a parallelepiped is multiplied when you apply the matrix to its edges; therefore hitting a volume in $\mathbb{R}^n$ with an orthogonal matrix either leaves the signed volume unchanged (so it is a rotation) or multiplies it by $-1$ (so it is a reflection).
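As a quick numerical check (a sketch using NumPy; the two matrices are just illustrative examples), a rotation of the plane has determinant $+1$ and a reflection has determinant $-1$, and both satisfy $O^TO = I$:

```python
import numpy as np

theta = np.pi / 3
# A rotation of the plane (determinant +1)...
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
# ...and a reflection across the x-axis (determinant -1).
reflection = np.array([[1.0, 0.0],
                       [0.0, -1.0]])

for O in (rotation, reflection):
    # Orthogonality: O^T O = I, hence (det O)^2 = 1.
    assert np.allclose(O.T @ O, np.eye(2))
    assert np.isclose(abs(np.linalg.det(O)), 1.0)
```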

To answer your first question: the action of a matrix $A$ can be neatly expressed via its singular value decomposition, $A=U\Lambda V^T$, where $U$, $V$ are orthogonal matrices and $\Lambda$ is a matrix with non-negative values along the diagonal (nb. this makes sense even if $A$ is not square!) The values on the diagonal of $\Lambda$ are called the singular values of $A$, and if $A$ is square and symmetric they will be the absolute values of the eigenvalues.

The way to think about this is that the action of $A$ is first to rotate/reflect to a new basis, then scale along the directions of your new (intermediate) basis, before a final rotation/reflection.

With this in mind, notice that $A^T=V\Lambda^T U^T$, so the action of $A^T$ is to perform the inverse of the final rotation, then scale the new shape along the canonical unit directions, and then apply the inverse of the original rotation.
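This bookkeeping is easy to verify numerically; here is a minimal sketch with NumPy (the matrix is random, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

# SVD: A = U diag(s) V^T, with U, V orthogonal and s non-negative.
U, s, Vt = np.linalg.svd(A)
assert np.allclose(U @ np.diag(s) @ Vt, A)

# Transposing swaps the two rotations: A^T = V diag(s) U^T, i.e.
# undo the final rotation, scale, then undo the initial rotation.
assert np.allclose(A.T, Vt.T @ np.diag(s) @ U.T)
```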

Furthermore, when $A$ is symmetric, $A=A^T\implies V\Lambda^T U^T = U\Lambda V^T$, which forces $U = V$ (up to the signs of the columns; equivalently, the spectral theorem writes $A = Q\Lambda Q^T$ with $Q$ orthogonal). Therefore the action of a symmetric matrix can be regarded as a rotation to a new basis, then scaling in this new basis, and finally rotating back to the first basis.
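The symmetric case can be checked directly via the eigendecomposition (again a sketch with a random matrix, symmetrized for illustration); note that the eigenvalues may be negative, while the singular values are their absolute values:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3))
S = (B + B.T) / 2  # symmetrize to obtain a symmetric matrix

# Spectral theorem: S = Q diag(w) Q^T with Q orthogonal --
# rotate to the eigenbasis, scale by the eigenvalues, rotate back.
w, Q = np.linalg.eigh(S)
assert np.allclose(Q @ np.diag(w) @ Q.T, S)
assert np.allclose(Q.T @ Q, np.eye(3))

# The singular values are the absolute values of the eigenvalues.
s = np.linalg.svd(S, compute_uv=False)
assert np.allclose(np.sort(np.abs(w)), np.sort(s))
```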


yoyo has succinctly described my intuition for orthogonal transformations in the comments: from polarization you know that you can recover the inner product from the norm and vice versa, so knowing that a linear transformation preserves the inner product ($\langle x, y \rangle = \langle Ax, Ay \rangle$) is equivalent to knowing that it preserves the norm, hence the orthogonal transformations are precisely the linear isometries.
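A minimal numerical illustration of this equivalence (the rotation angle and the vectors are arbitrary):

```python
import numpy as np

theta = 0.7  # arbitrary angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # orthogonal: Q^T Q = I

rng = np.random.default_rng(2)
x = rng.standard_normal(2)
y = rng.standard_normal(2)

# Q preserves the inner product: <Qx, Qy> = <x, y>...
assert np.isclose(np.dot(Q @ x, Q @ y), np.dot(x, y))
# ...and hence (taking y = x) the norm, so Q is an isometry.
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))
```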

I'm a little puzzled by your comment about rotations and reflections because for me a rotation is, by definition, an orthogonal transformation of determinant $1$. (I say this not because I like to dogmatically stick to definitions over intuition but because this definition is elegant, succinct, and agrees with my intuition.) So what intuitive definition of a rotation are you working with here?

As for the transpose and symmetric matrices in general, my intuition here is not geometric. First, here is a comment which may or may not help you. If $A$ is, say, a stochastic matrix describing the transitions in some Markov chain, then $A^T$ is the matrix describing what happens if you run all of those transitions backwards. Note that this is not at all the same thing as inverting the matrix in general.
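To make this concrete, here is a toy two-state chain (the numbers are made up for illustration): the transpose reverses the direction of each transition, which is generally not the same as inverting the matrix.

```python
import numpy as np

# Row-stochastic transition matrix: P[i, j] = probability of moving
# from state i to state j in one step.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# P.T[j, i] = P[i, j]: each forward transition i -> j, read backwards.
# Reversing the transitions is NOT the same as inverting the dynamics:
assert not np.allclose(P.T, np.linalg.inv(P))
```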

A slightly less naive comment is that the transpose is a special case of a structure called a dagger category, which is a category in which every morphism $f : A \to B$ has a dagger $f^{\dagger} : B \to A$ (here the adjoint). The example we're dealing with here is implicitly the dagger category of Hilbert spaces, which is relevant to quantum mechanics, but there's another dagger category relevant to a different part of physics: the $3$-cobordism category describes how space can change with time in relativity, and here the dagger corresponds to just flipping a cobordism upside-down. (Note the similarity to the Markov chain example.) Since relativity and quantum mechanics are both supposed to describe the time evolution of physical systems, it's natural to ask for ways to relate the two dagger categories I just described, and this is (roughly) part of topological quantum field theory.

The punchline is that for me, "adjoint" is intuitively "time reversal." (Unfortunately, what this has to do with self-adjoint operators as observables in quantum mechanics I'm not sure.)

  • @Elliott: the Markov chain example can be understood as follows. Let's say we have a collection of boxes joined by tubes. In these boxes there are particles running around, and let's say each tube has a fixed direction in which it passes particles. The directions of the tubes determine a transition matrix (in a loose sense) describing the possible directions the particles can travel, and taking the transpose of this matrix corresponds to reversing the allowed direction of each tube. What does this have to do with time reversal? Well, if you ran a movie of the particles traveling in reverse... (2011-06-01)