How can we prove that all change-of-basis matrices are invertible? The trivial case of a change of basis for $\mathbb{R}^{n}$ is easily demonstrated using, for example, determinants. But I am struggling to show this rigorously for all bases, for example for a two-dimensional subspace of $\mathbb{R}^{4}$. I am sure that there are many ways to go about this proof, and I would appreciate as many approaches as possible, to back up my intuition!
Rigorously proving that a change-of-basis matrix is always invertible
-
6For something like this you probably should include the definitions that you use. This is because there are different pedagogical presentations of linear algebra that treat different things as definitions and other things as consequences of those definitions. In one of the presentations I am familiar with, a change-of-basis matrix is pretty much by definition invertible (being a square matrix of full rank). – 2011-03-08
-
0@Willie: in my class we acknowledge the matrix to be square, but I haven't heard mention of full rank. I guess it would be a consequence of it being a change-of-basis matrix, but that's not something we spoke of. – 2011-03-08
-
0Any vector space of finite dimension $n$ is isomorphic to $\mathbb{R}^n$, so if you can prove it for $\mathbb{R}^n$ you've proved it for all finite-dimensional vector spaces. In particular, whatever you can do with determinants to show it for $\mathbb{R}^n$, you can do the same thing with respect to arbitrary bases of arbitrary finite-dimensional vector spaces. – 2011-03-08
-
2I agree with Willie. Off the top of my head, here is one way to look at it (which may or may not agree with your setup): view the change of basis matrix as a linear transformation $L$ which carries one basis $v_1,...,v_n$ to a different basis $w_1,...,w_n$. Then there is a unique linear transformation carrying the basis $w_1,...,w_n$ to $v_1,...,v_n$ and the matrix of this transformation must be the inverse to the first change of basis matrix. – 2011-03-08
-
0Maybe this specific example will clarify things a bit: if I am given $A=BC$ where $A$ and $B$ are 4x2 matrices whose columns form bases for a 2-dimensional subspace of $\mathbb{R^{4}}$, and $C$ is a 2x2 change-of-basis matrix, how do we show that $C$ is invertible? Using matrix operations, algebra, or anything else. – 2011-03-08
-
0@Karamislo: please give your definition of "a change-of-basis matrix". Also see http://math.stackexchange.com/questions/21557/prove-that-if-s-is-a-change-of-basis-matrix-its-columns-are-a-basis-for-math – 2011-03-08
-
1What is a change-of-basis matrix except a matrix which is invertible? – 2011-03-08
-
0@wildildildlife: ok, let's say that in the example I cited $C$ is a 2x2 matrix, which when multiplied by a matrix whose columns form a basis produces another matrix whose columns form a basis for the same subspace. How do we show that such a matrix $C$ is invertible (not even calling it change-of-basis)? The question in the link you provided deals with bases for $\mathbb{R}^{n}$, for which case as I said the proof is trivial (also that question is concerned more with proving that the columns of the c-o-b matrix form a basis for a space, which is not true in my question). – 2011-03-08
-
0@Karamislo: Why is the proof trivial for $\mathbb{R}^n$ and not for other finite-dimensional vector spaces? Your answer to that question might also go some way towards answering some of the other questions that've been asked about what you're assuming and what definitions you're using. – 2011-03-08
-
0@joriki: for bases of the whole $\mathbb{R}^{n}$ I can write $A=BC$ where $A$, $B$, and $C$ are nxn matrices, and $A$ and $B$ have basis vectors of $\mathbb{R}^{n}$ as their columns. Then I can say the following: $\det A$ and $\det B$ are nonzero, as the matrices are invertible since their columns are independent. Then $\det C$ can't equal zero either, otherwise the equality $\det A=\det B\det C$ would not hold. So $C$ is invertible. But I can't figure out how to do something of this sort for a 4x2 matrix, for example. – 2011-03-08
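That determinant argument can be sketched numerically; the snippet below (using numpy and a random example in $\mathbb{R}^3$, both the editor's choices rather than anything from the thread) checks $\det A = \det B \det C$:

```python
import numpy as np

rng = np.random.default_rng(0)

# B: columns form a basis of R^3 (a random matrix is almost surely invertible);
# C: another such matrix, so the columns of A = B C form another basis.
B = rng.standard_normal((3, 3))
C = rng.standard_normal((3, 3))
A = B @ C

# det A = det B * det C, so det C cannot vanish and C is invertible.
assert np.isclose(np.linalg.det(A), np.linalg.det(B) * np.linalg.det(C))
assert not np.isclose(np.linalg.det(C), 0.0)
```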
-
1@Karamislo: A change-of-basis matrix is never a $4\times2$ matrix. Irrespective of whether the vector space is $\mathbb{R}^2$ or a two-dimensional subspace of $\mathbb{R}^4$, as long as it's two-dimensional, all its bases have 2 elements, and any matrix representing a change from one basis to another is a $2\times2$ matrix, to which you can apply exactly the same reasoning as if it represented a change of basis in $\mathbb{R}^2$. – 2011-03-08
-
0@joriki: I didn't say it was... I was referring to $A$ and $B$ being 4x2 matrices as in the example I presented before, in which case $A$=$B$$C$ where $C$ is 2x2. I can't invoke the determinant equality here since $A$ and $B$ are not square matrices. – 2011-03-08
-
0@Karamislo: this conversation is going in circles. _What is your definition of a change-of-basis matrix_? – 2011-03-08
-
0@Qiaochu: My definition of a change-of-basis matrix is a matrix $C$ which, when multiplied by a matrix $B$ whose columns form a basis of a certain subspace, produces another matrix $A$ whose columns form a basis for the same subspace. As I showed, I can algebraically prove that this $C$ is invertible when $A$ and $B$ are square, but I can't find a way to do that when $A$ and $B$ are not square, as in the example with 4x2 matrices. – 2011-03-08
-
0Can't you just show that since the columns of the change-of-basis matrix are linearly independent, it is invertible by the fundamental theorem of invertible matrices? – 2013-11-12
1 Answer
What is a change-of-basis matrix? You have a vector space $\mathbf{V}$ (and it doesn't matter if $\mathbf{V}$ is all of $\mathbb{R}^n$, or some subspace thereof, or even something entirely different), and two different ordered bases for $\mathbf{V}$, $\beta_1$ and $\beta_2$ (necessarily of the same size, since two bases of the same vector space always have the same size): \begin{align*} \beta_1 &= \Bigl[ \mathbf{v}_1,\mathbf{v}_2,\ldots,\mathbf{v}_n\Bigr]\\ \beta_2 &= \Bigl[ \mathbf{w}_1,\mathbf{w}_2,\ldots,\mathbf{w}_n\Bigr]. \end{align*} A "change of basis" matrix is a matrix that translates from $\beta_1$ coordinates to $\beta_2$ coordinates. That is, $A$ is a change-of-basis matrix (from $\beta_1$ to $\beta_2$) if, given the coordinate vector $[\mathbf{x}]_{\beta_1}$ of a vector $\mathbf{x}$ relative to $\beta_1$, then $A[\mathbf{x}]_{\beta_1}=[\mathbf{x}]_{\beta_2}$ gives the coordinate vector of $\mathbf{x}$ relative to $\beta_2$, for all $\mathbf{x}$ in $\mathbf{V}$.
How do we get a change-of-basis matrix? We write each vector of $\beta_1$ in terms of $\beta_2$, and these are the columns of $A$: \begin{align*} \mathbf{v}_1 &= a_{11}\mathbf{w}_1 + a_{21}\mathbf{w}_2+\cdots+a_{n1}\mathbf{w}_n\\ \mathbf{v}_2 &= a_{12}\mathbf{w}_1 + a_{22}\mathbf{w}_2 +\cdots + a_{n2}\mathbf{w}_n\\ &\vdots\\ \mathbf{v}_n &= a_{1n}\mathbf{w}_1 + a_{2n}\mathbf{w}_2 + \cdots + a_{nn}\mathbf{w}_n. \end{align*} We know we can do this because $\beta_2$ is a basis, so we can express any vector (in particular, the vectors in $\beta_1$) as linear combinations of the vectors in $\beta_2$.
Then the change-of-basis matrix translating from $\beta_1$ to $\beta_2$ is $$ A = \left(\begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{array}\right).$$
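In coordinates, building $A$ amounts to solving one linear system per column: the $i$th column of $A$ holds the coefficients of $\mathbf{v}_i$ in the basis $\beta_2$. A minimal numerical sketch (the choice of numpy and of a concrete 2-dimensional subspace of $\mathbb{R}^3$ are the editor's, not the answer's):

```python
import numpy as np

# Two bases of the same 2-dimensional subspace of R^3, stored as columns:
# beta_2 = {w_1, w_2} and beta_1 = {v_1, v_2}, with v_1 = w_1 + w_2
# and v_2 = w_1 - w_2.
W = np.array([[1., 0.], [0., 1.], [1., 1.]])
V = np.array([[1., 1.], [1., -1.], [2., 0.]])

# Column i of A holds the coefficients a_{ji} with v_i = sum_j a_{ji} w_j.
# lstsq solves W a = v_i; the solution is exact because v_i lies in span(W).
A, *_ = np.linalg.lstsq(W, V, rcond=None)

assert np.allclose(W @ A, V)                    # each v_i reproduced from the w_j
assert np.allclose(A, [[1., 1.], [1., -1.]])    # the expected coefficients
```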
Why is $A$ always invertible? Because just like there is a change-of-basis from $\beta_1$ to $\beta_2$, there is also a change-of-basis from $\beta_2$ to $\beta_1$. Since $\beta_1$ is a basis, we can express every vector in $\beta_2$ using the vectors in $\beta_1$: \begin{align*} \mathbf{w}_1 &= b_{11}\mathbf{v}_1 + b_{21}\mathbf{v}_2 + \cdots + b_{n1}\mathbf{v}_n\\ \mathbf{w}_2 &= b_{12}\mathbf{v}_1 + b_{22}\mathbf{v}_2 + \cdots + b_{n2}\mathbf{v}_n\\ &\vdots\\ \mathbf{w}_n &= b_{1n}\mathbf{v}_1 + b_{2n}\mathbf{v}_2 + \cdots + b_{nn}\mathbf{v}_n. \end{align*} So the matrix $B$, with $$B = \left(\begin{array}{cccc} b_{11} & b_{12} & \cdots & b_{1n}\\ b_{21} & b_{22} & \cdots & b_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{array}\right),$$ has the property that given any vector $\mathbf{x}$, if $[\mathbf{x}]_{\beta_2}$ is the coordinate vector of $\mathbf{x}$ relative to $\beta_2$, then $B[\mathbf{x}]_{\beta_2}=[\mathbf{x}]_{\beta_1}$ is the coordinate vector of $\mathbf{x}$ relative to $\beta_1$.
But now, consider what the matrix $BA$ does to the standard basis of $\mathbb{R}^n$ (or $\mathbf{F}^n$, in the general case): what is $BA\mathbf{e}_i$, where $\mathbf{e}_i$ is the vector that has a $1$ in the $i$th coordinate and zeros elsewhere? It's a matter of interpreting this correctly: $\mathbf{e}_i$ is the coordinate vector relative to $\beta_1$ of $\mathbf{v}_i$, because $[\mathbf{v}_i]_{\beta_1}=\mathbf{e}_i$. Therefore, since $A[\mathbf{x}]_{\beta_1} = [\mathbf{x}]_{\beta_2}$ and $B[\mathbf{x}]_{\beta_2}=[\mathbf{x}]_{\beta_1}$ for every $\mathbf{x}$, we have: $$BA\mathbf{e}_i = B(A\mathbf{e}_i) = B(A[\mathbf{v}_i]_{\beta_1}) = B[\mathbf{v}_i]_{\beta_2} = [\mathbf{v}_i]_{\beta_1} = \mathbf{e}_i.$$ That is, $BA$ maps $\mathbf{e}_i$ to $\mathbf{e}_i$ for $i=1,\ldots,n$. The only way for this to happen is if $BA=I_n$, the identity. The same argument, now interpreting $\mathbf{e}_i$ as $[\mathbf{w}_i]_{\beta_2}$, shows that $AB$ is also the identity.
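This round-trip argument is easy to check numerically. The sketch below (numpy, with a concrete pair of bases of a 2-dimensional subspace of $\mathbb{R}^4$; both choices are the editor's) builds $A$ and $B$ by solving for coordinates and confirms $BA = AB = I$:

```python
import numpy as np

# Two bases of the same 2-dimensional subspace of R^4, as matrix columns:
V = np.array([[1., 0.], [2., 1.], [1., 0.], [0., 1.]])   # beta_1
W = np.array([[1., 1.], [3., 1.], [1., 1.], [1., -1.]])  # beta_2 (w_i = v_1 +/- v_2)

# A expresses beta_1 in terms of beta_2 (so W @ A = V),
# B expresses beta_2 in terms of beta_1 (so V @ B = W).
A, *_ = np.linalg.lstsq(W, V, rcond=None)
B, *_ = np.linalg.lstsq(V, W, rcond=None)

# Translating coordinates there and back is the identity, in both orders:
assert np.allclose(B @ A, np.eye(2))
assert np.allclose(A @ B, np.eye(2))
```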
So $A$ and $B$ are both invertible.
So every change-of-basis matrix is necessarily invertible.
It doesn't really matter if you are considering a subspace of $\mathbb{R}^n$, a vector space of polynomials or functions, or any other vector space. So long as it is finite-dimensional (so that you can define the "change-of-basis" matrix), change-of-basis matrices are always invertible.
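For instance, on the space of polynomials of degree at most $2$ one can compare the bases $\{1, x, x^2\}$ and $\{1, 1+x, (1+x)^2\}$; the sketch below (numpy, the editor's choice of example) encodes polynomials by their coefficient vectors in $(1, x, x^2)$:

```python
import numpy as np

# Columns: 1, 1+x, (1+x)^2 = 1 + 2x + x^2, written in the basis (1, x, x^2).
C = np.array([[1., 1., 1.],
              [0., 1., 2.],
              [0., 0., 1.]])

# C translates {1, 1+x, (1+x)^2}-coordinates into {1, x, x^2}-coordinates;
# translating the other way is its inverse, so C must be invertible.
C_inv = np.linalg.inv(C)
assert np.allclose(C @ C_inv, np.eye(3))

# Sanity check: x^2 = 1 - 2(1+x) + (1+x)^2, so the coordinates of x^2 in the
# second basis are (1, -2, 1).
assert np.allclose(C_inv @ np.array([0., 0., 1.]), [1., -2., 1.])
```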
Added. I just saw the comment where you give the definition you have of change-of-basis matrix: a matrix $C$ which, when multiplied by a matrix $B$ whose columns form a basis of a certain subspace, produces another matrix $A$ whose columns form a basis for the same subspace.
This matrix $C$ is just the matrix that expresses the columns of $A$ in terms of the columns of $B$. That is, it's the change-of-basis matrix from "columns-of-A" coordinates to "columns-of-B" coordinates.
For example, take the subspace of $\mathbb{R}^4$ given by $x=z$ and $y=w$, with basis $$\left(\begin{array}{c}1\\0\\1\\0\end{array}\right),\quad\left(\begin{array}{c}0\\1\\0\\1\end{array}\right),$$ and now consider the same space, but with basis $$\left(\begin{array}{c}1\\1\\1\\1\end{array}\right),\quad \left(\begin{array}{r}1\\-2\\1\\-2\end{array}\right).$$ The matrix $C$ such that $$ \left(\begin{array}{rr} 1 & 1\\ 1 & -2\\ 1 & 1\\ 1 & -2 \end{array}\right) = \left(\begin{array}{cc} 1 & 0\\ 0 & 1\\ 1 & 0\\ 0 & 1 \end{array}\right)C$$ is obtained by writing each vector in the columns of $A$ in terms of the columns of $B$: \begin{align*} \left(\begin{array}{r} 1\\1\\1\\1\end{array}\right) &= 1\left(\begin{array}{c}1\\ 0\\ 1\\ 0\end{array}\right) + 1\left(\begin{array}{c}0 \\ 1 \\ 0 \\ 1\end{array}\right),\\ \left(\begin{array}{r} 1\\ -2\\ 1\\ -2\end{array}\right) &= 1\left(\begin{array}{c}1\\0\\1\\0\end{array}\right) -2\left(\begin{array}{c}0\\1\\0\\1\end{array}\right). \end{align*} And so, the matrix $C$ is $$C = \left(\begin{array}{rr} 1 & 1\\ 1 & -2 \end{array}\right).$$ Expressing the columns of $B$ in terms of the columns of $A$ gives the inverse: \begin{align*} \left(\begin{array}{c}1\\ 0\\ 1\\ 0\end{array}\right) &= \frac{2}{3}\left(\begin{array}{c}1 \\ 1\\ 1\\ 1\end{array}\right) + \frac{1}{3}\left(\begin{array}{r}1 \\ -2\\ 1\\ -2\end{array}\right)\\ \left(\begin{array}{c}0\\ 1\\ 0\\ 1\end{array}\right) &= \frac{1}{3}\left(\begin{array}{c} 1\\ 1\\ 1\\ 1\end{array}\right) -\frac{1}{3}\left(\begin{array}{r}1\\ -2\\ 1\\ -2\end{array}\right), \end{align*} so the inverse of $C$ is: $$C^{-1} = \left(\begin{array}{rr} \frac{2}{3} & \frac{1}{3}\\ \frac{1}{3} & -\frac{1}{3} \end{array}\right),$$ which you can verify by multiplying by $C$.
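The worked example can be verified numerically; the snippet below (numpy, the editor's choice) checks both $A = BC$ and the computed inverse:

```python
import numpy as np

B = np.array([[1., 0.], [0., 1.], [1., 0.], [0., 1.]])    # first basis
A = np.array([[1., 1.], [1., -2.], [1., 1.], [1., -2.]])  # second basis
C = np.array([[1., 1.], [1., -2.]])                       # change of basis

# A = B C: each column of A is the stated combination of the columns of B.
assert np.allclose(A, B @ C)

# The inverse computed above, from writing B's columns in terms of A's:
C_inv = np.array([[2/3, 1/3], [1/3, -1/3]])
assert np.allclose(C @ C_inv, np.eye(2))
assert np.allclose(B, A @ C_inv)
```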
-
0@Arturo Magidin: Nice explanation -- but I don't understand why you brought the standard basis of $\mathbb{R}^n$ into it -- why not just say that since $A$ translates $\beta_1$ coordinates into $\beta_2$ coordinates and $B$ translates $\beta_2$ coordinates into $\beta_1$ coordinates, then applying first $A$ and then $B$ translates $\beta_1$ coordinates into $\beta_2$ coordinates and back to $\beta_1$ coordinates, and thus $BA$ must be the identity matrix (and likewise for $AB$)? – 2011-03-08
-
2@joriki: Actually, now I remember why: Since you are dealing with the matrices $A$ and $B$, you would have to think about the transformation that maps from $\mathbf{V}$ to $\mathbb{R}^n$ via "coordinate vector", then map to $\mathbb{R}^n$ via $BA$, then map back to $\mathbf{V}$ via "what the coordinate vector means", so I thought it would be a bit more cumbersome than just directly seeing what the matrices $AB$ and $BA$ do to a basis for $\mathbb{R}^n$. – 2011-03-08
-
0@Arturo Magidin: I don't understand. The thought of 'a transformation that maps from $\mathbf{V}$ to $\mathbb{R}^n$ via "coordinate vector"' never crosses my mind when I think about this -- all you need is the fact that the coordinates of a vector in a basis are unique -- then if $BA$ translates from one set of coordinates to another and back, it has to be the identity. To my mind, mentioning "the standard basis of $\mathbb{R}^n$" just confuses things, since it sounds as if $\mathbb{R}^n$ is entering as a vector space, whereas actually it's only entering as the set of n-tuples of coordinates. – 2011-03-09
-
0@joriki: I disagree that $\mathbb{R}^n$ only enters as "set of n-tuples of coordinates". In fact, the act of writing the coordinate vector *is* a linear transformation from $\mathbf{V}$ to $\mathbb{R}^n$ (in fact, it's how we *prove* that an $n$-dimensional vector space is isomorphic to $\mathbf{F}^n$). If you're not used to these things, you do think about those "translations", and I suspect the OP is not used to them yet. But in any case, it's a disagreement between you and me on how to explain things, not on substance. – 2011-03-09
-
0@Arturo Magidin: I disagree that it's a disagreement :-) I began both my comments with "I don't understand". I've noticed that you're good at explaining things to people who are "not used to these things", and I'm always trying to get better at that; it's quite likely that you're right and I just don't see it yet. I agree that we view a coordinate tuple as a vector when we prove isomorphism, but you're not proving isomorphism here, and I still don't see how a basis of $\mathbb{R}^n$ plays any role here and why it helps to consider the coordinate tuple as anything other than just a tuple. – 2011-03-09
-
0@Arturo Magidin: (Of course we view the tuple as a row "vector" when we multiply it by matrices, but that doesn't rely on bases of $\mathbb{R}^n$ or viewing $\mathbb{R}^n$ as a vector space.) – 2011-03-09
-
0@joriki: The idea for my argument was: "How do we show that $AB$ is the identity? We show that it acts like the identity on a basis" (that's pretty much the same as what you're saying, I believe). What basis? $AB$ is a matrix; we can either "translate" this matrix into a linear transformation and see how it acts on $\beta_1$ (what you are proposing to do); or else we can let $AB$ act on the space it naturally acts on, $\mathbb{R}^n$, and interpret those vectors in some way that makes the action of $AB$ on them clear. You are proposing the former, I did the latter. – 2011-03-09
-
0@Arturo Magidin: I think there's a misunderstanding there. I'm not proposing to see how $AB$ acts on $\beta_1$. My point is that there's no need to mention any basis at this point, neither one of $\mathbb{R}^n$, nor one of $\mathbf{V}$. $AB$ is the identity simply because multiplication by $B$ represents a translation from $\beta_2$ coordinates to $\beta_1$ coordinates and multiplication by $A$ represents a translation from $\beta_1$ coordinates to $\beta_2$ coordinates, so applying them one after the other translates from $\beta_2$ coordinates to $\beta_2$ coordinates, which is the identity. – 2011-03-09
-
0@joriki: Okay; rather than look at specific vectors like I did, you propose looking at a generic vector and simply interpreting it as a "coordinate vector". Sure, that definitely works. I happen to be teaching advanced linear algebra at the moment, and "look at what happens to a basis" is one of our standard tricks (e.g., to show two linear transformations are equal), so I may just have gone down that road by inertia rather than pedagogical concerns. – 2011-03-09