15
$\begingroup$

"Let $P$ be the change-of-basis matrix from a basis $S$ to a basis S' in a vector space $V$. Then, for any vector $v \in V$, we have P[v]_{S'}=[v]_{S} \text{ and hence, } P^{-1}[v]_{S} = [v]_{S'}

Namely, if we multiply the coordinates of $v$ in the original basis $S$ by $P^{-1}$, we get the coordinates of $v$ in the new basis S'." - Schaum's Outlines: Linear Algebra. 4th Ed.

I am having a lot of difficulty keeping these matrices straight. Could someone please help me understand the reasoning behind (what appears to me as) the counter-intuitive naming of $P$ as the change of basis matrix from $S$ to $S'$? It seems like $P^{-1}$ is the matrix which actually changes a coordinate vector in terms of the 'old' basis $S$ to a coordinate vector in terms of the 'new' basis $S'$...

Added:

"Consider a basis $S = \{u_1,u_2,...,u_n\}$ of a vector space $V$ over a field $K$. For any vector $v\in V$, suppose $v = a_1u_1 +a_2u_2+...+a_nu_n$

Then the coordinate vector of $v$ relative to the basis $S$, which we assume to be a column vector (unless otherwise stated or implied), is denoted and defined by $[v]_S = [a_1,a_2,...,a_n]^{T}$. "

"Let $S = \{ u_1,u_2,...,u_n\}$ be a basis of a vector space $V$, and let S'=\{v_1,v_2,...,v_n\} be another basis. (For reference, we will call $S$ the 'old' basis and S' the 'new' basis.) Because $S$ is a basis, each vector in the 'new' basis S' can be written uniquely as a linear combination of the vectors in S; say,

$\begin{array}{c} v_1 = a_{11}u_1 + a_{12}u_2 + \cdots +a_{1n}u_n \\ v_2 = a_{21}u_1 + a_{22}u_2 + \cdots +a_{2n}u_n \\ \cdots \cdots \cdots \\ v_n = a_{n1}u_1 + a_{n2}u_2 + \cdots +a_{nn}u_n \end{array}$

Let $P$ be the transpose of the above matrix of coefficients; that is, let $P = [p_{ij}]$, where $p_{ij} = a_{ji}$. Then $P$ is called the *change-of-basis matrix* from the 'old' basis $S$ to the 'new' basis $S'$." - Schaum's Outline: Linear Algebra, 4th Ed.

I am trying to understand the above definitions with this example:

Basis vectors of $\mathbb{R}^{2}$: $S = \{u_1,u_2\} = \{(1,-2),(3,-4)\}$ and $S' = \{v_1,v_2\} = \{(1,3),(3,8)\}$. The change of basis matrix from $S$ to $S'$ is $P = \left( \begin{array}{cc} -\frac{13}{2} & -18 \\ \frac{5}{2} & 7 \end{array} \right)$.

My current understanding is the following: normally vectors such as $u_1, u_2$ are written with respect to the usual basis, that is, $u_1 = (1,-2) = e_1 - 2e_2$, so $[u_1]_E = (1,-2)$. So actually $[u_1]_S = (1,0)$, and I guess this would be true in general... But I am not really understanding what effect, if any, $P$ is supposed to have on the basis vectors themselves (I think I understand its effect on the coordinates relative to a basis). I guess I could calculate a matrix $P'$ with the effect $P'u_1 = v_1, P'u_2 = v_2, \ldots, P'u_n = v_n$, but would that matrix be of any significance?
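For what it's worth, the $P$ in the example above can be checked numerically. This is just a sketch using numpy (not from the book); the arrays restate the example's bases in standard coordinates:

```python
import numpy as np

# Basis S = {u1, u2} and S' = {v1, v2} from the example above,
# written in standard coordinates, one basis vector per column.
U = np.array([[1, 3],
              [-2, -4]], dtype=float)   # columns are u1, u2
V = np.array([[1, 3],
              [3, 8]], dtype=float)     # columns are v1, v2

# Column i of P holds the S-coordinates of v_i, i.e. solve U @ P = V.
P = np.linalg.solve(U, V)
print(P)                          # [[-6.5 -18. ], [ 2.5   7. ]]

# Sanity check of the theorem: take v = v1, so [v]_{S'} = (1, 0).
# Then P @ [v]_{S'} should give [v]_S, the S-coordinates of v1.
print(P @ np.array([1.0, 0.0]))   # [-6.5  2.5]
```

This agrees with the book's $P$: its columns are exactly the $S$-coordinate vectors of $v_1$ and $v_2$.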

  • 1
    Hmph. Of course, this is merely nomenclature, but for example, *Linear Algebra, 4th Ed.* by Friedberg, Insel, and Spence would call $P$ the "change-of-coordinates/basis matrix from $S'$ to $S$" (as you suggest), or more explicitly, say that $P$ "changes $S'$-coordinates to $S$-coordinates". I use the nomenclature you suggest. (2011-02-13)

4 Answers

12

The situation here is closely related to the following situation: say you have some real function $f(x)$ and you want to shift its graph to the right by a positive constant $a$. Then the correct thing to do to the function is to shift $x$ over to the left; that is, the new function is $f(x - a)$. In essence you have shifted the graph to the right by shifting the coordinate axes to the left.

In this situation, if you have a vector $v$ expressed in some basis $e_1, \ldots, e_n$, and you want to express it in a new basis $Pe_1, \ldots, Pe_n$ (this is why $P$ is called the change of basis matrix), then you multiply the numerical vector $v$ by $P^{-1}$ in order to do this. You should carefully work through some numerical examples to convince yourself that this is correct. Consider, for example, the simple case that $P$ is multiplication by a scalar.
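The scalar case is easy to check numerically. A sketch (my own illustration, using numpy): if the new basis vectors are twice as long, each component of a fixed vector must halve.

```python
import numpy as np

# New basis Pe_i = 2*e_i: the basis vectors are stretched by 2.
P = 2.0 * np.eye(2)

v = np.array([6.0, 4.0])          # components of v in the basis e1, e2

# In the doubled basis the components halve: v = 3*(2e1) + 2*(2e2),
# so we multiply the component vector by P^{-1}, not P.
v_new = np.linalg.inv(P) @ v
print(v_new)                      # [3. 2.]
```

The basis vectors grew (covariant), while the components shrank (contravariant), matching the lesson below.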

The lesson here is that one must carefully distinguish between vectors and the components used to express a vector in a particular basis. Vectors transform covariantly, but their components transform contravariantly.

  • 0
    @ghshtalt: yes. (2011-02-13)
5

Everybody studying the change of basis affair should work out some simple examples like the following. Consider this basis in $\mathbb{R}^2$:

$ v_1 = (1,1) \qquad \text{and} \qquad v_2 = (1,-1) \ . $

Or, since we are going to stress the bases and coordinates thing, we could write it this way

$ v_1 = (1,1)_e \qquad \text{and} \qquad v_2 = (1,-1)_e \ , $

since these are coordinates in the standard basis

$ e_1 = (1,0) \qquad \text{and} \qquad e_2 = (0,1) \ . $

The change of basis matrix from $v$ to $e$ is

$ P = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \ . $

Now, take the vector

$ u = 2v_1 - 3v_2 \ . $

Its coordinates in the $v$ basis are:

$ u = (2,-3)_v \ . $

If you want to obtain its coordinates in the $e$ (standard) basis, you can do it by hand:

$ u = 2v_1 - 3v_2 = 2(1,1)_e -3(1,-1)_e = (2-3, 2+3)_e = (-1, 5)_e \ . $

Now, you realise that these are exactly the same operations that you do when performing this matrix multiplication:

$ P \begin{pmatrix} 2 \\ -3 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 2 \\ -3 \end{pmatrix} = \begin{pmatrix} 2 - 3 \\ 2 + 3 \end{pmatrix} = \begin{pmatrix} -1 \\ 5 \end{pmatrix} \ . $
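The same multiplication, transcribed into numpy as a quick check (a sketch, mirroring the example above):

```python
import numpy as np

# Columns of P are the standard coordinates of v1 = (1,1) and v2 = (1,-1).
P = np.array([[1, 1],
              [1, -1]], dtype=float)

u_v = np.array([2.0, -3.0])   # coordinates of u = 2*v1 - 3*v2 in the v basis
u_e = P @ u_v                 # coordinates of u in the standard basis
print(u_e)                    # [-1.  5.]
```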

Exercise. Maybe now you could redo yourself the proof of the change of basis theorem: take two arbitrary bases $v$ and $e$ in any vector space, related by

$ v_i = a^1_i e_1 + \cdots + a^n_i e_n \ , \qquad i = 1, \dots , n \ . $

Write down the change of basis matrix from $v$ to $e$ (that is, put the coordinates of the $v$ vectors as columns, like in the previous example):

$ P = \begin{pmatrix} a^1_1 & \dots & a^1_n \\ \vdots & \ddots & \vdots \\ a^n_1 & \dots & a^n_n \end{pmatrix} \ , $

take any vector

$ u = b^1v_1 + \cdots + b^nv_n \ , $

and write down its coordinates in the $v$ basis. Finally, find out its coordinates in the $e$ basis (by hand and with the help of the matrix $P$).

2

One major reason is practical. The matrix that converts coordinates in the new basis into coordinates in the old basis is easy to come by: you just put your new basis vectors (written in old coordinates) as columns of the matrix.

Then to find the matrix going the other way around, you have to compute the inverse of this matrix.

Thus, it makes sense to call the first one $P$, and the second one $P^{-1}$.
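A small numpy sketch of this practical point (reusing the $v_1 = (1,1)$, $v_2 = (1,-1)$ basis from the earlier answer):

```python
import numpy as np

# Building P is trivial: stack the new basis vectors (in old
# coordinates) as columns.
v1 = np.array([1.0, 1.0])
v2 = np.array([1.0, -1.0])
P = np.column_stack([v1, v2])

# P converts new coordinates to old; the other direction
# requires computing the inverse.
P_inv = np.linalg.inv(P)

old = P @ np.array([2.0, -3.0])   # new -> old coordinates
new = P_inv @ old                 # old -> new, back where we started
print(old, new)                   # [-1.  5.] [ 2. -3.]
```

The easy direction gets the simple name $P$; the computed one is $P^{-1}$.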