My linear algebra class isn't particularly rigorous and my professor doesn't really provide much intuition for most of the theorems we learn, either. Because of this, I've made an effort to make sense of the theorems beyond "the math works out to this result."
I'm having difficulty finding intuition for the change of basis matrix. In class, we learned that for some subspace $V$, if we take a non-orthonormal basis, $\mathfrak{B} = \left( \vec{v_1}, \ldots, \vec{v_n} \right)$, and produce an orthonormal basis from it using the Gram-Schmidt process, $\mathfrak{U} = \left( \vec{u_1}, \ldots, \vec{u_n} \right)$, we can write
$$ \begin{pmatrix} | & | & & | \\ \vec{v_1} & \vec{v_2} & \cdots & \vec{v_n} \\ | & | & & | \end{pmatrix} = \begin{pmatrix} | & | & & | \\ \vec{u_1} & \vec{u_2} & \cdots & \vec{u_n} \\ | & | & & | \end{pmatrix} R, $$
where $R$ is (I think) the change of basis matrix: the $i$-th column of $R$ holds the coefficients that express $\vec{v_i}$ as a linear combination of the vectors in $\mathfrak{U}$. For simplicity, I'll call the first matrix $B$ and the second, $U$. From there, it follows that for a vector $\vec{x}$,
$$ \begin{align} \vec{x} = B\left[ \vec{x} \right]_\mathfrak{B} &= U\left[ \vec{x} \right]_\mathfrak{U} \\ UR\left[ \vec{x} \right]_\mathfrak{B} &= U\left[ \vec{x} \right]_\mathfrak{U} \\ \Rightarrow R\left[ \vec{x} \right]_\mathfrak{B} &= \left[ \vec{x} \right]_\mathfrak{U} \end{align} $$
for that same $R$ (cancelling $U$ from both sides is justified because its columns are linearly independent, so $U\vec{a} = U\vec{b}$ implies $\vec{a} = \vec{b}$).
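To double-check this numerically, here's a small sketch (assuming NumPy; the basis vectors are just made up for illustration): it runs Gram-Schmidt on the columns of $B$, records the coefficients in $R$, and verifies both $B = UR$ and $R\left[ \vec{x} \right]_\mathfrak{B} = \left[ \vec{x} \right]_\mathfrak{U}$.

```python
import numpy as np

def gram_schmidt(B):
    """Orthonormalize the columns of B.

    Returns (U, R) with orthonormal columns in U and upper-triangular R
    such that B = U @ R.
    """
    n, k = B.shape
    U = np.zeros((n, k))
    R = np.zeros((k, k))
    for j in range(k):
        w = B[:, j].copy()
        for i in range(j):
            R[i, j] = U[:, i] @ B[:, j]  # coefficient of v_j along u_i
            w -= R[i, j] * U[:, i]       # subtract that component
        R[j, j] = np.linalg.norm(w)      # length of what's left
        U[:, j] = w / R[j, j]
    return U, R

# Columns are the (made-up) basis vectors v_1, v_2, v_3.
B = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
U, R = gram_schmidt(B)

x_B = np.array([2.0, -1.0, 3.0])   # coordinates of x with respect to B
x = B @ x_B                        # the vector itself
x_U = U.T @ x                      # coordinates with respect to U

print(np.allclose(B, U @ R))       # B = UR
print(np.allclose(R @ x_B, x_U))   # R [x]_B = [x]_U
```

Since the columns of $U$ are orthonormal, $\left[ \vec{x} \right]_\mathfrak{U} = U^T \vec{x}$, which is what the last line checks.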
So I think I've found intuition for this matrix $R$ in the context of the first equation: $R$ is sort of like an "un-decomposition" matrix because the columns tell you how to retrieve the vectors from the original basis using the orthonormal basis you created.
However, from a linear transformation standpoint, it doesn't make much sense to me. In my head, I would take my matrix $U$ and transform that into $B$ using $R$, so I would expect the first equation to look like
$$B = R(U)$$
but interestingly enough, it appears that $R$ is being transformed by $U$. So my first question is how do I make sense of this equation from a transformation perspective?
Another thing I'm having trouble understanding is the relationship between the different coordinates of $\vec{x}$, which is the following equation:
$$R\left[ \vec{x} \right]_\mathfrak{B} = \left[ \vec{x} \right]_\mathfrak{U}$$
$R$ is this "un-decomposition" matrix, but it also somehow magically transforms the coordinates of $\vec{x}$ with respect to $\mathfrak{B}$ to coordinates with respect to $\mathfrak{U}$. So, my next question is how come the same matrix $R$ serves both purposes?
My main question overall is how can I interpret this matrix $R$ from a geometric/linear transformation perspective?
Thank you in advance.
EDIT:
After exploring elementary matrices (which my class skipped), I found something very interesting. If we have an elementary matrix $L$ and a matrix of interest $A$, then the rows of $LA$ are linear combinations of the rows of $A$, with the coefficients given by the corresponding row of $L$ (in fact, this holds for any left factor $L$, not just elementary ones). For example, if we had the product
$$ \begin{pmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} $$
the first row of the product would be given by
$$ 1 \cdot \begin{pmatrix} 1 & 2 & 3 \end{pmatrix} - 2 \cdot \begin{pmatrix} 4 & 5 & 6 \end{pmatrix} + 0 \cdot \begin{pmatrix} 7 & 8 & 9 \end{pmatrix} = \begin{pmatrix} -7 & -8 & -9 \end{pmatrix} $$
and so on. This is quite similar to the situation with the first equation, $B = UR$. However, in that equation, what's happening is that we have linear combinations of the columns, as opposed to the rows as with left multiplication by elementary matrices. So, my idea was that the rule should work both ways: just as the left factor tells us how to combine the rows of the right factor, the right factor should tell us how to combine the columns of the left factor. It turns out that my intuition was true. Using the same product as before, the first column should be a combination of the columns of the elementary matrix, with coefficients given by the first column of the second matrix:
$$ 1 \cdot \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + 4 \cdot \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix} + 7 \cdot \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -7 \\ 4 \\ 7 \end{pmatrix} $$
and it is indeed correct!
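Here's a quick NumPy check of both readings of a product like the one above (matrices chosen to match the example):

```python
import numpy as np

L = np.array([[1, -2, 0],
              [0,  1, 0],
              [0,  0, 1]])
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
P = L @ A

# Row reading: row i of LA is a combination of the rows of A,
# with coefficients from row i of L.
first_row = 1 * A[0] + (-2) * A[1] + 0 * A[2]
print(np.array_equal(P[0], first_row))     # True

# Column reading: column j of LA is a combination of the columns of L,
# with coefficients from column j of A.
first_col = 1 * L[:, 0] + 4 * L[:, 1] + 7 * L[:, 2]
print(np.array_equal(P[:, 0], first_col))  # True
```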
This idea also provides intuition for the fact that $(AB)^T = B^TA^T$: in $AB$, the left factor $A$ tells us how to build the rows of $AB$ from the rows of $B$. Transposing turns those rows into columns, and a matrix that builds columns must multiply on the right, so the order of multiplication has to reverse.
Interpreting $R$ like an elementary matrix explains the order of multiplication in $B = UR$: $R$ is multiplied on the right and tells us how to find the column vectors of $B$ as linear combinations of the column vectors of $U$.
Lastly, it gives an idea as to why $R$ is multiplied on the left of our coordinate vectors: each entry of a coordinate vector is the coefficient of the corresponding basis vector. In this particular scenario, the $i$-th entry of $\left[ \vec{x} \right]_\mathfrak{B}$ multiplies the $i$-th column of $B$, and that column is itself a linear combination of the columns of $U$, with coefficients given by the $i$-th column of $R$. Collecting the total coefficient of each $\vec{u_j}$ gives exactly the entries of $R\left[ \vec{x} \right]_\mathfrak{B}$, which is why a single $R$ serves both purposes.
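Writing this last argument out symbolically (just expanding definitions, nothing new): since $B = UR$ means $\vec{v_i} = \sum_j R_{ji}\, \vec{u_j}$, expanding $\vec{x}$ in the basis $\mathfrak{B}$ gives

$$ \vec{x} = \sum_i \left[ \vec{x} \right]_{\mathfrak{B},i}\, \vec{v_i} = \sum_i \left[ \vec{x} \right]_{\mathfrak{B},i} \sum_j R_{ji}\, \vec{u_j} = \sum_j \left( \sum_i R_{ji} \left[ \vec{x} \right]_{\mathfrak{B},i} \right) \vec{u_j} = \sum_j \left( R \left[ \vec{x} \right]_\mathfrak{B} \right)_j \vec{u_j}, $$

so the coefficients of $\vec{x}$ with respect to $\mathfrak{U}$ are exactly the entries of $R \left[ \vec{x} \right]_\mathfrak{B}$.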