
Ok so here's the problem.

$T$ is a nilpotent linear transformation on a finite dimensional vector space. (Let's say $V=\mathbb{R}^{n}$, without loss of generality.)

Fact: $T$ has only $0$ as an eigenvalue and there is a smallest nonzero natural number, $m$, such that $\text{Ker}(T^{m})=V$.

Show that the matrix $A$ of $T$ with respect to the basis derived as follows is upper triangular with $0$'s on the diagonal:

First find a basis for $\text{Ker}(T)$. Then extend that basis to one for $\text{Ker}(T^{2})$. Extend again to a basis of $\text{Ker}(T^{3})$, and so on, until you get a basis for $\text{Ker}(T^{m}) = V$ (of course, $m$ could be smaller than $3$).
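If it helps to see the construction in action, here is a minimal sketch in Python with SymPy; the particular matrix `T` is a hypothetical example I made up (any nilpotent matrix would do):

```python
import sympy as sp

# Hypothetical 4x4 nilpotent example with m = 3 (not part of the problem).
T = sp.Matrix([[0, 1, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 0],
               [0, 0, 1, 0]])

n = T.rows
basis = []                     # basis vectors, collected in the order described
k = 1
while len(basis) < n:
    # take any basis of Ker(T^k) and keep the vectors that add new directions
    for v in (T**k).nullspace():
        if sp.Matrix.hstack(*(basis + [v])).rank() == len(basis) + 1:
            basis.append(v)
    k += 1

P = sp.Matrix.hstack(*basis)   # columns are u_1, ..., u_n
A = P.inv() * T * P            # matrix of T with respect to this basis
print(A)                       # upper triangular with 0's on the diagonal
```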

Just a reminder: the columns of a matrix are the images of the basis vectors. Therefore, if our final basis after the process just explained is $\{u_1, u_2, \ldots, u_n\}$, then the matrix will be $[T(u_1), T(u_2), \ldots, T(u_n)]$.

Please let me know if you can think of anything. This should be pretty simple, but I'm not seeing something.

2 Answers


You were already given some pretty good hints in the question. However, it probably shouldn't have said "find" a basis, because you don't actually need to find it. ${\rm Ker}(T)$ has a basis; let it be $u_1, u_2, \ldots, u_k$. Then ...

  • It's my own personal question. What I posted is definitely the "solution". The problem is that I don't see how the matrix of that form is forced from the given basis. As stated, it should be simple, but I don't see it.

This is a response to user11314's comment to Robert Israel's answer.

Let $n_1 = \dim(\mathrm{ker}(T))$, let $n_1+n_2=\dim(\mathrm{ker}(T^2))$, and so on, until $\dim(\mathrm{ker}(T^m)) = n_1+n_2+\cdots+n_m = \dim(V)$.

Let $u_1,\ldots,u_{n_1}$ be the basis you find for $\mathrm{ker}(T)$, and let $u_{n_1+1},\ldots,u_{n_1+n_2}$ be the vectors that extend it to a basis of $\mathrm{ker}(T^2)$.

The key observation is that if $\mathbf{v}\in\mathrm{ker}(T^k)$, then $T(\mathbf{v})\in\mathrm{ker}(T^{k-1})$, since $T^{k-1}(T(\mathbf{v})) = T^k(\mathbf{v}) = \mathbf{0}$.

So, think about the matrix you get from this basis. The first $n_1$ columns are $0$, since $u_1,\ldots,u_{n_1}$ are all in the kernel.

The next $n_2$ columns correspond to the images of $u_{n_1+1},\ldots,u_{n_1+n_2}$; each of these vectors lies in $\mathrm{ker}(T^2)$ but not in $\mathrm{ker}(T)$, so their images lie in $\mathrm{ker}(T)$ and are therefore expressed using only the vectors $u_1,\ldots,u_{n_1}$. So these columns have $0$s below row $n_1$; in particular, they are zero on and below the diagonal.
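Written out, this just restates the previous paragraph: for $n_1 < j \le n_1+n_2$ we have $T(u_j)\in\mathrm{ker}(T)$, so
$$T(u_j) = c_1 u_1 + \cdots + c_{n_1} u_{n_1}$$
for some scalars $c_i$, and the $j$th column of the matrix is $(c_1,\ldots,c_{n_1},0,\ldots,0)^T$. Since $j > n_1$, every nonzero entry of that column lies strictly above the diagonal.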

The next $n_3$ columns correspond to images of $u_{n_1+n_2+1}$ through $u_{n_1+n_2+n_3}$. Since these images lie in $\mathrm{ker}(T^2)$, they are expressed using $u_1,\ldots,u_{n_1+n_2}$, so the nonzero entries in these columns lie above the main diagonal.

And so on.
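In block form (a sketch, writing $*$ for a possibly nonzero block, where the $(i,j)$ block has size $n_i\times n_j$), the matrix you get is
$$A = \begin{pmatrix} 0 & * & * & \cdots & * \\ 0 & 0 & * & \cdots & * \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & * \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix},$$
which is in particular upper triangular with $0$'s on the diagonal.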

  • Thanks Arturo. I definitely understand what you are saying. Here's my problem with it, though. You say that the $n_2$ columns can be written as a linear combination of $u_1, \dots, u_{n_1}$. We don't know what $u_1, \dots, u_{n_1}$ look like; we only know what the images of these vectors look like ($0$).
  • @user11314: You know they are a basis for $\mathrm{ker}(T)$, and you know that $T(u_{n_1+1})$ lies in $\mathrm{ker}(T)$, so you know that it can be expressed as a linear combination of $u_1,\ldots,u_{n_1}$. That's all you need to show that the matrix is upper triangular: that for each $i$, $T(u_i)$ can be expressed as a linear combination of the vectors $u_1,\ldots,u_{i-1}$ in the basis. You don't need to know what they *are*, because the matrix of $T$ relative to this basis doesn't care what the vectors "are"; it just "cares" how you express $T(u)$ in terms of the basis.
  • I understand the dependencies you are stating, but I feel like the relevant one is still missing. Here's a hypothetical example, hopefully clearing up my misunderstanding. Suppose that each time the basis is expanded we get one new vector, so that there is something in $\mathrm{ker}(T^n)$ which is not in $\mathrm{ker}(T^{n-1})$, and $\mathrm{ker}(T^n) = V$. $u_1, \dots, u_n$ are the basis vectors from the expansion process. $T(u_1) = 0$ since $u_1 \in \mathrm{ker}(T)$. This explains the first column of the matrix. Now supposedly $T(u_2)$ is a vector with only one nonzero component (the first component) and $T(u_2) = c u_1$.
  • I see the dependency, but there's no explanation of why $T(u_2)$ has only one nonzero component. We don't know that $u_1$ has only one nonzero component. All we know about $u_1$ is that $T(u_1) = 0$. Similarly, $T(u_3)$ is supposedly a vector with only two nonzero components (the first and second components) and $T(u_3) = c_1 u_1 + c_2 u_2$. Still, we don't know that $u_1$ has at most one nonzero entry (in the first component) and $u_2$ has at most two nonzero entries. I don't see how we can make any conclusions about the form of the matrix except for the first column.
  • @user11314: I don't understand what your problem is. You want to show that the matrix is upper triangular. That means showing that the only nonzero entries in the $k$th column occur above the diagonal; i.e., that $T(u_k)$, when expressed as a linear combination of the basis, can be expressed using *only* the vectors $u_1,\ldots,u_{k-1}$. If $T(u_2)=cu_1$, then the 2nd column of the matrix is $(c,0,0,\ldots,0)^T$, so all the entries at and below the diagonal are equal to $0$. What *is* the problem? (cont)
  • @user11314: Note: we are not asking what $T(u_2)$ *is*. **We don't care what vector of $\mathbb{R}^n$ it is.** What we care about is how we express it using the basis. We are not looking at the *standard* matrix of $T$; we are looking at the **coordinate matrix** of $T$ with respect to the basis $[u_1,\ldots,u_n]$. We *don't know* that $u_1$ has at most one nonzero entry; we don't care! Because the $k$th column is not the vector $T(u_k)$; the $k$th column is the **coordinate vector** of $T(u_k)$ with respect to the basis $[u_1,\ldots,u_n]$.
  • @user11314: In fact, what you write in the original post is not entirely correct. The correct statement is that the matrix is $$\Bigl[ [T(u_1)]_{\beta}\Bigm|[T(u_2)]_{\beta}\Bigm|\cdots\Bigm|[T(u_n)]_{\beta}\Bigr]$$ where $[\mathbf{w}]_{\beta}$ is the coordinate vector of $\mathbf{w}$ with respect to the basis $\beta$, and here $\beta=[u_1,\ldots,u_n]$. The matrix $[T(u_1)\cdots T(u_n)]$ is the matrix of $T$ from $\mathbb{R}^n$ with basis $[u_1,\ldots,u_n]$ to $\mathbb{R}^n$ with the standard basis, not the matrix of $T$ with respect to $[u_1,\ldots,u_n]$. Is this the source of your confusion? (See the short sketch after these comments.)
  • That clears it up! I was stubbornly working with the standard matrix of $T$. Thanks much!
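To make the distinction in the last few comments concrete, here is a minimal sketch continuing the hypothetical SymPy example from the question above; the columns of `P` are the basis $\beta=[u_1,\ldots,u_n]$ produced by the kernel-extension process:

```python
import sympy as sp

# Same hypothetical nilpotent example as in the sketch above.
T = sp.Matrix([[0, 1, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 0],
               [0, 0, 1, 0]])

# Basis from the kernel-extension process: u1 = e1, u2 = e4 (Ker T),
# then u3 = e2 (Ker T^2), then u4 = e3 (Ker T^3 = V).
P = sp.Matrix([[1, 0, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 1],
               [0, 1, 0, 0]])

# Coordinate matrix of T w.r.t. beta: column k is [T(u_k)]_beta = P^(-1) T u_k.
A = P.inv() * T * P
print(A)        # strictly upper triangular

# The matrix [T(u_1) | ... | T(u_n)] from the original post is T*P:
# inputs in beta-coordinates, outputs in the standard basis.
B = T * P
print(B)        # in general NOT upper triangular (here entry (4,4) is 1)
```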