I am proving by construction that there is some basis in which a nilpotent endomorphism has a Jordan canonical form with only ones on the superdiagonal. I'll write out what I have so far and stop where my problem is, so that you can think about it the way I am.
What I want to prove is:
Theorem
Let $T\in\mathcal{L}(V)$ be an $r$-nilpotent endomorphism, where $V$ is a finite-dimensional vector space over $\mathbb{C}$. There is some basis of $V$ in which the matrix representation of $T$ is block diagonal, with blocks of the form \begin{align*} \left( \begin{array}{cccccc} 0 &1 &0 &0 &\dots &0\\ 0 &0 &1 &0 &\dots &0\\ 0 &0 &0 &1 &\dots &0\\ \vdots &\vdots &\vdots &\vdots &\ddots &\vdots\\ 0 &0 &0 &0 &\dots &1\\ 0 &0 &0 &0 &\dots &0 \end{array} \right) \end{align*} that is, blocks whose entries are all null except for the superdiagonal, which is filled with ones.
Proof
First, since $T$ is an $r$-nilpotent endomorphism we have $T^{r}=0_{\mathcal{L}(V)}$. Write $U_{k}=T^{k}(V)$. For the base case, $U_{1}=T(V)\subseteq V=\operatorname{id}(V)=T^{0}(V)=U_{0}$, and if we suppose that $U_{k}=T^{k}(V)\subseteq T^{k-1}(V)=U_{k-1}$, then $U_{k+1}=T^{k+1}(V)=T(T^{k}(V))\subseteq T(T^{k-1}(V))=T^{k}(V)=U_{k}$. So we have proven by induction on $k$ that $U_{k}=T^{k}(V)\subseteq T^{k-1}(V)=U_{k-1}$ for every $k$. Since $T^{r}=0_{\mathcal{L}(V)}$ and $U_{k}=T(U_{k-1})$, we obtain the chain $\{0_{V}\}=U_{r}\subseteq U_{r-1}\subseteq\dots\subseteq U_{1}\subseteq U_{0}=V$. Moreover, $T(U_{k})=U_{k+1}\subseteq U_{k}$ shows that the $U_{k}$ are $T$-invariant subspaces, and $T(U_{r-1})=U_{r}=\{0_{V}\}$ shows that $U_{r-1}\subseteq\ker T$.
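For concreteness, here is a small illustrative example (not part of the proof, just a hypothetical $T$): take $r=3$ and let $T$ act on $V=\mathbb{C}^{3}$ by $T(e_{1})=0_{V}$, $T(e_{2})=e_{1}$, $T(e_{3})=e_{2}$ in the canonical basis, so its matrix is the single $3\times 3$ block of the theorem. Then the chain of images is \begin{align*} U_{0}=\mathbb{C}^{3}\supseteq U_{1}=\operatorname{span}\{e_{1},e_{2}\}\supseteq U_{2}=\operatorname{span}\{e_{1}\}\supseteq U_{3}=\{0_{V}\}. \end{align*}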
In the same manner, let $W_{k}=\ker T^{k}$, so that $W_{0}=\ker T^{0}=\ker \operatorname{id}=\{0_{V}\}$. It is easy to see that $T(W_{0})=T(\{0_{V}\})=\{0_{V}\}$, therefore $W_{0}\subseteq W_{1}$; moreover $T^{2}(W_{1})=T(T(W_{1}))=T(\{0_{V}\})=\{0_{V}\}$, therefore $W_{1}\subseteq W_{2}$. In general, for any $k$ we have $T^{k+1}(W_{k})=T(T^{k}(W_{k}))=T(\{0_{V}\})=\{0_{V}\}$, therefore $W_{k}\subseteq W_{k+1}$, and we conclude that we have the chain of nested subspaces $\{0_{V}\}=W_{0}\subseteq W_{1}\subseteq\dots\subseteq W_{r-1}\subseteq W_{r}=V$, since $W_{r}=\ker T^{r}=\ker 0_{\mathcal{L}(V)}=V$.
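In the same illustrative example, the kernels grow in the opposite direction: \begin{align*} W_{0}=\{0_{V}\}\subseteq W_{1}=\operatorname{span}\{e_{1}\}\subseteq W_{2}=\operatorname{span}\{e_{1},e_{2}\}\subseteq W_{3}=\mathbb{C}^{3}, \end{align*} and indeed $U_{2}=W_{1}$, in agreement with $U_{r-1}\subseteq\ker T$.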
Since we have a chain of nested subspaces in which the largest is $V$ itself, if we choose a basis for the smallest non-trivial one of them, that is $U_{r-1}$ (supposing $U_{r}\neq U_{r-1}$, which holds automatically when $r$ is the smallest exponent with $T^{r}=0_{\mathcal{L}(V)}$), we can climb the chain, constructing a basis for each larger space by completing the basis we already have, which is always possible.
Now, since $U_{r-1}\subseteq\ker T$, every non-zero vector in $U_{r-1}$ is an eigenvector for the eigenvalue $0$. Then every basis we choose for $U_{r-1}$ is a basis of eigenvectors. To complete this basis $\{u_{i}^{(r-1)}\}$ to a basis of $U_{r-2}$ (supposing $U_{r-1}\neq U_{r-2}$), we can recall that $T(U_{r-2})=U_{r-1}$, therefore every vector in $U_{r-1}$ has a preimage in $U_{r-2}$. Then there are some $u_{i}^{(r-2)}\in U_{r-2}$ (maybe many for each $i$, since $T$ need not be injective) such that $T(u_{i}^{(r-2)})=u_{i}^{(r-1)}$. Note that for fixed $i$ it is not possible that $u_{i}^{(r-2)}=u_{i}^{(r-1)}$: indeed $u_{i}^{(r-1)}$, like every vector of $U_{r-1}$ (these being linear combinations of the basis vectors), is annihilated by $T$, whereas $T(u_{i}^{(r-2)})=u_{i}^{(r-1)}\neq 0_{V}$. Since the preimages are not unique, we choose one and only one for each $i$. It only remains to see that they are linearly independent: take a null linear combination $\alpha_{i}u_{i}^{(r-1)}+\beta_{i}u_{i}^{(r-2)}=0_{V}$ (summation over repeated indices understood) and apply $T$ to both sides: $\alpha_{i}T(u_{i}^{(r-1)})+\beta_{i}T(u_{i}^{(r-2)})=\alpha_{i}0_{V}+\beta_{i}u_{i}^{(r-1)}=\beta_{i}u_{i}^{(r-1)}=0_{V}$. Since the last sum is a null linear combination of linearly independent vectors (they form a basis for $U_{r-1}$), it implies that $\beta_{i}=0$ for every $i$. Therefore the initial expression takes the form $\alpha_{i}u_{i}^{(r-1)}=0_{V}$, and $\alpha_{i}=0$ for every $i$ by the same argument. We conclude that they are linearly independent.
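In the illustrative example, $U_{r-1}=U_{2}=\operatorname{span}\{e_{1}\}$, so we may take $u_{1}^{(2)}=e_{1}$; a valid preimage in $U_{1}$ is $u_{1}^{(1)}=e_{2}$, and the non-uniqueness is visible since $T(e_{2}+\lambda e_{1})=e_{1}$ for every $\lambda\in\mathbb{C}$, which is exactly why we fix one choice.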
At this moment we have $\{u_{i}^{(r-1)},u_{i}^{(r-2)}\}$, a linearly independent set of vectors in $U_{r-2}$. If $\dim U_{r-2}=2\dim U_{r-1}$, then we have finished the construction; if not (i.e. $\dim U_{r-2}\geq 2\dim U_{r-1}+1$), then we have to choose vectors $u_{j}^{(r-2)}$, with $j=\dim U_{r-1}+1,\dots,\dim U_{r-2}-\dim U_{r-1}$, that complete the set to a basis of $U_{r-2}$. Again, as in the construction of the $u_{i}^{(r-2)}$, we recall that $T(U_{r-2})=U_{r-1}$. Therefore, every vector $v_{j}^{(r-2)}$ we choose will satisfy $T(v_{j}^{(r-2)})=\mu_{ji}u_{i}^{(r-1)}$ for some scalars $\mu_{ji}$. But since we want them to be linearly independent of the $u_{i}^{(r-1)}$ and the $u_{i}^{(r-2)}$, we can take them from $\ker T$: we set $u_{j}^{(r-2)}=v_{j}^{(r-2)}-\mu_{ji}u_{i}^{(r-2)}$, and applying $T$ we obtain $T(u_{j}^{(r-2)})=T(v_{j}^{(r-2)})-\mu_{ji}T(u_{i}^{(r-2)})=\mu_{ji}u_{i}^{(r-1)}-\mu_{ji}u_{i}^{(r-1)}=0_{V}$. Then we only need to see that they are linearly independent from the others. Take, again, a null linear combination $\alpha_{i}u_{i}^{(r-1)}+\beta_{i}u_{i}^{(r-2)}+\gamma_{j}u_{j}^{(r-2)}=0_{V}$. First we apply $T$ to both sides: $\alpha_{i}T(u_{i}^{(r-1)})+\beta_{i}T(u_{i}^{(r-2)})+\gamma_{j}T(u_{j}^{(r-2)})=\alpha_{i}0_{V}+\beta_{i}u_{i}^{(r-1)}+\gamma_{j}0_{V}=\beta_{i}u_{i}^{(r-1)}=0_{V}$, and therefore $\beta_{i}=0$ for every $i$, since $\{u_{i}^{(r-1)}\}$ is a basis of $U_{r-1}$. Then the initial expression takes the form $\alpha_{i}u_{i}^{(r-1)}+\gamma_{j}u_{j}^{(r-2)}=0_{V}$. Note that we have two sets of vectors that are in $\ker T$...
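To see the correction step concretely, here is another hypothetical example, this time with $r=2$: let $T$ act on $\mathbb{C}^{3}$ by $T(e_{1})=0_{V}$, $T(e_{2})=e_{1}$, $T(e_{3})=e_{1}$, so that $U_{1}=\operatorname{span}\{e_{1}\}$ and $\dim U_{0}=3\geq 2\dim U_{1}+1$. With $u_{1}^{(1)}=e_{1}$ and the preimage $u_{1}^{(0)}=e_{2}$, the choice $v_{2}^{(0)}=e_{3}$ completes the set to a basis of $U_{0}$, and since $T(e_{3})=1\cdot u_{1}^{(1)}$ (so $\mu_{21}=1$) the corrected vector is $u_{2}^{(0)}=e_{3}-e_{2}$, which indeed satisfies $T(u_{2}^{(0)})=e_{1}-e_{1}=0_{V}$, i.e. $u_{2}^{(0)}\in\ker T$.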
This is the point where I don't see a way to prove that $\alpha_{i}=0$ and $\gamma_{j}=0$ for every $i$ and $j$, in order to conclude that they are linearly independent. Any kind of help (hints more than anything else) would be appreciated.