Originally, I relied on some slick algebraic geometry techniques to understand this result. I've never actually gone through the details of flags before, and it turned out to be kind of tedious. You may be better off working out the details yourself than slogging through what I've written.
Let $F$ be a field, and let $V$ be an $n$-dimensional vector space over $F$. A (complete) flag $\mathbf F$ of $V$ is an ascending chain of $n$ distinct nonzero subspaces of $V$. Fix a basis $e_1, ... , e_n$ for $V$. Consider the flag consisting of the subspaces $V_1 \subseteq \cdots \subseteq V_n$ defined by $V_i = \textrm{span}\{e_1, ... ,e_i\}$.
If $T \in G = \textrm{GL}(V)$ is a vector space automorphism of $V$, then $T(V_1) \subseteq \cdots \subseteq T(V_n)$ is another flag. We say that $T$ preserves $\mathbf{F}$ if $T(V_i) = V_i$ for all $i$. In other words, $$\textrm{span}\{Te_1, ... , Te_i \} = \textrm{span} \{e_1, ... , e_i\}$$ for all $i$.
This is a straightforward exercise:
Lemma: $T$ preserves $\mathbf{F}$ if and only if the matrix of $T$ with respect to the basis $e_1, ... , e_n$ is upper triangular.
It follows that the subset $B$ of $\textrm{GL}(V)$ consisting of those $T$ which preserve $\mathbf F$ is a group, and if we define an isomorphism $\textrm{GL}(V) \rightarrow \textrm{GL}_n(F)$ by sending each $T$ to the matrix of $T$ with respect to the basis $e_1, ... , e_n$, then the image of $B$ is exactly the group of upper triangular invertible matrices.
Our goal is then to show that $B$ is self normalizing. Let $S, T \in G$. If $T \in SBS^{-1}$, then $S^{-1}TS \in B$, so for each $i$,
$$\textrm{span}\{S^{-1}TSe_1, ... , S^{-1}TSe_i \} = \textrm{span}\{e_1, ... , e_i \}$$
or equivalently,
$$\textrm{span}\{TSe_1, ... , TSe_i \} = \textrm{span}\{Se_1, ... , Se_i \}$$
Now, assume that $S$ normalizes $B$. Then for each $T \in B$, and each $i$, we have that
$$\textrm{span} \{TSe_1, ... , TSe_i \} = \textrm{span} \{Se_1, ... , Se_i \}$$
We are done if we can show that $\textrm{span} \{Se_1, ... , Se_i \} = \textrm{span} \{e_1, ... , e_i\}$ for all $i$.
Let's represent $S$ in matrix form, as $(a_{ij})$. Then the condition $\textrm{span}\{ Se_1 \} = \textrm{span} \{TSe_1 \}$ tells us that for any choice of invertible upper triangular matrix $(c_{ij})$, the vectors
$$(c_{11}a_{11} + \cdots + c_{1n}a_{n1}, c_{22}a_{21} + \cdots + c_{2n}a_{n1}, ... , c_{nn}a_{n1} )$$
$$(a_{11}, ... , a_{n1})$$
are proportional. You can argue from here that $a_{21}, ... , a_{n1}$ have to be zero. If you don't believe me, I'll give you the statement of what you need to argue for the cases $n = 2$ and $n =3$, and you can see easily how this generalizes to all $n$.
$n = 2$: if $a, b$ are fixed elements of $F$, and $\alpha, \beta, \gamma$ are any elements of $F$ you want, with the requirement that $\alpha$ and $\gamma$ be nonzero, suppose for every such choice of $\alpha, \beta, \gamma$ that the vectors $(a,b)$ and $(\alpha a + \beta b, \gamma b)$ are proportional. Then $b = 0$.
$n = 3$: if $a, b, c$ are fixed elements of $F$, and $\alpha, \beta, \gamma, \delta, \epsilon, \lambda$ are any elements of $F$ you want, with the requirement that $\alpha, \delta, \lambda$ be nonzero, suppose for every such choice of Greek letters that the vectors
$$(a,b,c) \quad \text{and} \quad (\alpha a + \beta b + \gamma c, \delta b + \epsilon c, \lambda c)$$
are proportional. Then $b = c = 0$.
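If you would rather see one way the argument can go, here is a small Python sketch over $F = \mathbb{Q}$. The function names (`proportional`, `witness`, `apply`) and the particular choice of $T$ (the identity plus a single $1$ just above the diagonal) are my own assumptions for illustration; any other upper triangular choice that breaks proportionality would do just as well.

```python
from fractions import Fraction


def proportional(u, v):
    """True when all 2x2 minors of the pair (u, v) vanish,
    i.e. u and v lie on a common line through the origin."""
    n = len(u)
    return all(u[i] * v[j] == u[j] * v[i]
               for i in range(n) for j in range(i + 1, n))


def apply(T, v):
    """Matrix-vector product T v."""
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]


def witness(v):
    """Given the first column v of S, return an invertible upper triangular T
    with T v not proportional to v, or None if every entry below the first
    is already zero.

    Hypothetical construction: take the last nonzero entry v[k] with k >= 1
    and set T = I + E_{k-1,k}. Then T v agrees with v except that v[k] has
    been added to coordinate k-1, which rules out proportionality.
    """
    n = len(v)
    nonzero = [k for k in range(1, n) if v[k] != 0]
    if not nonzero:
        return None
    k = nonzero[-1]
    T = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    T[k - 1][k] = Fraction(1)
    return T


v = [Fraction(2), Fraction(3), Fraction(0)]
T = witness(v)
print(apply(T, v), proportional(apply(T, v), v))
```

The same construction works over any field, since it only uses the entries $0$ and $1$.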
Once this is out of the way, what you will have proved is that the span of $e_1$ is the same as that of $Se_1$. In other words, $S(V_1) = V_1$. Now pass to the quotient space $V/V_1$. This space has basis $\overline{e_2}, ... , \overline{e_n}$, the images of $e_2, ... , e_n$. We get a new flag on $V/V_1$ corresponding to this basis, namely $\overline{V_i} = V_i/V_1, i = 2, ... , n$. Since $S(V_1) = V_1$, $S$ induces an automorphism $\overline{S}$ on this space. Every automorphism $\overline{T}$ of $V/V_1$ with $\overline{T}(\overline{V_i}) = \overline{V_i}$ for all $i$ is induced by some $T \in B$ (for instance, let $T$ fix $e_1$ and act on $e_2, ... , e_n$ by the matrix of $\overline{T}$), so the assumption that $TS(V_i) = S(V_i)$ for all $T \in B$ tells you that $\overline{T}\, \overline{S}(\overline{V_i}) = \overline{S}(\overline{V_i})$ for all such $\overline{T}$.
Thus we are in the same situation as before, this time in the vector space $V/V_1$, with the basis $\overline{e_2}, ... , \overline{e_n}$, the flag corresponding to that basis, and an automorphism $\overline{S}$ of $V/V_1$ normalizing the group of elements in $\textrm{GL}(V/V_1)$ which preserve the flag $\overline{V_2}, ... , \overline{V_n}$. But $V/V_1$ is of smaller dimension, so by induction, we can conclude here what we intend to conclude for $V$, namely that $\overline{S}(\overline{V_i}) = \overline{V_i}$ for $i = 2, ... , n$.
This is the same as saying that $S(V_i)/V_1 = V_i/V_1$ for such $i$, which implies $S(V_i) = V_i$ by the correspondence theorem for quotient groups, since $S(V_i)$ contains $S(V_1) = V_1$. This completes the proof.
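If you want a sanity check of the final statement, here is a brute-force Python sketch over small finite fields $\mathbb{F}_p$ ($p$ prime). It is not a proof, only a check for tiny $n$ and $p$, and all the names in it (`det`, `mul`, `general_linear_group`, `normalizer_of_borel`) are mine. It uses the fact that $S$ normalizes $B$ exactly when $SB = BS$ as sets, which avoids computing inverses.

```python
from itertools import product


def det(m, p):
    """Determinant mod p by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0] % p
    total = 0
    for j in range(n):
        minor = tuple(row[:j] + row[j + 1:] for row in m[1:])
        total += (-1) ** j * m[0][j] * det(minor, p)
    return total % p


def mul(a, b, p):
    """Matrix product mod p; matrices are tuples of row tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n))
        for i in range(n)
    )


def general_linear_group(n, p):
    """All invertible n x n matrices with entries in {0, ..., p-1}."""
    mats = []
    for entries in product(range(p), repeat=n * n):
        m = tuple(tuple(entries[i * n:(i + 1) * n]) for i in range(n))
        if det(m, p) != 0:
            mats.append(m)
    return mats


def is_upper_triangular(m):
    return all(m[i][j] == 0 for i in range(len(m)) for j in range(i))


def normalizer_of_borel(n, p):
    """Return (G, B, N): GL_n(F_p), its upper triangular subgroup B,
    and the list N of all S in G with SB = BS."""
    G = general_linear_group(n, p)
    B = [m for m in G if is_upper_triangular(m)]
    N = [S for S in G
         if {mul(S, T, p) for T in B} == {mul(T, S, p) for T in B}]
    return G, B, N


for n, p in [(2, 2), (2, 3), (3, 2)]:
    G, B, N = normalizer_of_borel(n, p)
    print(n, p, len(G), len(B), set(N) == set(B))
```

In each case the normalizer found by the search coincides with $B$, as the proof predicts.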