The first step of the proof of the chain rule in Rudin's Principles of Mathematical Analysis (Theorem 5.5, page 105) is as follows:
Theorem. Suppose $f$ is continuous on $[a,b]$, $f'(x)$ exists at some point $x\in[a,b]$, $g$ is defined on an interval $I$ which contains the range of $f$, and $g$ is differentiable at the point $f(x)$. If
$$h(t)=g(f(t))\quad (a\leq t\leq b),$$
then $h$ is differentiable at $x$, and
$$h'(x)=g'(f(x))f'(x).$$

Proof. Let $y=f(x)$. By the definition of the derivative, we have
$$f(t)-f(x)=(t-x)[f'(x)+u(t)],$$
$$g(s)-g(y)=(s-y)[g'(y)+v(s)],$$
where $t\in[a,b]$, $s\in I$, and $u(t)\rightarrow 0$ as $t \rightarrow x$, $v(s) \rightarrow 0$ as $s\rightarrow y$.
[...]
I think I can follow the rest from here, but I don't understand this manipulation. The definition of the derivative gives
$$f'(x)=\lim_{t\rightarrow x} \frac{f(t)-f(x)}{t-x}.$$
I can sort of see what's going on: it looks a little like we're multiplying both sides of the equation by $t-x$, with $u(t)$ there to make that step legitimate, but I can't figure out how.
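To make the "multiply by $t-x$" idea concrete, here is a sketch of how I suspect the step could be read. The explicit formula for $u$ below is my own assumption (Rudin does not write it out in the passage quoted above); it simply treats $u(t)$ as the error between the difference quotient and $f'(x)$:

$$
\begin{aligned}
u(t) &= \frac{f(t)-f(x)}{t-x}-f'(x) \quad (t\in[a,b],\ t\neq x), \qquad u(x)=0 \quad \text{(assumed definition)},\\
(t-x)\,u(t) &= f(t)-f(x)-(t-x)f'(x) \quad (t\neq x),\\
f(t)-f(x) &= (t-x)\,[f'(x)+u(t)] \quad (\text{also true at } t=x, \text{ where both sides are } 0),\\
\lim_{t\rightarrow x}u(t) &= \lim_{t\rightarrow x}\frac{f(t)-f(x)}{t-x}-f'(x) = f'(x)-f'(x) = 0.
\end{aligned}
$$

If this reading is right, no limit is being multiplied by $t-x$ at all: the identity holds for every $t$ by the way $u$ is defined, and the limit definition of $f'(x)$ is used only to conclude that $u(t)\rightarrow 0$. Presumably $v(s)$ would be defined from $g$ and $y$ in the same way.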