$E,F,G$ are assumed to be Banach spaces.
ORIGINAL QUESTION: the rightmost arrow in the diagram above, which I'll call $\omega$, has as its domain $H = L(F,G) \times L(E,F)$. However, $Dg \circ f$ does not have target space $H$. Neither does $Df$. So if this is straightforward composition of maps, then what map precedes $\omega$?
EDIT: With your help I'm getting closer.
(Lemma 1) Let $E,F$ be Banach spaces and $U \subset E$ open. If $f,g:U \to F$ are of class $C^p$ then the map $f+g: U \to F \,\,;\,\, x \mapsto f(x)+g(x)$ is also of class $C^p$, and in particular $D^p(f+g) = D^pf + D^pg$. (This is proved in a pretty straightforward way using induction, and something to note is that the source and target spaces of $D^p(f+g), D^pf$, and $D^pg$ agree, so the summation expression involving iterated derivatives makes perfect sense.)
(Lemma 2) With $E,F,U$ as in the previous lemma, if $f:U \to F$ is a constant function, or the restriction to $U$ of a continuous linear function, then $f$ is smooth. (Again, straightforward - I have no questions about this).
Now to the proof of $\textbf{Theorem 6.5}$ using induction. I know how to prove the case where $p = 1$, so I'll skip to the inductive step.
Suppose $\textbf{Theorem 6.5}$ is true for some $p \geqslant 1$. Let $f,g$ be functions sourced and targeted as in the statement of the theorem, but now of class $C^{p+1}$. Then $Dg,f,Df$ are all of class $C^p$.
The map $\iota_1:L(F,G) \to L(F,G) \times L(E,F) =: H \, \, ; \,\, \lambda \mapsto (\lambda, 0)$ is continuous and linear and thus smooth by the second lemma, so in particular it is of class $C^p$ (note that $H$ has the product topology). Likewise, the map $\iota_2:L(E,F) \to H \,\, ; \,\, \mu \mapsto (0,\mu)$ is of class $C^p$.
Using the inductive hypothesis, the composite $u = \iota_1 \circ Dg \circ f: U \to H$ is of class $C^p$, and the composite $v = \iota_2 \circ Df:U \to H$ is also of class $C^p$.
By the first lemma, the following sum is of class $C^p$:
$w := u+v: U \to H \,\,; \,\, x \mapsto (Dg(f(x)),0)+(0,Df(x)) = (Dg(f(x)),Df(x)) $
Now, $\omega : H \to L(E,G) \,\, ; \,\, (\lambda, \mu) \mapsto \lambda \circ \mu$ is a continuous bilinear map, so it is smooth and necessarily of class $C^p$. Again using the inductive hypothesis, we have that $\omega \circ w : U \to L(E,G) \,\,; \,\, x \mapsto Dg(f(x)) \circ Df(x) = D(g \circ f)(x)$ is of class $C^p$. Since $D(g \circ f)$ is of class $C^p$, the conclusion is that $g \circ f$ is of class $C^{p+1}$. This completes the proof.
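For reference, the inductive step can be summarized as a single factorization. This is only a sketch using the maps $\iota_1, \iota_2, w, \omega$ as I defined them above (they are my notation, not necessarily Lang's):

$$
U \xrightarrow{\; w \;} H \xrightarrow{\; \omega \;} L(E,G), \qquad D(g \circ f) = \omega \circ w = \omega \circ \big( \iota_1 \circ Dg \circ f + \iota_2 \circ Df \big).
$$

So the map that precedes $\omega$ is $w : x \mapsto (Dg(f(x)), Df(x))$, rather than $Dg \circ f$ or $Df$ on its own.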
QUESTION: this proof relied on the smoothness of continuous bilinear maps. Lang also has a proof of this:
I don't understand when he says "This proves the first assertion." Why does
$\frac{\omega(h,k)}{\|(h,k)\|} \to 0$
as $(h,k) \to 0$? All I know is that $\| \omega(h,k) \| \leqslant \| \omega \| \|h\| \|k \|$, but I don't think that is enough, because I don't think
$\frac{\|h\| \|k\| }{\|h\| + \|k\|}$
tends to zero as $(h,k) \to 0$.
