The philosophy of this paper is: if the solutions of $\mathcal A_0u=0$ are nice, and $\mathcal {A}$ is an operator sufficiently close to $\mathcal A_0$, then the solutions of $\mathcal{A}u=0$ are okay. Theorem A is a precise formulation of this philosophy.
Theorem B follows from Theorem A by taking $\mathcal A_0=\Delta$, the Laplacian. First we verify that the solutions of Laplacian (harmonic functions) are nice, namely that (H1) holds. Indeed, if $u$ is harmonic, then $|\nabla u|^2$ is subharmonic (you can just do the computation, or argue by convexity). The sub-mean value property says that $| \nabla u(x)|^{2} \le \frac{1}{|B|} \int_{B} | \nabla u(y)|^2 dy \tag{SUB}$ for any ball $B$ centered at $x$. To see that (1.1) holds, apply (SUB) to any point $x\in Q$, choosing $B$ to be the ball of radius equal to the half the sidelength of $Q$. Then $B\subset 2Q$, and their measures are about the same, so we get $\| \nabla u\|^{2}_{L^{\infty}(Q)} \le \frac{C}{|2Q|} \int_{2Q} | \nabla u(y)|^2 dy$ Hopefully this explains the "For the Laplacian we have the classical estimate..." part.
Then we use Lemma 2.1 to make sure that (H2) holds when the coefficients $\mathcal A$ are sufficiently close to the identity. So much for the proof of Theorem B.
Next, the corollary. We have $\mathrm{div}\, A\nabla u=0$ where $A=A(x)$ has continuous coefficients. We want to prove that $u$ is locally in $W^{1,p}$ for all $p<\infty$. Given any point $x_0$ in the domain, let $A_0=A(x_0)$. Make the linear change of variable $v=u\circ A_0^t$. The chain rule says $\nabla v=A_0 \nabla u$. Hence, $v$ satisfies $\mathrm{div}\, \tilde A\nabla v=0$ where $\tilde A(x)=A(x)A_0^{-1}$.
Note that $\tilde A(x)$ also has continuous coefficients, and $\tilde A(x_0)=I$. By continuity, $\tilde A(x)$ is close to $I$ when $x$ is close to $x_0$. Therefore, Theorem B applies and says that $v$ is locally in $W^{1,p}$. The same holds for $u$, which is just the composition of $v$ with a linear map.
(Added)
If $u$ is harmonic, then each partial derivative such as $u_{x}$ is also harmonic because $\Delta (u_x)=(\Delta u)_{x}=0$. Furthermore, the square of any harmonic function $v$ is subharmonic because $\Delta (v^2)= \mathrm{div}\,\nabla (v^2) = \mathrm{div}\, (2v \nabla v) = 2|\nabla v|^2+2v\, \mathrm{div}\, (\nabla v) =2|\nabla v|^2+2v\, \Delta v = 2|\nabla v|^2\ge 0 $ Finally, $|\nabla u|^2$ is the sum of squares of partial derivatives and therefore is subharmonic.
The authors do not say that their "$u\in W^{1,p}$" results are local. But they are local, unless additional assumptions on boundary regularity are made. One cannot get global regularity in arbitrary bounded domains by cube-based estimates used in this paper. Here is a concrete example: in two dimensions let $\Omega$ be $\{(x,y):0. The function $u(x,y)=\log (x^2+y^2)$ belongs to $W^{1,2}(\Omega)$, which you can check by integrating $|\nabla u|^2\approx 1/x^2$ within $\Omega$. We have $\Delta u=0$, which is the nicest equation one could ask for. However, $u$ does not belong to $W^{1,3}(\Omega)$, which you also can check by integration. Therefore, the global version of Theorem B does not hold (without extra hypotheses on the smoothness of $\partial\Omega$).
By Hölder's inequality, $u\in W^{1,p}$ implies $u\in W^{1,s}$ for any $1\le s, as long as the domain of integration has finite measure. Since only bounded domains are considered here, there is no need to worry about $p<2$.