I am looking at the proof of the following version of the Hille-Yosida theorem given in Ethier and Kurtz's *Markov Processes: Characterization and Convergence*.
Some Definitions.
Here, $A$ is a (multivalued) linear operator on $L$, i.e. a subset $A$ of $L \times L$ with domain $D(A)=\{f:(f,g)\in A \;\text{for some}\;g\}$ and range $R(A)=\{g: (f,g)\in A \; \text{for some}\;f\}$. A subset $A\subset L\times L$ is said to be linear if $A$ is a subspace of $L\times L$. If $A$ is linear, then $A$ is said to be single-valued if $(0,g)\in A$ implies $g=0$; in this case, $A$ is the graph of a linear operator on $L$, also denoted by $A$, so we write $Af=g$ if $(f,g)\in A$. If $A\subset L\times L$ is linear, then $A$ is said to be dissipative if $\Vert \lambda f-g\Vert \ge \lambda \Vert f\Vert $ for all $(f,g)\in A$ and $\lambda >0$; the closure $\bar{A}$ of $A$ is just the closure in $L\times L$ of the subspace $A$. Finally, we define $\lambda - A=\{(f,\lambda f-g):(f,g)\in A\}$ for each $\lambda >0$.
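To make clear how I am reading these definitions, here is my own unpacking of the inverse of $\lambda - A$. This is not quoted from the book, and it assumes that the inverse of a subset of $L\times L$ means the relation with the two coordinates swapped:

$$(\lambda - A)^{-1}=\{(\lambda f-g,\,f):(f,g)\in A\},\qquad D\big((\lambda - A)^{-1}\big)=R(\lambda - A),\qquad R\big((\lambda - A)^{-1}\big)=D(A).$$

If $A$ is linear and dissipative, this inverse is single-valued: if $\lambda f_1-g_1=\lambda f_2-g_2$ with $(f_1,g_1),(f_2,g_2)\in A$, then $(f_1-f_2,\,g_1-g_2)\in A$ by linearity, and dissipativity gives

$$\lambda\Vert f_1-f_2\Vert\le\Vert\lambda(f_1-f_2)-(g_1-g_2)\Vert=0,$$

so $f_1=f_2$. The same inequality applied to a single pair $(f,g)\in A$ with $h=\lambda f-g$ gives $\Vert(\lambda - A)^{-1}h\Vert\le\lambda^{-1}\Vert h\Vert$ for all $h\in R(\lambda - A)$.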
Lemma preceding the theorem.
Now, in the part of the proof below, I don't understand why $\Vert \lambda(\lambda - \bar{A})^{-1}f-f\Vert = \Vert(\lambda - \bar{A})^{-1}g\Vert$, which appears just above (4.2). Also, why is the domain of $(\lambda - \bar{A})^{-1}$ given as $R(\lambda - A_0)$? Shouldn't it be $R(\lambda-\bar{A})$ by definition? Finally, I don't understand why the range is $D(A_0)$ either. I would greatly appreciate it if anyone could explain these points to me.



