I've been able to prove only the special case. Here it is, together with a host of useful theorems and definitions used, directly or indirectly, in its proof (Theorem 6.2). Only a handful of results are proved (Theorem 6.2 being one of them), since the rest were deemed straightforward.
You may also wish to consult some of the sources listed in the following Math Stack Exchange post: The matrix exponential: Any good books?
Definition 1 We shall use $\mathbf{M}$ to denote the class of infinite-dimensional, real-valued matrices as described in the original post. Unless explicitly stated otherwise, we shall use the words matrix, matrices exclusively to denote members of $\mathbf{M}$.
Conventions Capital English letters, possibly adorned with sub-/superscripts, will be used exclusively to designate matrices. Lower-case English letters, possibly adorned with sub-/superscripts, will be used exclusively to designate real numbers. Equalities will always imply existence, for instance if we write: "Let $AB=C$", we mean "Suppose the product of $A$ and $B$, in this order, is well defined and equals $C$" and if we write "... then $\sum_{n\in\mathbb{N}_0}A_n=B$" we mean "... then the series $\sum_{n\in\mathbb{N}_0}A_n$ converges and sums to $B$."
Theorem 1 $\mathbf{M}$ is a real vector space with respect to the operations of scalar multiplication and addition described in the original post and with $0$ (likewise described in the original post) serving as the neutral element w.r.t. matrix addition.
Definition 2
- $|A|:=(|A_{i,j}|)_{i,j\in\mathbb{N}_0}$. 
- $0\leq A$ shall mean that $A=|A|$. 
- $A\leq B$ shall mean that $0\leq B-A$. 
- $A$ is bounded iff $\{A_{i,j}\space:\space i,j\in\mathbb{N}_0\}$ is bounded. 
Theorem 2 Multiplication of matrices
- Multiplicative neutral element. $I$ is the unique neutral element w.r.t. matrix multiplication. 
- Associativity w.r.t. scalar multiplication. If $a\neq0$, $(aB)C=D\iff a(BC)=D\iff B(aC)=D$. If $a=0$, the implications whose hypothesis is $a(BC)=D$ remain valid, but their converses are true only when the product $BC$ is well defined. 
- Distributivity. (i) If $A_1B=C_1$ and $A_2B=C_2$, then $(A_1+A_2)B=C_1+C_2$, (ii) If $BA_1=C_1$ and $BA_2=C_2$, then $B(A_1+A_2)=C_1+C_2$ 
- Associativity. If either $$|A||B|=P\mathrm{\, and\, }P|C|=Q$$ or $$|B||C|=S\mathrm{\, and\, }A|S|=T$$ there's some $D$ such that $$(AB)C=D=A(BC)$$ 
- The Binomial theorem. Let $m\in\mathbb{N}_0$. If $0\leq A,B$ commute (i.e. $AB=C=BA$ for some $C$), and for every $n,k\in\mathbb{N}_0$ such that $n+k\leq m$, $A^nB^k=C_{n,k}$, then
$$(A+B)^{m}=\sum_{n=0}^m\binom{m}{n}C_{n,m-n}$$ 
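To see why the associativity clause carries a hypothesis on absolute values, here is a small counterexample sketch. The matrices below are an illustrative choice of mine, not taken from the original post. Let $A$ have ones in row $0$ and zeros elsewhere, let $C$ have ones in column $0$ and zeros elsewhere, and let $B_{k,k}=1$, $B_{k,k+1}=-1$ with all other entries zero. Then both $(AB)C$ and $A(BC)$ are well defined, but

$$(AB)_{0,j}=\sum_{k=0}^\infty B_{k,j}=\begin{cases}1&,j=0\\1-1=0&,j\geq1\end{cases}\quad\Longrightarrow\quad\left((AB)C\right)_{0,0}=\sum_{j=0}^\infty(AB)_{0,j}=1$$

$$(BC)_{k,0}=\sum_{j=0}^\infty B_{k,j}=1-1=0\quad\Longrightarrow\quad\left(A(BC)\right)_{0,0}=\sum_{k=0}^\infty(BC)_{k,0}=0$$

Neither hypothesis of the clause holds here: $(|A||B|)_{0,j}=2$ for every $j\geq1$ and $(|B||C|)_{k,0}=2$ for every $k$, so the $(0,0)$ entries of $(|A||B|)|C|$ and of $A\,\big||B||C|\big|$ are both divergent series.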
Theorem 3 Scalar multiplication of infinite sequences of matrices
- If $\lim_{n\rightarrow\infty}a_n=b$, $\lim_{n\rightarrow\infty}(a_nC)=bC$. 
- If $\lim_{n\rightarrow\infty}A_n=B$, $\lim_{n\rightarrow\infty} (cA_n)=cB$. 
- If $\sum_{n=0}^\infty A_n=B$, $\sum_{n=0}^\infty cA_n=cB$. 
Theorem 4 Absolute convergence and rearrangement of a series
- If $\sum_{n=0}^\infty A_n=B$ and $A'=(A_0,0,A_1,0,\dots)$, $$\sum_{n=0}^\infty A'_n=B$$ 
- If $\sum_{n=0}^\infty |A_n|=B$, $\sum_{n=0}^\infty A_n=C$ for some $C$. 
- If $\sum_{n=0}^\infty |A_n|=B$ and $(A_{n_i})_{i\in\mathbb{N}_0}$ is a rearrangement of $(A_n)_{n\in\mathbb{N}_0}$, then $\sum_{i=0}^\infty A_{n_i}=\sum_{n=0}^\infty A_n$. 
- If $(A_{i,j})_{i,j\in\mathbb{N}_0}$ is a rearrangement of $(A_n)_{n\in\mathbb{N}_0}$, then $\sum_{i=0}^\infty\sum_{j=0}^\infty|A_{i,j}|=B$ iff $\sum_{n=0}^\infty |A_n|=B$ and then
$$\sum_{i=0}^\infty\sum_{j=0}^\infty A_{i,j}=\sum_{j=0}^\infty\sum_{i=0}^\infty A_{i,j}=\sum_{n=0}^\infty A_n$$ 
Definition 3 Stochastic matrix. $A$ is stochastic iff $0\leq A$ and for all $i\in\mathbb{N}_0$, $\sum_{j=0}^\infty A_{i,j}=1$.
Theorem 5 Stochastic matrices
- If $A, B$ are stochastic, there's a stochastic matrix $C$, such that $AB=C$. 
- If $A$ is stochastic, $e^{tA}=B$ for some $B$ and for all $i,j\in\mathbb{N}_0$, $|B_{i,j}|\leq e^{|t|}$. 
- $e^{tI}=e^tI$ 
Proof of Theorem 5.1
We shall make use of the following lemma, whose proof is left for the reader.
Lemma Let $0\leq A$ and let $(a_n,b_n)$ be a sequence of pairs of natural ($\mathbb{N}_0$) numbers such that $\lim_{n\rightarrow\infty}a_n=\lim_{n\rightarrow\infty}b_n=\infty$. Then $\sum_{i=0}^\infty\sum_{j=0}^\infty A_{i,j}=s$ for some $s\in\mathbb{R}$ iff $\lim_{n\rightarrow\infty}\sum_{i=0}^{a_n}\sum_{j=0}^{b_n}A_{i,j}=t$ for some $t\in\mathbb{R}$, and then $s=t$.
Now, let $A, B$ be stochastic. It is easy to see that $C:=AB$ is well defined and $0\leq C$. All that remains to show is that each of $C$'s rows sums to $1$. Let $i\in\mathbb{N}_0$ be a row index. We need to show that $\sum_{j=0}^\infty\sum_{k=0}^\infty A_{i,k}B_{k,j}=1$. Let $(a_n)_{n\in\mathbb{N}_0}$ be a strictly ascending sequence in $\mathbb{N}_0$ such that $\sum_{j=0}^{a_n}B_{k,j}>1-\frac{1}{n+1}$ for every $k\leq n$ (such a choice is possible since, for each $n$, only finitely many rows of $B$ are involved), and set $b_n:=n$. Then $a_n,b_n\underset{n\rightarrow\infty}{\rightarrow}\infty$ and so, by the lemma, it is enough to show that $\lim_{n\rightarrow\infty}c_n=1$ with $c_n:=\sum_{j=0}^{a_n}\sum_{k=0}^{b_n}A_{i,k}B_{k,j}$. Indeed,
$$\underbrace{\left(1-\frac{1}{n+1}\right)\sum_{k=0}^{b_n}A_{i,k}}_{\underset{n\rightarrow\infty}{\rightarrow}1}\leq c_n=\sum_{k=0}^{b_n}A_{i,k}\sum_{j=0}^{a_n}B_{k,j}\leq\underbrace{\sum_{k=0}^{b_n}A_{i,k}}_{\underset{n\rightarrow\infty}{\rightarrow}1}$$
$\square$
Definition 4 Exponential partial sums. For all $n\in\mathbb{N}_0, t\in\mathbb{R}$, define
- $\exp_n(t):=\sum_{i=0}^n \frac{t^i}{i!}$ 
- $\mathrm{Exp}_n(t):=\exp_n(t) I$ 
Theorem 6 Additivity of the matrix exponential
- $\sum_{n=0}^\infty(\frac{t^n}{n!}A)=e^tA$ 
- If $A$ is stochastic, $e^{sI}e^{tA}=e^{sI+tA}$ 
Proof of Theorem 6.2
Let $A$ be stochastic. Then
$$\begin{align}e^{sI}e^{tA}&=\sum_{n=0}^\infty\left(\frac{s^n}{n!}e^{tA}\right)\\
&=\sum_{n=0}^\infty\sum_{k=0}^\infty\frac{s^n}{n!}\frac{t^k}{k!}A^k\\
&=\sum_{m=0}^\infty\frac{1}{m!}\sum_{n,k\in\mathbb{N}_0\atop n+k=m}\binom{m}{n}(sI)^n(tA)^k\\
&=\sum_{m=0}^\infty\frac{1}{m!}\left(sI+tA\right)^m\\
&=e^{sI+tA}\end{align}$$
$\square$
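As a sanity check on the regrouping step (the third equality), here is the $m=2$ slice of the double series written out. This is only an illustration of the bookkeeping, and uses nothing beyond $IA=AI=A$:

$$\sum_{n,k\in\mathbb{N}_0\atop n+k=2}\frac{s^n}{n!}\frac{t^k}{k!}A^k=\frac{t^2}{2}A^2+stA+\frac{s^2}{2}I=\frac{1}{2!}\left(s^2I+2stA+t^2A^2\right)=\frac{1}{2!}\left(sI+tA\right)^2$$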
Definition 5 The derivative of a function $\mathbb{R}\rightarrow\mathbf{M}$
Let $\phi:\mathbb{R}\rightarrow\mathbf{M}$ be a matrix-valued function on the real line. For every $i,j\in\mathbb{N}_0$ define the function-valued matrix $\Phi$ whose members are real-valued functions on the real line by
$$\Phi_{i,j}:\mathbb{R}\rightarrow\mathbb{R},\space\space\Phi_{i,j}(x):=(\phi(x))_{i,j}$$
If $\Phi_{i,j}$ is differentiable at $x_0\in\mathbb{R}$ for all $i,j\in\mathbb{N}_0$, we say that $\phi$ is differentiable at $x_0$, and its derivative at this point is defined to be the matrix
$$\phi'(x_0):=\left(\Phi'_{i,j}(x_0)\right)_{i,j\in\mathbb{N}_0}$$
Theorem 7 Properties of the matrix exponential
- $e^0=I$ 
- If $e^{tA}=B$ for some $0 < t$, then for every $s\in(-t,t)$ there's some $C_s$ such that $e^{sA}=C_s$. 
- If $0\leq A$ and $e^A=B$ for some $B$, $e^{-A}=C$ for some $C$. 
- If $e^{tA}=B$ for some $0 < t$, the matrix function
$$\phi:\mathbb{R}\rightarrow\mathbf{M},\space\space \phi(s):=e^{sA}$$
is differentiable in the domain $(-t,t)$, and $\phi'(0)=A$. 
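A minimal worked instance of the last item, assuming only Theorem 5.3 ($e^{sI}=e^sI$): take $A=I$, so that $\phi(s)=e^sI$. In the notation of Definition 5,

$$\Phi_{i,j}(s)=\begin{cases}e^s&,i=j\\0&,\mathrm{otherwise}\end{cases}\quad\Longrightarrow\quad\phi'(s)=e^sI,\space\space\phi'(0)=I=A$$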
Definition 6 Infinitesimal generators. An infinitesimal generator is an infinite real-valued matrix $A$ that satisfies the following conditions
i) $A_{i,j}\geq0$ for all $i,j\in\mathbb{N}_0$ with $i\neq j$,
ii) $A_{i,i}=-\sum_{j\neq i}A_{i,j}$ for all $i\in\mathbb{N}_0$
(Infinitesimal generators arise in the theory of probability in the context of continuous-time/discrete-state-space Markov processes.) 
Theorem 8 Infinitesimal generator. If $A$ is a bounded infinitesimal generator, there's some stochastic matrix $B$ and some number $0\leq c$ such that
$$e^{tA}=e^{-ct}e^{ctB}$$
In fact, you may choose any $c\geq\sup_{i\in\mathbb{N}_0}|A_{i,i}|$ and
$$B = \begin{cases} c^{-1} A+I &, 0 < c \\
I&,\mathrm{otherwise} \end{cases}$$
In particular, $e^{tA}$ converges for all $t$.
Proof See the proof of Klenke's Theorem 17.25. $\square$
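A small worked instance may help; the numbers are an illustrative assumption of mine. Let $A_{0,0}=-1$, $A_{0,1}=1$, $A_{1,0}=2$, $A_{1,1}=-2$, with all other entries zero. Then $\sup_{i\in\mathbb{N}_0}|A_{i,i}|=2$, so one may take $c=2$:

$$A=\begin{pmatrix}-1&1&0&\cdots\\2&-2&0&\cdots\\0&0&0&\cdots\\\vdots&\vdots&\vdots&\ddots\end{pmatrix},\quad B=\tfrac{1}{2}A+I=\begin{pmatrix}\frac{1}{2}&\frac{1}{2}&0&\cdots\\1&0&0&\cdots\\0&0&1&\cdots\\\vdots&\vdots&\vdots&\ddots\end{pmatrix}$$

Every row of $B$ is non-negative and sums to $1$, so $B$ is stochastic. As a consistency check to first order in $t$, $e^{-2t}e^{2tB}\approx(1-2t)(I+2tB)\approx I+2t(B-I)=I+tA$, which agrees with the first two terms of $e^{tA}$.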
Theorem 9 The product of an infinitesimal generator and a bounded matrix
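- If $A+I$ is stochastic, $A$ is an infinitesimal generator. Conversely, if $A$ is an infinitesimal generator and $-1\leq A_{i,i}$ for all $i\in\mathbb{N}_0$, $A+I$ is stochastic. 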
- If $A$ is an infinitesimal generator and $0\leq r$, so is $rA$. If $A$ is bounded, so is $rA$. 
- If $A$ is an infinitesimal generator and $B$ is bounded, $AB=C$ for some $C$. If $A$ is bounded then $C$ is bounded. 
- Let $(A^{(n)})_{n\in\mathbb{N}_0}$ be a sequence of infinitesimal generators and let $B$ be bounded. If $\lim_{n\rightarrow\infty}A^{(n)}=C$ and $C$ is an infinitesimal generator, $\lim_{n\rightarrow\infty}(A^{(n)}B)=CB$. 
Proof of Theorem 9.4
Let $i,j\in\mathbb{N}_0$. We need to show that
$$\lim_{n\rightarrow\infty}\sum_{k=0}^\infty A_{i,k}^{(n)}B_{k,j}=\sum_{k=0}^\infty C_{i,k}B_{k,j}$$
Let $n,N\in\mathbb{N}_0$ be arbitrary, $i < N$.
$$\begin{align}\left|\sum_{k=0}^\infty A_{i,k}^{(n)}B_{k,j}-\sum_{k=0}^\infty C_{i,k}B_{k,j}\right|&\leq\underbrace{\left|A_{i,i}^{(n)}-C_{i,i}\right|}_{\underset{n\rightarrow\infty}{\rightarrow}0}\left|B_{i,j}\right|+\sum_{k=0\atop k\neq i}^N\underbrace{\left|A_{i,k}^{(n)}-C_{i,k}\right|}_{\underset{n\rightarrow\infty}{\rightarrow}0}\left|B_{k,j}\right|\\&+\left|\sum_{k=N+1}^\infty(A_{i,k}^{(n)}-C_{i,k})B_{k,j}\right|\end{align}$$
Now, if $0\leq\beta\in\mathbb{R}$ is an upper bound on $|B|$, then, since $A^{(n)}$ and $C$ are infinitesimal generators (so that each of their rows sums to $0$),
$$\begin{align}\left|\sum_{k=N+1}^\infty(A_{i,k}^{(n)}-C_{i,k})B_{k,j}\right|&\leq\left(\sum_{k=N+1}^\infty A_{i,k}^{(n)}\right)\beta+\left(\sum_{k=N+1}^\infty C_{i,k}\right)\beta\\&=-\left(C_{i,i}+\underbrace{A_{i,i}^{(n)}-C_{i,i}}_{\underset{n\rightarrow\infty}{\rightarrow}0}\right)\beta\\&-\left(\sum_{k=0\atop k\neq i}^N C_{i,k}+\sum_{k=0\atop k\neq i}^N(\underbrace{A_{i,k}^{(n)}-C_{i,k}}_{\underset{n\rightarrow\infty}{\rightarrow}0})\right)\beta\\&+\left(\sum_{k=N+1}^\infty C_{i,k}\right)\beta\\&\leq2\left(\sum_{k=N+1}^\infty C_{i,k}\right)\beta+\varepsilon\end{align}$$
where $\varepsilon$ collects the finitely many terms that tend to $0$ as $n\rightarrow\infty$. Both summands in the last expression can be made arbitrarily small: first choose $N$ large enough to make the tail $\sum_{k=N+1}^\infty C_{i,k}$ small, then, for this fixed $N$, choose $n$ large enough to make $\varepsilon$ small.
$\square$
Definition 7 Component-wise limit of a matrix function
Given a set of numbers $\emptyset\neq S\subseteq\mathbb{R}$, a matrix function
$$f:S\rightarrow\mathbf{M}$$
and an accumulation point of $S$, $s\in\mathbb{R}\cup\{\pm\infty\}$ [If $s=\infty$, $s$ is an accumulation point of $S$ iff $S$ has no upper bound. If $s=-\infty$, $s$ is an accumulation point of $S$ iff $S$ has no lower bound],
$$\lim_{t\rightarrow s\atop t\in S}f(t)=A$$
iff for all $i,j\in\mathbb{N}_0$,
$$\lim_{t\rightarrow s\atop t\in S}\left(f(t)\right)_{i,j}=A_{i,j}$$
Theorem 10 Discretization of the limiting process
Under the assumptions of Definition 7,
$$\lim_{t\rightarrow s\atop t\in S}f(t)=A$$
iff for every sequence $(t_n)_{n\in\mathbb{N}_0}$ in $S$ that converges to $s$,
$$\lim_{n\rightarrow\infty}f(t_n)=A$$
Definition 8 Markov semigroup
A matrix function
$$f:[0,\infty)\rightarrow\mathbf{M}$$
is a Markov semigroup iff
- $f(0)=I$ 
- For all $t\in(0,\infty)$, $f(t)$ is stochastic 
- For all $s,t\in[0,\infty)$, $$f(s+t)=f(s)f(t)$$ 
(Markov semigroups arise in probability theory in the context of continuous-time/discrete-state-space Markov processes.)
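For a concrete instance, one candidate is the Poisson semigroup. It is a standard example, stated here as an illustration rather than as part of the original argument, with the convention $0^0=1$ so that $f(0)=I$:

$$\left(f(t)\right)_{i,j}:=\begin{cases}e^{-t}\frac{t^{j-i}}{(j-i)!}&,i\leq j\\0&,\mathrm{otherwise}\end{cases}$$

Each row is non-negative and sums to $e^{-t}\sum_{m=0}^\infty\frac{t^m}{m!}=1$, and the semigroup property follows from the binomial theorem for real numbers: for $i\leq j$,

$$\sum_{k=i}^{j}\left(f(s)\right)_{i,k}\left(f(t)\right)_{k,j}=e^{-(s+t)}\sum_{k=i}^{j}\frac{s^{k-i}\,t^{j-k}}{(k-i)!\,(j-k)!}=e^{-(s+t)}\frac{(s+t)^{j-i}}{(j-i)!}=\left(f(s+t)\right)_{i,j}$$

Here $\lim_{t\downarrow0}\frac{1}{t}(f(t)-I)=A$ with $A_{i,i}=-1$, $A_{i,i+1}=1$ and all other entries zero, which is a bounded matrix whose rows sum to $0$ and whose off-diagonal entries are non-negative.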
Theorem 11 Right-hand derivative of a Markov semigroup
Let $f:[0,\infty)\rightarrow\mathbf{M}$ be a Markov semigroup and let $A$ be an infinitesimal generator, such that
$$\lim_{t\downarrow0}\frac{1}{t}(f(t)-I)=A$$
Then $f$ is right-hand differentiable on $[0,\infty)$ and for all $t\in[0,\infty)$,
$$\mathrm{D}_R(f, t)=Af(t)$$
where $\mathrm{D}_R(f, t)$ is the right-hand derivative of $f$ at $t$.
Proof
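Let $t\in[0,\infty)$. By the semigroup property, $f(t+s)=f(s)f(t)$ for every $0 < s$, so
$$\lim_{s\downarrow0}\frac{1}{s}(f(t+s)-f(t))=\lim_{s\downarrow0}\left(\frac{1}{s}(f(s)-I)\space f(t)\right)=Af(t)$$
The last equality follows from the last part of Theorem 9 together with Theorem 10: for every $0 < s$, $\frac{1}{s}(f(s)-I)$ is an infinitesimal generator (since $f(s)$ is stochastic), and $f(t)$, being either stochastic or equal to $I$, is bounded.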
$\square$