
I'm working through Einsiedler and Ward's book on Ergodic Theory, and in Exercise 2.5.4 they ask to prove the following:

$\lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n = M}^{N - 1} U_T^n f = P_T f.$

Now I'm wondering what this limit really says; it must be stronger than pointwise convergence, since we can pick $M = 0$. If we let $N = 2n$ and $M = n$, then we seem to get more and more terms, but we are adding them up from the "tail". To me it seems that this is a very strong convergence condition, or am I wrong? What does the limit say, intuitively?

1 Answer


First, let me provide the context: we are given a unitary operator $U \in L(H)$ on a Hilbert space $H$ (actually $\|U\| \leq 1$ suffices), and $P$ is the orthogonal projection onto the subspace $V = \{x \in H\,:\, Ux=x\}$ of fixed vectors of $U$. The von Neumann mean ergodic theorem asserts that for every $x \in H$ the Cesàro averages of $U^{k}x$ converge to $Px$ in norm: \[ \left\Vert Px - \frac{1}{n+1} \sum_{k = 0}^{n} U^{k}x\right\Vert \; \xrightarrow{n \to \infty} \; 0. \] In other words, the sequence of operators $\frac{1}{n+1} \sum_{k = 0}^{n} U^{k}$ converges to $P$ in the strong operator topology on $L(H)$. Of course, convergence in the strong operator topology is the same as pointwise convergence of the operators on $H$, but I prefer not to speak of pointwise convergence because this might be confusing for a function space like $H = L^{2}$.
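If a numerical picture helps, here is a minimal sketch (my own toy example, not from the book) in which $H$ is replaced by $\mathbb{R}^{3}$, $U$ rotates the first two coordinates (so it has no nonzero fixed vectors there) and fixes the third, and $P$ is the projection onto the third coordinate; the names `U`, `P`, `x` are just labels for this illustration.

```python
import numpy as np

# Toy model: U rotates the (e1, e2)-plane by a fixed angle and fixes e3,
# so the fixed space V is span(e3) and P is the orthogonal projection onto it.
theta = 1.0                      # any angle that is not a multiple of 2*pi
c, s = np.cos(theta), np.sin(theta)
U = np.array([[c, -s, 0.0],
              [s,  c, 0.0],
              [0.0, 0.0, 1.0]])
P = np.diag([0.0, 0.0, 1.0])

x = np.array([1.0, 2.0, 3.0])
Px = P @ x                       # = (0, 0, 3)

# Cesàro averages (1/(n+1)) * sum_{k=0}^{n} U^k x for growing n.
for n in (10, 100, 1000, 10000):
    avg = np.zeros(3)
    v = x.copy()
    for _ in range(n + 1):
        avg += v
        v = U @ v
    avg /= n + 1
    print(n, np.linalg.norm(avg - Px))   # the error tends to 0
```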

Spoiler: Note that once you have this, the exercise is already solved, because you can multiply by $U^{M}$ inside the norm and write $N = n+M$ (or preferably do this backwards).
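To spell the spoiler out in formulas (using $U^{M}Pf = Pf$, because $Pf$ is a fixed vector, and $\|U^{M}\| \leq 1$): \[ \left\Vert \frac{1}{N-M}\sum_{n=M}^{N-1} U^{n}f - Pf \right\Vert = \left\Vert U^{M}\Bigl(\frac{1}{N-M}\sum_{k=0}^{N-M-1} U^{k}f - Pf\Bigr) \right\Vert \leq \left\Vert \frac{1}{N-M}\sum_{k=0}^{N-M-1} U^{k}f - Pf \right\Vert, \] and the right-hand side tends to $0$ as $N - M \to \infty$ by the mean ergodic theorem above (applied with $n + 1 = N - M$).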


I'm not sure what kind of answer you're looking for. Formally, the result looks stronger but it is equivalent, as you noticed and I argued in the spoiler above.

I think you're already pretty close to the intuition I have about this. Let $S_{m}^{n} = \frac{1}{n-m+1}\sum_{k = m}^{n} U^{k}$ with $n \geq m$ (the $+1$ is rather immaterial, but it should become clear from what I say below why I prefer to include it). This is a triangular double sequence of operators in $L(H)$. As you say, taking $m = \text{const}$ you have convergence $\|S_{m}^{n}f - Pf\| \xrightarrow{n \to \infty} 0$. Requiring only $(n-m) \to \infty$ means that eventually only the growing length of the averaging window matters, not where it sits.
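Numerically, the "only the window length matters" point looks like this in the same toy model as above (again my own illustration; `S` and `L` are just names here), with windows $\{m, \dots, n\} = \{L, \dots, 2L-1\}$ that slide away while their length grows, as in your $M = n$, $N = 2n$ example:

```python
import numpy as np

# Same toy model as above: U rotates the (e1, e2)-plane and fixes e3.
theta = 1.0
c, s = np.cos(theta), np.sin(theta)
U = np.array([[c, -s, 0.0],
              [s,  c, 0.0],
              [0.0, 0.0, 1.0]])
P = np.diag([0.0, 0.0, 1.0])
x = np.array([1.0, 2.0, 3.0])
Px = P @ x

def S(m, n, x):
    """S_m^n x = (1/(n-m+1)) * sum_{k=m}^{n} U^k x."""
    v = np.linalg.matrix_power(U, m) @ x
    avg = np.zeros_like(v)
    for _ in range(n - m + 1):
        avg += v
        v = U @ v
    return avg / (n - m + 1)

# Windows {L, ..., 2L-1}: the starting point L runs off to infinity, but only
# the length L of the window controls the error ||S_m^n x - Px||.
for L in (10, 100, 1000, 10000):
    print(L, np.linalg.norm(S(L, 2 * L - 1, x) - Px))
```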


For me a slightly more abstract stance is helpful, but this may be due to the fact that I've thought about amenability too much. I'll give a brief account of this point of view anyway:

The Cesàro averages come from probability measures on the abelian semigroup $\mathbb{N}$, namely $\mu_{n} = \frac{1}{n+1}\sum_{k=0}^{n} \delta_{k}$. The shift $Sk = k+1$ on $\mathbb{N}$ acts on these probability measures, and \[ \|S\mu_{n} - \mu_{n}\| = \left\|\frac{1}{n+1} (\delta_{n+1} - \delta_{0})\right\| \leq \frac{2}{n+1} \to 0, \] where the norm is the total variation norm (or $\ell^{1}$-norm). In general, a sequence $(\lambda_{n})$ (or even a net) of probability measures on $\mathbb{N}$ is called approximately invariant (or a Reiter sequence) if $\|S\lambda_{n} - \lambda_{n}\| \to 0$.

On the other hand, the semigroup $\mathbb{N}$ acts on $H$ by $n \ast x = U^{n}x$. By pushing probability measures forward via the orbit map, we get an action on $H$ by the convolution semigroup $P(\mathbb{N})$ of probability measures on $\mathbb{N}$. Explicitly, $\mu \ast x = \sum_{n \in \mathbb{N}} \mu(n) U^{n} x$.
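In the same finite-dimensional toy setting the convolution action is easy to write down; a minimal sketch (the function name `act` and the representation of $\mu$ as an array of weights are my choices for this illustration):

```python
import numpy as np

def act(mu, U, x):
    """mu * x = sum_n mu[n] * U^n x, for a finitely supported probability
    measure mu on N given as its array of weights (mu[0], mu[1], ...)."""
    v = np.asarray(x, dtype=float)
    result = np.zeros_like(v)
    for weight in mu:
        result += weight * v
        v = U @ v
    return result

# With uniform weights on {0, ..., n} this reproduces the Cesàro average mu_n * x,
# and with uniform weights on {m, ..., n} (zeros before m) it reproduces S_m^n x.
```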

Let $(\lambda_{n})$ be an approximately invariant sequence (or net) of probability measures. Then \[ \|\lambda_{n} \ast (Ux - x)\| = \| (S\lambda_{n}) \ast x - \lambda_{n} \ast x\| \leq \|S\lambda_{n} - \lambda_{n}\|\,\| x\|\;\xrightarrow{n \to \infty} \;0 \] for all $x \in H$. Let $W$ be the closed linear span of the vectors $Ux - x$. We have just seen that $\lambda_{n} \ast w \to 0$ for all $w$ in a dense subspace of $W$, hence for all $w \in W$.
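In case it helps to see the display unpacked: the equality uses $\lambda_{n} \ast (Ux) = (S\lambda_{n}) \ast x$, and the inequality is the triangle inequality together with $\|U^{k}\| \leq 1$: \[ \lambda_{n} \ast (Ux) = \sum_{k} \lambda_{n}(k)\, U^{k+1}x = (S\lambda_{n}) \ast x, \qquad \bigl\Vert (S\lambda_{n} - \lambda_{n}) \ast x \bigr\Vert \leq \sum_{k} \bigl|(S\lambda_{n} - \lambda_{n})(k)\bigr|\, \Vert U^{k}x\Vert \leq \Vert S\lambda_{n} - \lambda_{n}\Vert\, \Vert x \Vert. \]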

Moreover, it is not difficult to check that $W^{\perp} = V = \{x \,:\,x = Ux\}$. Indeed, if $y \in W^{\perp}$ then $0 = \langle y, x - Ux \rangle = \langle y - U^{\ast}y, x\rangle$ for all $x \in H$, so $y = U^{\ast}y$ and hence $Uy = y$ because $U$ is unitary. Therefore $V \supset W^{\perp}$ and the other inclusion is clear.

For every $x \in H$ we have $x = Px + (1-P)x$ with $Px \in V$ and $(1-P)x \in W$. Therefore $\lambda_{n} \ast x = \lambda_{n} \ast Px + \lambda_{n} \ast (1-P)x = Px + \lambda_{n} \ast (1-P)x \to Px$ (using $U^{k}Px = Px$ and $\sum_{k}\lambda_{n}(k) = 1$), and we have proved the following version of the mean ergodic theorem:

Theorem. If $(\lambda_{n})$ is an approximately invariant sequence (or net) of probability measures on $\mathbb{N}$, then $\lambda_{n} \ast x = \sum_{k \in \mathbb{N}} \lambda_{n}(k) U^{k}x$ converges to $Px$ in norm, for every $x \in H$.


Finally, I come back to your specific question. The operators $S_{m}^{n}$ are easily seen to arise from the probability measures $\lambda_{m}^{n} = \frac{1}{n-m+1} \sum_{k=m}^{n} \delta_{k}$. These form a net once the index pairs $(m, n)$ are directed by the window length $n - m$, and this net is approximately invariant precisely because we let $(n-m) \to \infty$.
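Explicitly, the same computation as for the Cesàro measures gives \[ \Vert S\lambda_{m}^{n} - \lambda_{m}^{n} \Vert = \left\Vert \frac{1}{n-m+1} \bigl(\delta_{n+1} - \delta_{m}\bigr) \right\Vert = \frac{2}{n-m+1} \;\to\; 0 \quad \text{as } n - m \to \infty, \] independently of where the window $\{m, \dots, n\}$ sits. So the theorem above applies and yields exactly the limit in your exercise.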