13
$\begingroup$

The book I am working from (Introduction to Set Theory, Hrbacek & Jech) gives a proof of this result, which I can follow as a chain of implications, but which does not make natural, intuitive sense to me. At the end of the proof, I found myself quite confused, and had to look carefully at the build-up to see how the conclusion followed. I get the steps, now - but not the intuition.


The authors took sets $X$ and $Y$ and assumed injections $f: X \rightarrow Y$, $g: Y \rightarrow X$. Since $X \sim g[f[X]]$ and $X \supseteq g[Y] \supseteq g[f[X]]$, and $Y \sim g[Y]$, the authors went to prove the lemma that $A \sim A_1, A \supseteq B \supseteq A_1 \implies A \sim B$.

For the lemma, they defined recursive sequences $\{A_n\}_{n \in \omega}, \{B_n\}_{n \in \omega}$ by $A_0 = A, A_{n+1} = f[A],$ $ B_0 = B, B_{n+1} = f[B].$ Since $A_0 \supseteq B_0 \supseteq A_1 \implies f[A_0] \supseteq f[B_0] \supseteq f[A_1]$, we get from induction that $A_{n+1} \subseteq A_n$. Putting $\{C_n\}_{n \in \omega} = \{A_n - B_n\}_{n \in \omega},$ they noted that $C_{n+1} = f[C_n]$ (since $A_n \supseteq B_n$ inductively, $f[A_n - B_n] = f[A_n] - f[B_n]$). Putting

$ C = \bigcup_{n=0}^\infty C_n, \text{ } D = A - C,$

they noted that $f[C] = \bigcup_{n=1}^\infty C_n$, and that $h(x): A \rightarrow f[C] \cup D$ can be defined by sending $x \in C \rightarrow f(x),$ and $x \in D \rightarrow x$. It is clear that $h$ is one-to-one and onto.

Inductively, for $n > 0$, $C_0 \cap C_n = \varnothing$; it follows that $C_0 \cap \bigcup_{n=1}^\infty C_n = C_0 \cap f[C] = \varnothing$. We know that $C_0 \cup f[C] \cup D$ = A, and since all three sets are disjoint, we may conclude that $f[C] \cup D = A - C_0 = A - (A - B) = B$. Thus, our bijection $h$ maps $A$ to $B$, as we wanted.


What is the intuition here? What were H&J, or Cantor, Bernstein, etc. thinking when they went to prove this - the "high-level" idea? Is there a clearer proof, in the sense of thought process (not necessarily shorter)?

  • 0
    Little remark: If you just care about the result, there's a (in my opinion very beautiful) proof described on the [wiki page](https://en.wikipedia.org/wiki/Cantor–Bernstein–Schroeder_theorem#Proof) on the theorem. (I think it's clearer. Obviously it's also shorter.)2012-10-30

4 Answers 4

13

$\newcommand{\ran}{\operatorname{ran}}$Here’s one way to think about it. Suppose that $y\in\ran f$; then we can pull $y$ back to $f^{-1}(y)\in X$. If $f^{-1}(y)\in\ran g$, we can pull it back to $g^{-1}(f^{-1}(y))\in Y$. If we continue this pulling back, one of two things must happen: either we reach a dead end at a point of $X$ or $Y$ that can’t be pulled back (because it’s in $Y\setminus\ran f$ or $X\setminus\ran g$), or we don’t.

Let $X_0=X\setminus\ran g$, the set of points of $X$ that cannot be pulled back at all, and let $Y_0=Y\setminus\ran f$. More generally, for each $n\in\omega$ let $X_n$ be the set of points of $X$ that can be pulled back exactly $n$ times, and let $Y_n$ be the set of points of $Y$ that can be pulled back exactly $n$ times. Finally, let $X_\omega$ and $Y_\omega$ by the subsets of $X$ and $Y$, respectively whose points can be pulled back infinitely many times.

At this point a sketch helps; it should show the partitions $\{X_n:n\le\omega\}$ of $X$ and $\{Y_n:n\le\omega\}$ of $Y$, and it should include arrows indicating what parts of $X$ get mapped to what parts of $Y$ and vice versa. To avoid having arrows crossing, I’ve taken $X$ and $Y$ apart in the following diagram.

$\begin{array}{} X_0&\overset{f}\longrightarrow&Y_1&\overset{g}\longrightarrow&X_2&\overset{f}\longrightarrow& Y_3&\overset{g}\longrightarrow&X_4&\dots&X_\omega\\ Y_0&\overset{g}\longrightarrow&X_1&\overset{f}\longrightarrow&Y_2&\overset{g}\longrightarrow&X_3&\overset{f}\longrightarrow&Y_4&\dots&Y_\omega \end{array}$

Each of the arrows is a bijection, so I can break up the diagram into $\omega$ self-contained parts. The first two parts are:

$\begin{array}{} X_0&\overset{f}\longrightarrow&Y_1\\ Y_0&\overset{g}\longrightarrow&X_1 \end{array}\qquad \begin{array}{} X_2&\overset{f}\longrightarrow&Y_3\\ Y_2&\overset{g}\longrightarrow&X_3 \end{array}$

Ignoring $X_\omega$ and $Y_\omega$ for the moment, I can rearrange the rest of the diagram to give my a bijection from $X\setminus X_\omega$ to $Y\setminus Y_\omega$:

$\begin{array}{ccc} X_0&\overset{f}\longrightarrow&Y_1\\ X_1&\overset{g^{-1}}\longrightarrow&Y_0\\ X_2&\overset{f}\longrightarrow&Y_3\\ X_3&\overset{g^{-1}}\longrightarrow&Y_2\\ \vdots&\vdots&\vdots\\ X_{2k}&\overset{f}\longrightarrow&Y_{2k+1}\\ X_{2k+1}&\overset{g^{-1}}\longrightarrow&Y_{2k}\\ \vdots&\vdots&\vdots \end{array}$

Finally, I claim that $f[X_\omega]=Y_\omega$: everything in $X_\omega$ can be pulled back infinitely often, so everything in $f[X_\omega]$ can be pulled back infinitely often, and therefore $f[X_\omega]\subseteq Y_\omega$. On the other hand, if $y\in Y_\omega$, then $y$ can be pulled back infinitely often, so it must be possible to pull $f^{-1}(y)$ back infinitely often, and therefore $f^{-1}(y)\in X_\omega$. Thus, $Y_\omega\subseteq f[X_\omega]$ as well. The diagram above can now be completed to show a bijection from $X$ onto $Y$:

$\begin{array}{ccc} X_0&\overset{f}\longrightarrow&Y_1\\ X_1&\overset{g^{-1}}\longrightarrow&Y_0\\ X_2&\overset{f}\longrightarrow&Y_3\\ X_3&\overset{g^{-1}}\longrightarrow&Y_2\\ \vdots&\vdots&\vdots\\ X_{2k}&\overset{f}\longrightarrow&Y_{2k+1}\\ X_{2k+1}&\overset{g^{-1}}\longrightarrow&Y_{2k}\\ \vdots&\vdots&\vdots\\ X_\omega&\overset{f}\longrightarrow&Y_\omega \end{array}$

The bijection is defined piecewise, but that’s no problem.

There are a few details to be filled in to make this a fully rigorous proof, but I think that it does give a reasonable idea of one possible intuition.

Added: Here’s a very rough sketch. Arrows from left to right are (parts of) $f$, and arrows from right to left are (parts of) $g$.

enter image description here

  • 0
    This looks like Eilenberg's Swindle. See http://en.wikipedia.org/wiki/Eilenberg%E2%80%93Mazur_swindle#Other_examples2013-09-20
8

It is a reasonably long and seemingly complicated proof, but actually the idea is very simple (or at least it is in the proof I've seen). We have injections $f:X\to Y$ and $g:Y\to X$. To build a bijection $h:X\to Y$ we need to send each point in $X$ to a unique point in $Y$, making sure we hit everything.

We can start by saying $h(x) = f(x)$. But of course, this doesn't hit everything if f is not surjective. But , for each element $y$ we don't hit in $Y$ (i.e. the set ${Y\setminus{f(X)}}$) there is a unique element $g(y)$ in $X$ which we can make map to $y$. So we edit $h$ such that now $h(x) = \begin{cases} g^{-1}(x) & \mbox{if } g^{-1}(x) \notin f(X) \\ f(x) & \mbox{otherwise} \end{cases}$

So this time we hit everything we didn't get first time from the $g^{-1}(x) $ part. But now we're missing some other parts! (namely the values $f(x)$ where $x \in g(Y\setminus f(X))$ but $f(x) \notin Y\setminus f(X)$.) We can define the sets that we "miss" in each iteration of improving $h$ as $C_n$ and say $C = \displaystyle \bigcup_1^{\infty}C_n$. Then defining $h(x) = \begin{cases} g^{-1}(x) &\mbox{if } g^{-1}(x) \in C \\ f(x) &\mbox{otherwise} \end{cases}$

And by thinking of how we built it up, it kind of makes a bit more sense as to why it is a bijection.

Please note that this isn't the most efficient way to carry out the proof, but I explained it in the way that I would come up with it whilst trying to see why it works, rather than what I'd write down formally. Also please note it's been a long time since I've seen the proof and I may have gotten some of the specifics wrong (please anyone correct me if this is the case!), but this is the right general idea (I hope!)

8

My favorite proof is due to R. H. Cox, and I learned it from the wonderful Handbook of Analysis and its Foundations by Schechter, from which I've taken the illustration below. The basic idea is that an injection from $Y$ to $X$ identifies $Y$ with a subset of $X$. So we can reduce the problem to showing that if we have $Y\subseteq X$ and $f:X\to Y$ is an injection, then $|X|=|Y|$ and this is less confusing. let $C=X\backslash Y$. The sequence of sets $\big(f^n(C))$ is disjoint since $f(X)\subseteq Y$ and $f$ is injective. Let $S=\bigcup_{n=0}^\infty f^n(C)$. Define $g:X\to Y$ by

$g(x) = \begin{cases} f(x)&\mbox{if } x\in S \\ x & \mbox{if }x\notin S. \end{cases}$

It is easily verified that $g$ is a bijection.

Remark: According to Kunen's book on the foundations of mathematics, Dedekind already proved this version of the Cantor-Bernstein theorem in 1887.

enter image description here

  • 0
    @madmatician Basically, in order to make the proof comprehensible in terms of the picture. The proof doesn'tstrictly require it, the intuition IMHO does.2016-08-06
0

When Y is a subset of X, Y can be assumed a proper subset of X otherwise the result is trivial. f is then called a reflection. If Z is a subset of X that contains Y then X is partitioned into Z and the rest of X which can be imagined as a frame of Z. This partitioning is carried over by f to Y, and then again to the image of Y under f and so on. Thus we obtain a sequence of disjoint frames and a sequence of nested subsets of Z. Picture the frames as an infinite stack. f then pushes the stack one level down. So if you take f on the stack of frames and the identity on the subsets you get a 1-1 mapping from X onto Z.

This is the essence of Dedekind's proof mentioned above. cf. my book http://link.springer.com/book/10.1007/978-3-0348-0224-6/page/1