Here is a very brief sketch of how type theory can be studied from the point of view of category theory.
The main idea is that the type system of a programming language is a category: to be more precise,
- types are objects, and
- functions of one variable are morphisms.
It is then clear that type casting, like any other function, is just a morphism in this category.
Still, this interpretation is somewhat unsatisfactory since we frequently deal with functions of many variables. There are two ways around this: we could either define the cartesian product of types, or we could define function types (i.e. currying). Or we could do both. This leads to the notion of a cartesian closed category. The semantics of most progamming languages are not easy to analyse by category-theoretic means, so we should focus on pure functional programming languages with no side effects.
Now, a functor $F : \mathcal{C} \to \mathcal{D}$ comprises the following data:
- a mapping of objects of $\mathcal{C}$ to objects of $\mathcal{D}$, and
- a mapping of morphisms $X \to Y$ in $\mathcal{C}$ to morphisms $F X \to F Y$ in $\mathcal{D}$, such that identities and compositions are preserved.
If $\mathcal{C}$ is a type system for a programming language, then a functor $\mathcal{C} \to \mathcal{C}$ can (sometimes) be regarded as a type constructor. Here are some examples of functors:
The trivial functor $\bot : \mathcal{C} \to \mathcal{C}$ that sends every type to the bottom type $\bot$ and every morphism to the identity map $\textrm{id} : \bot \to \bot$ is a functor. More generally one can construct a constant functor for any type in this way.
The cartesian product functor $- \times -: \mathcal{C} \times \mathcal{C} \to \mathcal{C}$ takes two types to their cartesian product. This can be regarded as the type constructor for pairs.
The list functor $\textrm{List}(-) : \mathcal{C} \to \mathcal{C}$ takes a type $X$ to the type of lists of $X$. This is a functor: its action on morphisms is better known as higher-order function ‘map’.
The last one is an example of a monad. To explain what a monad is, we first need to know what a natural transformation is: given two functors $F, G : \mathcal{C} \to \mathcal{D}$, a natural transformation $\alpha : F \Rightarrow G$ is a family of morphisms $\alpha_X : F X \to G X$ indexed by objects $X$ of $\mathcal{C}$ such that for all morphisms $f : X \to Y$ in $\mathcal{C}$, we have $\alpha_Y \circ F f = G f \circ \alpha_X$ (This is usually expressed by means of a commutative diagram.)
For example, there is a natural transformation from the identity functor to the trivial functor $\bot$, because every type $X$ admits a unique function $X \to \bot$, and the uniqueness means that the naturality condition is automatically satisfied.
A monad is a functor $T : \mathcal{C} \to \mathcal{C}$ together with natural transformations $\eta : \textrm{id}_\mathcal{C} \Rightarrow T$ and $\mu : T^2 \Rightarrow T$ such that, for all types $X$, $\begin{align} \mu_X \circ T \eta_X & = \textrm{id}_X & \mu_X \circ \eta_{T X} & = \textrm{id}_X & \mu_X \circ T \mu_X & = \mu_X \circ \mu_{T X} \end{align}$
We think of monads as constructors for wrapper types via the natural transformation $\eta$. Monads have the property that a multiply-wrapped type can be reduced to a singly-wrapped type in a coherent way, via the natural transformation $\mu$. Concretely, for the list monad, we have:
- $\eta_X : X \to \textrm{List}(X)$ is the function that takes an $x$ and returns the singleton list $\langle x \rangle$.
- $\mu_X : \textrm{List}(\textrm{List}(X)) \to \textrm{List}(X)$ is the function that concatenates lists; so for example, $\mu_\mathbb{N} (\langle \langle 1, 2 \rangle, \langle 3, 4, 5 \rangle \rangle) = \langle 1, 2, 3, 4, 5 \rangle$.
It is easy to check that these satisfy the axioms for a monad.
Finally, we have the notion of an algebra for a monad. If $(T, \eta, \mu)$ is a monad, then a $T$-algebra is an object $X$ together with a morphism $\alpha : T X \to X$ such that $\begin{align} \alpha \circ \eta_X & = \textrm{id}_X & \alpha \circ \mu_X & = \alpha \circ T \alpha \end{align}$
We think of these as being very well-behaved fold functions. For example, the function $\Sigma : \textrm{List}(\mathbb{N}) \to \mathbb{N}$ which sums up a list of numbers makes $\mathbb{N}$ into an algebra for the list monad. One imagines that this associativity property can be used to automatically parallelise certain computations. On the other hand, the averaging function $\textrm{List}(\mathbb{R}) \to \mathbb{R}$ is not an algebra for the list monad because the result depends on how the list is partitioned into sublists.