I think you're getting a handle on the basic idea, but there are a number of details in your statements that aren't quite correct.
> Given a manifold $M$ of dimension $n$, I have that a distribution,
> $\mathcal{D}$, is a sub-bundle (subspace) of the tangent space $TM$
> such that $\mathcal{D}\subset TM$ and $\mathcal{D}_p\subset T_pM$ for
> all $p\in M$.
Yes, a distribution is a subbundle of $TM$. But it's not really correct to say it's a subspace of $TM$. Of course, "subspace" can have different meanings, depending on context. It is true that $\mathcal D$ is a topological subspace of $TM$ if we endow it with the subspace topology, but then that's true of any subset of $TM$, or of any other topological space. On the other hand, it wouldn't make sense to say that $\mathcal D$ is a linear subspace of $TM$, because $TM$ is not a vector space.
The fact that it's a subbundle says everything that needs to be said. By definition, this means that it's a subset of $TM$ that is an embedded submanifold, such that each fiber $\mathcal D_p$ is a linear subspace of $T_pM$, and such that $\mathcal D$ is a vector bundle in its own right when each fiber $\mathcal D_p$ is given the vector space structure that it inherits from $T_pM$. This also implies a few other things, such as that each fiber $\mathcal D_p$ is nonempty, and that the fibers all have the same dimension, which I'll call $r$ below.
> I was also told to think of the distribution as
> $\mathcal{D}=\text{span}\{X_1,\cdots,X_r\}$ where $X_1,\cdots,X_r$ are
> linearly independent vector fields in the tangent space of $M$.
This is not quite true, but it is true locally: each point of $M$ has a neighborhood $U$ such that $TU\cap \mathcal D$ is the span of an $r$-tuple of linearly independent vector fields. More precisely, what this means is that there are $r$ vector fields $X_1,\dots,X_r$ on $U$ such that for each $p\in U$,
$$
\mathcal D_p = \operatorname{span}(X_1|_p,\dots,X_r|_p).
$$
The main thing wrong with your statement is that it might not be possible to find such vector fields globally on $M$.
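To see how the global version can fail, here is one standard example (it is not from your question; I'm adding it only as an illustration). Take $M=S^2$ and $\mathcal D = TS^2$, which is a distribution of rank $2$. A global spanning pair $X_1,X_2$ would have to consist of nowhere-vanishing vector fields, and the hairy ball theorem says that $S^2$ has none. Locally there is no obstruction: on the open upper hemisphere $U=\{(x,y,z)\in S^2 : z>0\}$, using $(x,y)$ as coordinates, we have

$$
\mathcal D_p = \operatorname{span}\left(\frac{\partial}{\partial x}\Big|_p,\ \frac{\partial}{\partial y}\Big|_p\right)\quad\text{for each } p\in U.
$$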
> We also distinguished a distribution from a smooth distribution by
> adding the requirement that vectors must vary smoothly from point to
> point.
"The requirement that vectors must vary smoothly from point to point" is a little too vague to be useful. What vectors? Here's a more precise statement of what it means for $\mathcal D$ to be a smooth distribution:
- Each point of $M$ has a neighborhood $U$ on which there are smooth, linearly independent vector fields $X_1,\dots,X_r$ that span $\mathcal D\cap TU$ in the sense described above.
> Given two vector fields $X,Y \in \mathcal{D}$ with
> $X_p,Y_p\in \mathcal{D}_p$ if $[X,Y]\in \mathcal{D}$ then there exists an
> integral manifold which is a sub-manifold of the distribution; that is $\mathcal{D}$ is integrable. We also say that the distribution
> $\mathcal{D}$ is involutive.
You've got some misplaced quantifiers here. It should say:
- We say $\mathcal D$ is involutive if it has the following property: for all smooth vector fields $X,Y$ on $M$ that satisfy $X_p,Y_p\in \mathcal D_p$ for each $p\in M$, we also have $[X,Y]_p\in \mathcal D_p$ for each $p\in M$.
- If $\mathcal D$ is involutive, then through every point of $M$ there is an integral manifold of $\mathcal D$, that is, an $r$-dimensional submanifold $S\subseteq M$ whose tangent space at each point $q\in S$ is equal to $\mathcal D_q$. (This is the Frobenius theorem.)
(It's not correct to say that an integral manifold is a "submanifold of the distribution," since the distribution is a subset of $TM$ while an integral manifold is a subset of $M$.)
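If it helps to experiment, the involutivity condition can be spot-checked numerically. The following is only a rough sketch under some assumptions of mine: $M=\mathbb R^3$, the distribution has rank $2$ and is spanned by two explicitly given vector fields, and derivatives are approximated by central differences. All the names here (`lie_bracket`, `bracket_in_span`, `contact_X`, and so on) are made up for this illustration. It relies on the standard fact that it suffices to check brackets of a local spanning frame, and it tests whether $[X,Y]_p$ lies in $\operatorname{span}(X_p,Y_p)$ by checking that a $3\times 3$ determinant vanishes. Passing at a few sample points is evidence, not a proof.

```python
# Spot-check involutivity of a rank-2 distribution on R^3 spanned by X and Y.
# Vector fields are plain functions taking a point [x, y, z] and returning
# a list of 3 components. All names are hypothetical, for illustration only.

def lie_bracket(X, Y, p, h=1e-5):
    """Approximate [X, Y] at p using
    [X, Y]^i = sum_j (X^j dY^i/dx^j - Y^j dX^i/dx^j),
    with partial derivatives taken by central differences."""
    n = len(p)
    Xp, Yp = X(p), Y(p)

    def partial(F, j):
        # Central-difference approximation of dF/dx^j at p.
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return [(a - b) / (2 * h) for a, b in zip(F(plus), F(minus))]

    bracket = [0.0] * n
    for j in range(n):
        dX, dY = partial(X, j), partial(Y, j)
        for i in range(n):
            bracket[i] += Xp[j] * dY[i] - Yp[j] * dX[i]
    return bracket


def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def bracket_in_span(X, Y, p, tol=1e-6):
    """True if [X, Y]_p is (numerically) in span(X_p, Y_p).
    Assumes X_p and Y_p are linearly independent."""
    return abs(det3(X(p), Y(p), lie_bracket(X, Y, p))) < tol


# Example 1: X = d/dx + y d/dz, Y = d/dy. Here [X, Y] = -d/dz.
def contact_X(p):
    return [1.0, 0.0, p[1]]

def contact_Y(p):
    return [0.0, 1.0, 0.0]

# Example 2: X = d/dx + y d/dz, Y = d/dy + x d/dz. Both are tangent to the
# level sets of z - x*y, and [X, Y] = 0.
def level_X(p):
    return [1.0, 0.0, p[1]]

def level_Y(p):
    return [0.0, 1.0, p[0]]


if __name__ == "__main__":
    p = [0.3, -1.2, 2.0]
    print(bracket_in_span(contact_X, contact_Y, p))  # False: not involutive
    print(bracket_in_span(level_X, level_Y, p))      # True at this point
```

In the second example the surfaces $z - xy = c$ are integral manifolds, as the Frobenius theorem predicts; in the first there are none, because the bracket leaves the distribution at every point.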