Question. Do you know a specific example which demonstrates that the tensor product of monoids (as defined below) is not associative?
Let $C$ be the category of algebraic structures of a fixed type, and let us denote by $|~|$ the underlying functor $C \to \mathsf{Set}$. For $M,N \in C$ we have a functor $\mathrm{BiHom}(M,N;-) : C \to \mathsf{Set}$ which sends an object $K \in C$ to the set of bihomomorphisms $M \times N \to K$, i.e. maps $|M| \times |N| \to |K|$ which are homomorphisms in each variable when the other one is fixed. Then one can show as usual that $\mathrm{BiHom}(M,N;-)$ is representable, and we call the universal bihomomorphism $M \times N \to M \otimes N$ the tensor product of $M$ and $N$. This is a straightforward generalization of the well-known case $C=\mathsf{Mod}(R)$ for a commutative ring $R$.
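To make the definition concrete for finite monoids, here is a minimal brute-force sketch in Python. The helper name `is_bihomomorphism` and the encoding of a monoid as a triple `(elements, operation, unit)` are assumptions made for this illustration only; they are not taken from any library or from the paper cited below.

```python
from itertools import product


def is_bihomomorphism(f, M, N, K):
    """Check whether f : |M| x |N| -> |K| is a monoid homomorphism in each
    variable when the other one is fixed.

    Each monoid is given as a triple (elements, operation, unit); this
    encoding is a convention chosen for this sketch.
    """
    m_elems, m_op, m_unit = M
    n_elems, n_op, n_unit = N
    _, k_op, k_unit = K

    # Homomorphism in the first variable, for each fixed n.
    for n in n_elems:
        if f(m_unit, n) != k_unit:
            return False
        for a, b in product(m_elems, repeat=2):
            if f(m_op(a, b), n) != k_op(f(a, n), f(b, n)):
                return False

    # Homomorphism in the second variable, for each fixed m.
    for m in m_elems:
        if f(m, n_unit) != k_unit:
            return False
        for a, b in product(n_elems, repeat=2):
            if f(m, n_op(a, b)) != k_op(f(m, a), f(m, b)):
                return False

    return True


# The additive monoid Z/4.
Z4 = (range(4), lambda a, b: (a + b) % 4, 0)

times = lambda a, b: (a * b) % 4  # multiplication is additive in each variable
plus = lambda a, b: (a + b) % 4   # addition is not: plus(0, 1) = 1 is not the unit

print(is_bihomomorphism(times, Z4, Z4, Z4))  # True
print(is_bihomomorphism(plus, Z4, Z4, Z4))   # False
```

The check is exhaustive, so it is only practical for small examples, but it can help when testing candidate bihomomorphisms by hand.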
Actually, this is a special case of a more general tensor product in concrete categories, studied in the paper "Tensor products and bimorphisms", Canad. Math. Bull. 19 (1976) 385-401, by B. Banaschewski and E. Nelson.
Here are some examples: For $C=\mathsf{Set}$, the tensor product equals the usual cartesian product. This is also true for $C=\mathsf{Set}_*$. For $C=\mathsf{Grp}$, we get $G \otimes H \cong G^{\mathsf{ab}} \otimes_{\mathbb{Z}} H^{\mathsf{ab}}$, using the Eckmann-Hilton argument. (This differs from the "tensor product of groups" studied in the literature.) The case $C=\mathsf{CMon}$ is very similar to the well-known case $C=\mathsf{Ab}$ and is spelled out here; namely, we have internal homs and therefore a hom-tensor adjunction. The same is true for $C=\mathsf{Mod}(\Lambda)$ for a commutative algebraic monad $\Lambda$, see here, Section 5.3.
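For the case $C=\mathsf{Grp}$, the Eckmann-Hilton step can be sketched as follows (a short verification, with $f : G \times H \to K$ denoting an arbitrary bihomomorphism). Expanding $f(gg',hh')$ first in the first variable and then in the second, and then in the opposite order, gives

$$f(g,h)\, f(g,h')\, f(g',h)\, f(g',h') = f(gg',hh') = f(g,h)\, f(g',h)\, f(g,h')\, f(g',h').$$

In a group we may cancel $f(g,h)$ on the left and $f(g',h')$ on the right, which leaves

$$f(g,h')\, f(g',h) = f(g',h)\, f(g,h').$$

Since $g,g',h,h'$ are arbitrary, any two elements in the image of $f$ commute, so the image generates an abelian subgroup of $K$, and $f$ factors through a $\mathbb{Z}$-bilinear map on $G^{\mathsf{ab}} \times H^{\mathsf{ab}}$. In a monoid the cancellation step is not available, so only the four-term identity survives.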
Note that the tensor product is commutative, and that it commutes with filtered colimits in each variable. However, the case $C=\mathsf{Grp}$ shows that it does not have to commute with coproducts. In particular, tensoring with a fixed object is not a left adjoint in general. Also, the free object on one generator is not a unit in general:
Let us consider $C=\mathsf{Mon}$. Then we have
$\mathbb{N} \otimes M = M / \{ (mn)^p = m^p n^p \}_{m,n \in M, p \in \mathbb{N}}$.
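To see where these relations come from, here is a short verification (the names $f$ and $g$ are introduced only for this sketch). Let $f : \mathbb{N} \times M \to K$ be a bihomomorphism and put $g := f(1,-) : M \to K$, which is a monoid homomorphism. Being a homomorphism in the first variable means

$$f(p,m) = f(1,m)^p = g(m)^p,$$

so $f$ is determined by $g$. Being a homomorphism in the second variable for each fixed $p$ then says

$$g(mn)^p = f(p,mn) = f(p,m)\, f(p,n) = g(m)^p\, g(n)^p .$$

Hence bihomomorphisms $\mathbb{N} \times M \to K$ correspond to homomorphisms $g : M \to K$ which respect the relations $(mn)^p = m^p n^p$, i.e. to homomorphisms out of the quotient. For instance, if $M$ is the free monoid on two generators $x,y$, then $(xy)^2 \neq x^2 y^2$ in $M$, so the canonical map $M \to \mathbb{N} \otimes M$ is not injective.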
The usual proof of the associativity of the tensor product breaks down: There is a map $\beta : M \times (N \otimes K) \to (M \otimes N) \otimes K$ mapping $(m, n \otimes k) \mapsto (m \otimes n) \otimes k$, which is a homomorphism in the second variable. But what about the first variable? The equation $\beta(mm',t) = \beta(m,t) \beta(m',t)$ is clear if $t \in N \otimes K$ is a pure tensor. But for $t=(n \otimes k) (n' \otimes k')$ we end up with the unlikely equation
$((m \otimes n) \otimes k) ((m' \otimes n) \otimes k) ((m \otimes n') \otimes k') ((m' \otimes n') \otimes k')$ $=((m \otimes n) \otimes k) ((m \otimes n') \otimes k') ((m' \otimes n) \otimes k) ((m' \otimes n') \otimes k')$
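For reference, the two sides of this equation arise as follows (a routine expansion, using only that $\beta$ is a homomorphism in the second variable and that $\otimes$ is a bihomomorphism). The left-hand side is $\beta(mm',t)$:

$$\beta(mm',t) = ((mm' \otimes n) \otimes k)\,((mm' \otimes n') \otimes k') = ((m \otimes n)(m' \otimes n) \otimes k)\,((m \otimes n')(m' \otimes n') \otimes k'),$$

which expands to the product of the four factors in the order $(m,n,k)$, $(m',n,k)$, $(m,n',k')$, $(m',n',k')$. The right-hand side is $\beta(m,t)\,\beta(m',t)$:

$$\beta(m,t)\,\beta(m',t) = ((m \otimes n) \otimes k)\,((m \otimes n') \otimes k') \cdot ((m' \otimes n) \otimes k)\,((m' \otimes n') \otimes k').$$

The two products differ exactly by swapping the two middle factors $(m' \otimes n) \otimes k$ and $(m \otimes n') \otimes k'$, so the equation holds whenever these two elements commute in $(M \otimes N) \otimes K$.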