I will refer you to this great informal introduction, which gives a few finite examples and explains the intuition behind the definitions.
However, $H_\infty^\varepsilon$ is not mentioned there. The effect of smoothing is similar to that on $H_0^\varepsilon$, but in reverse: smoothing continuously "steals" probability from the most probable states until a total of $\varepsilon$ probability has been stolen. If there are $n$ maximum-probability states and all other states are somewhat less probable, the maximum probability drops by $\varepsilon/n$. So a single small peak of probability gets smoothed off, while the effect of smoothing is negligible in two notable cases: when there are many maximum-probability (or close to maximum-probability) states, and when the maximum probability is much greater than $\varepsilon$.
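This "stealing" picture can be written as a short water-filling computation (a sketch for finite distributions; `smooth_min_entropy` is my own helper name, logs are base 2, and I assume the stolen mass can be parked on less probable states without creating a new maximum):

```python
import math

def smooth_min_entropy(probs, eps):
    """H_inf^eps: lower a 'water level' t from the top of the
    distribution, stealing the mass above t from the most probable
    states, until exactly eps has been stolen; return -log2(t)."""
    p = sorted(probs, reverse=True)
    for k in range(len(p)):
        # Candidate: level the top k+1 states down to the next
        # probability (or to 0 if there is no next state).
        nxt = p[k + 1] if k + 1 < len(p) else 0.0
        can_steal = sum(p[:k + 1]) - (k + 1) * nxt  # mass above nxt
        if can_steal >= eps:
            # Level t solves sum(p[:k+1]) - (k+1) * t == eps.
            t = (sum(p[:k + 1]) - eps) / (k + 1)
            return -math.log2(t)
    return float("inf")  # eps is at least the total mass

coin = [0.5, 0.5]
print(smooth_min_entropy(coin, 0.1))  # -log2((1 - 0.1)/2), about 1.152 bits
```

With a single peak, e.g. `[0.5, 0.25, 0.25]` and `eps = 0.1`, all of $\varepsilon$ is stolen from the one maximal state, giving $-\log_2 0.4$.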
Unlike $H_0^\varepsilon$, which can differ radically from $H_0$, $H_\infty^\varepsilon$ is continuous in $\varepsilon$ in the following sense: $0\le\exp(-H_\infty(X))-\exp(-H_\infty^\varepsilon(X))\le \varepsilon$ (more generally, $\exp(-H_\infty^\varepsilon(X))$ is a decreasing 1-Lipschitz function of $\varepsilon$).
Finally, for a fair coin:
$$H_\infty^\varepsilon(X)=-\log \frac{1-\varepsilon}{2},\qquad H_0^\varepsilon(X)=\begin{cases}\log 2&\varepsilon<1/2\\0&\varepsilon\ge 1/2.\end{cases}$$
Here $H_\infty^\varepsilon$ steals $\varepsilon/2$ from each state, since both have maximal probability. $H_0^\varepsilon$ tries to steal all of one minimum-probability state's mass in order to reduce the number of states: this succeeds only if $\varepsilon\ge 1/2$.
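As a sanity check, the $H_0^\varepsilon$ case split can be reproduced by the "drop the cheapest states" procedure (a sketch; `smooth_max_entropy` is my own helper name, and I use base-2 logarithms so the fair coin has one bit of entropy):

```python
import math

def smooth_max_entropy(probs, eps):
    """H_0^eps: drop the least probable states while the total
    probability removed stays within eps, then return log2 of the
    number of surviving states."""
    p = sorted(probs)
    removed, dropped = 0.0, 0
    for q in p[:-1]:            # always keep at least one state
        if removed + q <= eps:
            removed += q        # this state's whole mass can be stolen
            dropped += 1
        else:
            break
    return math.log2(len(p) - dropped)

coin = [0.5, 0.5]
print(smooth_max_entropy(coin, 0.4))  # eps < 1/2: log 2 = 1 bit
print(smooth_max_entropy(coin, 0.5))  # eps >= 1/2: 0 bits
```

For the coin, no state can be dropped until $\varepsilon$ reaches $1/2$, which is exactly the discontinuity in the case split above.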