Suppose I have a discrete probability distribution $P$ on $n$ points ($n$ fixed), and I approximate it by sampling from it $m$ times to get the empirical estimate $P_m$, where $P_m(i)$ is the number of times point $i$ was observed divided by $m$.
How fast can I guarantee that, with high probability, $P_m \rightarrow P$ in $L^1$ and/or $L^\infty$? That is, if I want $\|P_m - P\| < \epsilon$ with high probability, how large must $m$ be as a function of $\epsilon$?
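
To make the setup concrete, here is a minimal simulation sketch in Python (standard library only). The function names, the example distribution `p`, the sample sizes, and the seed are all illustrative assumptions of mine. It only measures the two errors for a few values of $m$; it does not establish any rate.

```python
import random
from collections import Counter


def empirical_distribution(p, m, rng):
    """Draw m samples from the distribution p (a list of n probabilities)
    and return P_m: the observed frequency of each point."""
    n = len(p)
    samples = rng.choices(range(n), weights=p, k=m)
    counts = Counter(samples)
    return [counts[i] / m for i in range(n)]


def l1_error(p, q):
    """L^1 distance: sum of absolute differences over the n points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def linf_error(p, q):
    """L^infinity distance: largest absolute difference over the n points."""
    return max(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    rng = random.Random(0)        # fixed seed, chosen arbitrarily
    p = [0.5, 0.25, 0.15, 0.10]   # hypothetical example with n = 4
    for m in (100, 1_000, 10_000, 100_000):
        p_m = empirical_distribution(p, m, rng)
        print(m, l1_error(p, p_m), linf_error(p, p_m))
```

Running this for increasing $m$ shows how both errors behave on one example, which is the quantity I would like a guaranteed bound for.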