Let $X_1,X_2,\dots$ be i.i.d. samples drawn from a discrete space $\mathcal{X}$ according to a probability distribution $P$, and let $\hat{P}_n$ denote the empirical distribution of the first $n$ samples. Also let $Q$ be an arbitrary distribution on $\mathcal{X}$. Writing $KL$ for the KL-divergence, it is clear that
$KL( \hat{P}_n || Q) \stackrel{n\rightarrow \infty}{\longrightarrow} KL(P || Q)$,
but I am wondering whether any quantitative rate of convergence is known for it. In particular, I would like to know whether it can be shown that
$\Pr\Big[ | KL( \hat{P}_n || Q) - KL(P || Q) | \geq \delta\Big] \leq f(\delta, n, |\mathcal{X}|)$,
and, if so, what the best known expression for the right-hand side is.
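To make the quantity concrete, here is a minimal Monte Carlo sketch that estimates the probability on the left-hand side for a given $n$ and $\delta$. It is only an illustration, not a bound: it assumes $\mathcal{X}$ is finite (identified with $\{0,\dots,k-1\}$) and that $Q$ puts positive mass on every point, and the function names `kl_divergence`, `empirical_distribution` and `deviation_probability` are hypothetical helpers of my own.

```python
import math
import random
from collections import Counter


def kl_divergence(p, q):
    """KL(p || q) in nats, with the convention 0 * log 0 = 0.

    Assumes q[i] > 0 wherever p[i] > 0 (otherwise this raises).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def empirical_distribution(p, n, rng):
    """Draw n i.i.d. samples from p and return their relative frequencies."""
    counts = Counter(rng.choices(range(len(p)), weights=p, k=n))
    return [counts[i] / n for i in range(len(p))]


def deviation_probability(p, q, n, delta, trials=500, seed=0):
    """Monte Carlo estimate of Pr[|KL(P_hat_n || Q) - KL(P || Q)| >= delta]."""
    rng = random.Random(seed)
    target = kl_divergence(p, q)
    hits = 0
    for _ in range(trials):
        p_hat = empirical_distribution(p, n, rng)
        if abs(kl_divergence(p_hat, q) - target) >= delta:
            hits += 1
    return hits / trials
```

For example, `deviation_probability([0.5, 0.3, 0.2], [0.2, 0.3, 0.5], n=50, delta=0.1)` gives an estimate for a three-point alphabet; repeating the call for several values of `n` gives a rough picture of how fast the probability decays, which is what a bound $f(\delta, n, |\mathcal{X}|)$ would have to dominate.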
Thanks a lot!