I can see that, using the law of large numbers and perhaps mild conditions on the likelihood function, one can show that the empirical Fisher information matrix converges uniformly to the true Fisher information matrix. I would like to know how fast this uniform convergence is in terms of the number of observations.
It seems that this problem has not been studied much and the literature is scarce. I tried to apply some results from statistical learning theory, such as analyses based on Rademacher complexity, but those arguments seem to yield large deviation bounds only for the empirical process (i.e., the average negative log-likelihood function) and do not seem to extend to its Hessian (i.e., the empirical Fisher information matrix).
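One way to make the objects in question precise is sketched below. The notation ($x_i$, $\theta^\star$, $\Theta$, $\hat L_n$, $\hat I_n$, $I$) is assumed here for illustration only, and the expectation is assumed to be taken under the data-generating distribution $p(\cdot;\theta^\star)$ of the i.i.d. observations $x_1,\dots,x_n$:

$$\hat L_n(\theta) = -\frac{1}{n}\sum_{i=1}^{n}\log p(x_i;\theta), \qquad \hat I_n(\theta) = \nabla_\theta^2\,\hat L_n(\theta), \qquad I(\theta) = \mathbb{E}\left[\hat I_n(\theta)\right].$$

Under the usual regularity conditions, $I(\theta^\star)$ is the Fisher information matrix. The quantity whose decay in $n$ is being asked about is then

$$\Pr\left(\sup_{\theta\in\Theta}\left\|\hat I_n(\theta)-I(\theta)\right\| > \epsilon\right),$$

where $\Theta$ is the parameter set over which uniformity is required and $\|\cdot\|$ is a matrix norm, for instance the operator norm.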
I am not merely looking for asymptotic behavior, which can be described by the CLT; rather, I would like large deviation-type inequalities for the Fisher information matrix. In particular, I would like to know whether, and how fast, the probability that the deviation between the spectrum of the empirical information matrix and that of the true information matrix exceeds some $\epsilon>0$ goes to zero as a function of the sample size. In other words, I'm interested in concentration inequality-type results.
EDIT: The paragraph above was added after the answer posted by -did, because I didn't have permission to add comments at that time and I wasn't aware of the editing rules. I accepted the answer by -did, but I think it addresses my problem only partially.
