
I was hoping someone could help me with a brief statement I can't understand in a book.

The problem I have is with the final line of the following section of Lemma 2.2 (on the second page):

Since $|\mathcal{T}_j|$ is binomially distributed with expectation $\mu = n(\log{n})^2 2^{-\sqrt{\log{n}}}$, by the standard estimates, we have that the probability that $\mathcal{T}_j$ has more than $2\mu$ elements is at most $e^{-\mu/3} < n^{-2}$. Then, with probability at least $1 - \frac{\log{n}}{n^2}$, the sum of the sizes of these sets is at most $n(\log{n})^3 2^{-\sqrt{\log{n}}}$.

Why is this?

1 Answer

Wikipedia is your friend. In general, when a paper mentions using technique X, if you are not aware of technique X, then look it up. It will be impossible to fill the gap without knowing about X.

In the case at hand, X is the Chernoff bound (also known as Hoeffding's inequality, among other names). It's indeed pretty standard, so it's worth getting to know. It's a theorem of the "concentration of measure" type: if you average many well-behaved random variables, you get something which is concentrated roughly as the central limit theorem suggests. The central limit theorem itself doesn't say anything about the speed of convergence to a normal distribution, so to get an actual "large deviation bound" you use something like Chernoff. When you need more refined results, you use Berry–Esseen (q.v.).
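To connect this back to the quoted passage, here is a sketch of how the Chernoff bound yields the book's claim. I am assuming, as the lemma's phrasing suggests, that there are roughly $\log n$ sets $\mathcal{T}_j$, and I write $\mu = n(\log n)^2 2^{-\sqrt{\log n}}$ for the common expectation; the exact count of sets comes from the paper itself.

```latex
% Multiplicative Chernoff bound: if X is a sum of independent 0/1
% random variables with mean \mu, then for 0 < \delta \le 1,
%   \Pr[X \ge (1+\delta)\mu] \le e^{-\delta^2 \mu / 3}.
% Taking \delta = 1 gives exactly the form used in the lemma:
\Pr\bigl[\,|\mathcal{T}_j| > 2\mu\,\bigr] \le e^{-\mu/3}.
% Since \mu = n(\log n)^2 2^{-\sqrt{\log n}} \gg 6\log n for large n,
e^{-\mu/3} < e^{-2\log n} = n^{-2}.
% A union bound over the (roughly \log n) indices j then gives
\Pr\bigl[\,\exists j : |\mathcal{T}_j| > 2\mu\,\bigr] \le \frac{\log n}{n^2},
% so with probability at least 1 - \frac{\log n}{n^2} every set has
% size at most 2\mu, and the sum of the sizes is at most
2\mu\log n = 2\,n(\log n)^3\,2^{-\sqrt{\log n}}.
```

This matches the stated bound $n(\log n)^3 2^{-\sqrt{\log n}}$ up to the constant factor, which the paper's choice of parameters presumably absorbs.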

  • There are versions of the central limit theorem that do give bounds on how close the average is to the normal distribution. The first one I ever saw was in Uspensky's book on probability, and it was really complicated. If you do a Google search for "central limit theorem with explicit bounds", you will find many useful links. (2012-04-03)