
If $f : \mathbb{R}^{n} \longrightarrow \mathbb{R}$ is $L$-Lipschitz (w.r.t. the $\|\cdot\|_2$ norm), it is a fact that if $x \sim N(0, I)$, then $$ \mathbb{P}( f(x) - \mathbb{E}f(x) \geq t) \leq e^{-t^2/(2L^2)} \:. $$
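(As a quick numerical sanity check of this fact — my own illustration, not part of the question — one can Monte Carlo the tail for the 1-Lipschitz function $f(x) = \max_i x_i$, which is $1$-Lipschitz w.r.t. $\|\cdot\|_2$:)

```python
import math
import random

random.seed(0)
n, N = 5, 100_000
L = 1.0  # f(x) = max_i x_i is 1-Lipschitz w.r.t. the Euclidean norm

# Draw N samples of f(x) with x ~ N(0, I_n).
samples = [max(random.gauss(0.0, 1.0) for _ in range(n)) for _ in range(N)]
mean_f = sum(samples) / N  # Monte Carlo estimate of E f(x)

for t in (0.5, 1.0, 1.5, 2.0):
    empirical = sum(1 for s in samples if s - mean_f >= t) / N
    bound = math.exp(-t**2 / (2 * L**2))
    print(f"t={t}: P(f(x) - Ef >= t) ~ {empirical:.4f} <= {bound:.4f}")
```

The empirical tail sits comfortably below $e^{-t^2/2}$ for each $t$.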

I'm reading the proof of this fact (Theorem 4.3 of https://galton.uchicago.edu/~lalley/Courses/386/Concentration.pdf) using the Gaussian log-Sobolev inequality, and I'm a bit confused about the importance of $\mathbb{E} f(x)$ in the proof. Specifically, I'm wondering why the same proof does not imply the (obviously incorrect) fact that for any $M \in \mathbb{R}$, $$ \mathbb{P}( f(x) - M \geq t) \leq e^{-t^2/(2L^2)} \:. \:\:\: (*) $$

I'm going to post my logic here, and hopefully somebody can point out the flaw in the argument.

Fix an $M \in \mathbb{R}$ and an $f$ which is $L$-Lipschitz and differentiable, and define $g(x) := f(x) - M$. Since $f$ is $L$-Lipschitz, so is $g$. Suppose we want to show $(*)$. By a standard Chernoff argument, it suffices to show that there exist finite positive constants $C, A$ such that the MGF satisfies $\mathbb{E} e^{\lambda g} \leq C e^{A \lambda^2}$ for all $\lambda > 0$ (cf. Lemma 2.3 of Lalley's notes).
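(For concreteness, the Chernoff step being invoked is the usual one: for any $\lambda > 0$, Markov's inequality applied to $e^{\lambda g}$ gives

```latex
\[
\mathbb{P}(g(x) \geq t)
  = \mathbb{P}\bigl(e^{\lambda g(x)} \geq e^{\lambda t}\bigr)
  \leq e^{-\lambda t}\, \mathbb{E}\, e^{\lambda g}
  \leq C\, e^{A \lambda^2 - \lambda t} \:,
\]
```

and minimizing the exponent at $\lambda = t/(2A)$ yields $\mathbb{P}(g(x) \geq t) \leq C e^{-t^2/(4A)}$; with $C = 1$ and $A = L^2/2$ this recovers the stated tail $e^{-t^2/(2L^2)}$.)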

I will now repeat the argument of Theorem 4.3. Define $h = e^{\lambda g/2}$. By the Gaussian log Sobolev inequality, $$ \mathrm{Ent}(h^2) \leq 2 \mathbb{E} \|\nabla h \|^2_2 = 2 \mathbb{E}\| \nabla e^{\lambda g/2} \|_2^2 = \frac{\lambda^2}{2} \mathbb{E} e^{\lambda g} \| \nabla g \|_2^2 \leq \frac{\lambda^2 L^2}{2} \mathbb{E} e^{\lambda g} \:. $$

Now, define the function $F(\lambda) := \mathbb{E} e^{\lambda g}$. We have that $F'(\lambda) = \mathbb{E} g e^{\lambda g}$. Hence, $$ \mathrm{Ent}(h^2) = \mathbb{E} \lambda g e^{\lambda g} - (\mathbb{E} e^{\lambda g}) \log(\mathbb{E} e^{\lambda g}) = \lambda F'(\lambda) - F(\lambda) \log(F(\lambda)) \leq \frac{\lambda^2 L^2}{2} F(\lambda) \:. $$ But this inequality is the same differential inequality which shows up in the standard proof, from which we conclude that $F(\lambda) \leq e^{L^2 \lambda^2 / 8}$. Notice how $M$ does not enter into these constants.
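(To make the absurdity concrete, here is a quick check — my own example — that an $M$-independent bound $F(\lambda) \leq e^{L^2\lambda^2/8}$ cannot hold: take $f(x) = x_1$, so $L = 1$, and $M = -1$, for which the Gaussian MGF gives $F(\lambda)$ in closed form.)

```python
import math

# f(x) = x_1 is 1-Lipschitz; with M = -1, g(x) = x_1 + 1, and the Gaussian MGF
# gives the closed form F(lambda) = E exp(lambda*g) = exp(lambda + lambda^2/2).
for lam in (0.5, 1.0, 2.0):
    F = math.exp(lam + lam**2 / 2)
    claimed = math.exp(lam**2 / 8)  # the purported M-independent bound
    # prints "bound holds: False" for every lambda
    print(f"lambda={lam}: F={F:.3f} vs claimed bound {claimed:.3f} -> bound holds: {F <= claimed}")
```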

But then we have just constructed a bound on the MGF for $g(x)$, and can apply the Chernoff argument, yielding $(*)$. Where does this argument break down?

1 Answer


The value of $M$ changes the value of $F'(0) = E[g(X)]$, and $F'(0)$ enters as the initial condition when solving the differential inequality.

Indeed, solving the differential inequality \[ \lambda F'(\lambda) - F(\lambda) \log F(\lambda) \le \lambda^2 L^2 F(\lambda)/2 \] can be done as follows (the argument is taken from page 122 of Concentration Inequalities: A Nonasymptotic Theory of Independence by Stéphane Boucheron, Gábor Lugosi, and Pascal Massart).

Set $H(\lambda) = (1/\lambda) \log F(\lambda)$. Then \[ H'(\lambda) = \frac{F'(\lambda)}{\lambda F(\lambda)} - \frac{\log F(\lambda)}{\lambda^2} \le L^2/2. \] To integrate this we need the initial condition at $\lambda = 0$, and this is where the expectation, i.e. the value $M$ in your notation, kicks in: since $F(0) = 1$, \[ H(\lambda) = \frac{\log F(\lambda)}{\lambda} = \frac{\lambda F'(0)/F(0) + o(\lambda)}{\lambda} \to_{\lambda\to 0} \frac{F'(0)}{F(0)} = E[g(X)]. \] By integration, $(1/\lambda) \log F(\lambda) = H(\lambda) \le \lim_{\mu \to 0^+} H(\mu) + \lambda L^2/2 = E[g(X)] + \lambda L^2/2$, hence $F(\lambda) \le \exp(\lambda E[g(X)] + \lambda^2 L^2/2)$. Since $E[g(X)] = E[f(X)] - M$, the MGF bound does depend on $M$ after all, and the Chernoff argument only yields the sub-Gaussian tail for $f(x) - E[f(X)]$, i.e. $(*)$ with $M = E[f(X)]$.
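(A small numerical illustration of this point, under my own choice of example: for $g(x) = x_1 - M$, so $L = 1$ and $E[g(X)] = -M$, the Gaussian MGF gives the closed form $F(\lambda) = e^{-\lambda M + \lambda^2/2}$, hence $H(\lambda) = -M + \lambda/2$, whose limit at $0$ is exactly $E[g(X)]$.)

```python
import math

# For g(x) = x_1 - M (so L = 1 and E[g(X)] = -M), the standard-normal MGF
# gives the closed form F(lambda) = exp(-lambda*M + lambda^2/2), hence
# H(lambda) = log F(lambda) / lambda = -M + lambda/2.
def H(lam, M):
    return (-lam * M + lam**2 / 2) / lam

for M in (-3.0, 0.0, 2.0):
    # The initial condition of the differential inequality depends on M:
    print(f"M={M}: H(1e-9) = {H(1e-9, M):.6f}  (E[g(X)] = {-M})")
```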