
The Kullback-Leibler divergence between two distributions with pdfs $f(x)$ and $g(x)$ is defined by $\mathrm{KL}(F;G) = \int_{-\infty}^{\infty} \ln \left(\frac{f(x)}{g(x)}\right)f(x)\,dx$.

Compute the Kullback-Leibler divergence when $F$ is the standard normal distribution and $G$ is the normal distribution with mean $\mu$ and variance $1$. For what value of $\mu$ is the divergence minimized?

I was never taught this kind of divergence, so I am a bit lost on how to evaluate this integral. I see that I can simplify the ratio of the two normal densities inside the natural log, but my guess is that I should wait until after I take the integral. Any help is appreciated.

  • *Leibler*. // For what value of *what* is the divergence minimized? (2012-12-13)

3 Answers


OK, I've been searching for ages for the Kullback-Leibler divergence between two Normal distributions and hadn't found it, but RS's answer enabled me to calculate it quite simply. Here's my derivation. I've also derived it for Beta distributions.

Kullback-Leibler divergence of Normal Distributions

Suppose we have two Normal distributions $F\sim N(\mu_{f},\sigma_{f}^{2})$ and $G\sim N(\mu_{g},\sigma_{g}^{2})$. The Kullback-Leibler divergence is defined as:

$ \mathrm{KL}(F||G)=\int f(x)\ln\left(\frac{f(x)}{g(x)}\right)dx\quad\mathrm{nats} $

Divide by $\ln(2)$ to get the answer in bits. The Gaussian PDF is:

$ f(x)=\frac{1}{\sigma_{f}\sqrt{2\pi}}\, e^{\dfrac{-(x-\mu_{f})^{2}}{2\sigma_{f}^{2}}} $

Substituting, we get

\begin{eqnarray*} \mathrm{KL}(F||G) & = & \int f(x)\ln\left(\frac{e^{\frac{-(x-\mu_{f})^{2}}{2\sigma_{f}^{2}}}}{\sigma_{f}\sqrt{2\pi}}\frac{\sigma_{g}\sqrt{2\pi}}{e^{\frac{-(x-\mu_{g})^{2}}{2\sigma_{g}^{2}}}}\right)dx\\ & = & \int f(x)\ln\left(\frac{e^{\frac{-(x-\mu_{f})^{2}}{2\sigma_{f}^{2}}}}{\sigma_{f}}\frac{\sigma_{g}}{e^{\frac{-(x-\mu_{g})^{2}}{2\sigma_{g}^{2}}}}\right)dx\\ & = & \int f(x)\left[\ln\left(\frac{e^{\frac{-(x-\mu_{f})^{2}}{2\sigma_{f}^{2}}}}{e^{\frac{-(x-\mu_{g})^{2}}{2\sigma_{g}^{2}}}}\right)+\ln\left(\frac{\sigma_{g}}{\sigma_{f}}\right)\right]dx\\ & = & \int f(x)\left[\frac{-(x-\mu_{f})^{2}}{2\sigma_{f}^{2}}-\frac{-(x-\mu_{g})^{2}}{2\sigma_{g}^{2}}+\ln\left(\frac{\sigma_{g}}{\sigma_{f}}\right)\right]dx \end{eqnarray*}

Then, via a tedious and error-prone but straightforward expansion, we get

$ =\left[\ln\left(\frac{\sigma_{g}}{\sigma_{f}}\right)+\frac{-\mu_{f}^{2}}{2\sigma_{f}^{2}}-\frac{-\mu_{g}^{2}}{2\sigma_{g}^{2}}\right]\int f(x)dx+\left[\frac{2\mu_{f}}{2\sigma_{f}^{2}}-\frac{2\mu_{g}}{2\sigma_{g}^{2}}\right]\int x\, f(x)dx+\left[\frac{-1}{2\sigma_{f}^{2}}-\frac{-1}{2\sigma_{g}^{2}}\right]\int x^{2}f(x)dx $

Then we have the following properties:

\begin{eqnarray*} \int f(x)dx & = & 1\\ \int x\, f(x)dx & = & \mu_{f}\\ \int x^{2}f(x)dx & = & \mu_{f}^{2}+\sigma_{f}^{2} \end{eqnarray*}
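(As an aside, these standard Gaussian moment identities can be sanity-checked symbolically. A minimal sketch in Python with sympy, which is my choice of tool; the numerical check mentioned further down was done in Matlab:)

```python
# Symbolic check of the three Gaussian moment identities used above.
import sympy as sp

x, mu = sp.symbols('x mu', real=True)
sigma = sp.symbols('sigma', positive=True)

# Gaussian pdf with mean mu and standard deviation sigma.
f = sp.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sp.sqrt(2 * sp.pi))

print(sp.integrate(f, (x, -sp.oo, sp.oo)))                      # 1
print(sp.integrate(x * f, (x, -sp.oo, sp.oo)))                  # mu
print(sp.simplify(sp.integrate(x**2 * f, (x, -sp.oo, sp.oo))))  # mu**2 + sigma**2
```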

Applying these to the expansion above gives:

\begin{eqnarray*} \mathrm{KL}(F||G) & = & \left[\ln\left(\frac{\sigma_{g}}{\sigma_{f}}\right)+\frac{-\mu_{f}^{2}}{2\sigma_{f}^{2}}-\frac{-\mu_{g}^{2}}{2\sigma_{g}^{2}}\right]+\left[\frac{2\mu_{f}}{2\sigma_{f}^{2}}-\frac{2\mu_{g}}{2\sigma_{g}^{2}}\right]\mu_{f}+\left[\frac{-1}{2\sigma_{f}^{2}}-\frac{-1}{2\sigma_{g}^{2}}\right]\left(\mu_{f}^{2}+\sigma_{f}^{2}\right)\\ & = & \ln\left(\frac{\sigma_{g}}{\sigma_{f}}\right)+\frac{-\mu_{f}^{2}}{2\sigma_{f}^{2}}+\frac{\mu_{g}^{2}}{2\sigma_{g}^{2}}+\frac{2\mu_{f}^{2}}{2\sigma_{f}^{2}}+\frac{-2\mu_{g}\mu_{f}}{2\sigma_{g}^{2}}+\frac{-\mu_{f}^{2}-\sigma_{f}^{2}}{2\sigma_{f}^{2}}+\frac{\mu_{f}^{2}+\sigma_{f}^{2}}{2\sigma_{g}^{2}}\\ & = & \ln\left(\frac{\sigma_{g}}{\sigma_{f}}\right)+\frac{\mu_{g}^{2}-2\mu_{g}\mu_{f}+\mu_{f}^{2}+\sigma_{f}^{2}}{2\sigma_{g}^{2}}+\frac{-\mu_{f}^{2}+2\mu_{f}^{2}-\mu_{f}^{2}-\sigma_{f}^{2}}{2\sigma_{f}^{2}}\\ & = & \ln\left(\frac{\sigma_{g}}{\sigma_{f}}\right)+\frac{(\mu_{f}-\mu_{g})^{2}+\sigma_{f}^{2}-\sigma_{g}^{2}}{2\sigma_{g}^{2}} \end{eqnarray*}

I verified this numerically in Matlab, after fixing many sign errors! If anyone wants to double-check the Beta derivation below, I'd be grateful!
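For anyone who wants to repeat the numerical check in Python rather than Matlab, here is a minimal sketch (it assumes numpy and scipy are available; the function names are mine):

```python
# Check the closed-form Normal KL against direct quadrature of the integral.
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def kl_normal_closed(mu_f, sigma_f, mu_g, sigma_g):
    """Closed-form KL(F||G) in nats, as derived above."""
    return (np.log(sigma_g / sigma_f)
            + ((mu_f - mu_g)**2 + sigma_f**2 - sigma_g**2) / (2 * sigma_g**2))

def kl_normal_numeric(mu_f, sigma_f, mu_g, sigma_g):
    """Direct quadrature of the integral of f(x) ln(f(x)/g(x))."""
    lf = lambda x: norm.logpdf(x, mu_f, sigma_f)
    lg = lambda x: norm.logpdf(x, mu_g, sigma_g)
    # Work in log space so the tails underflow to 0 instead of producing NaNs.
    integrand = lambda x: np.exp(lf(x)) * (lf(x) - lg(x))
    return quad(integrand, -np.inf, np.inf)[0]

print(kl_normal_closed(0.3, 1.2, -0.5, 0.8))   # ~0.7195
print(kl_normal_numeric(0.3, 1.2, -0.5, 0.8))  # should agree to quad's tolerance
```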

Kullback-Leibler divergence of Beta Distributions

Suppose we have two Beta distributions $F\sim\mathrm{Beta}(\alpha_{f},\beta_{f})$ and $G\sim\mathrm{Beta}(\alpha_{g},\beta_{g})$. The Kullback-Leibler divergence is defined as:

$\mathrm{KL}(F||G)=\int f(x)\ln\left(\frac{f(x)}{g(x)}\right)dx\quad\mathrm{nats}$

Divide by $\ln(2)$ to get the answer in bits. The Beta PDF is:

$f(x)=\frac{\Gamma(\alpha_{f}+\beta_{f})}{\Gamma(\alpha_{f})\Gamma(\beta_{f})}x^{\alpha_{f}-1}(1-x)^{\beta_{f}-1}$

where $\Gamma(\cdot)$ is the Gamma function. Substituting gives:

\begin{eqnarray*} \mathrm{KL}(F||G) & = & \int_{0}^{1}f(x)\ln\left(\frac{\frac{\Gamma(\alpha_{f}+\beta_{f})}{\Gamma(\alpha_{f})\Gamma(\beta_{f})}x^{\alpha_{f}-1}(1-x)^{\beta_{f}-1}}{\frac{\Gamma(\alpha_{g}+\beta_{g})}{\Gamma(\alpha_{g})\Gamma(\beta_{g})}x^{\alpha_{g}-1}(1-x)^{\beta_{g}-1}}\right)dx\\ & = & \int_{0}^{1}f(x)\ln\left(\frac{\frac{\Gamma(\alpha_{f}+\beta_{f})}{\Gamma(\alpha_{f})\Gamma(\beta_{f})}x^{\alpha_{f}-\alpha_{g}}(1-x)^{\beta_{f}-\beta_{g}}}{\frac{\Gamma(\alpha_{g}+\beta_{g})}{\Gamma(\alpha_{g})\Gamma(\beta_{g})}}\right)dx\\ & = & \int_{0}^{1}f(x)\ln\left(\frac{\Gamma(\alpha_{f}+\beta_{f})\Gamma(\alpha_{g})\Gamma(\beta_{g})}{\Gamma(\alpha_{g}+\beta_{g})\Gamma(\alpha_{f})\Gamma(\beta_{f})}x^{\alpha_{f}-\alpha_{g}}(1-x)^{\beta_{f}-\beta_{g}}\right)dx\\ & = & \int_{0}^{1}f(x)\left[\ln\frac{\Gamma(\alpha_{f}+\beta_{f})\Gamma(\alpha_{g})\Gamma(\beta_{g})}{\Gamma(\alpha_{g}+\beta_{g})\Gamma(\alpha_{f})\Gamma(\beta_{f})}+\ln\left(x^{\alpha_{f}-\alpha_{g}}\right)+\ln\left((1-x)^{\beta_{f}-\beta_{g}}\right)\right]dx\\ & = & \ln\frac{\Gamma(\alpha_{f}+\beta_{f})\Gamma(\alpha_{g})\Gamma(\beta_{g})}{\Gamma(\alpha_{g}+\beta_{g})\Gamma(\alpha_{f})\Gamma(\beta_{f})}+(\alpha_{f}-\alpha_{g})\int_{0}^{1}f(x)\ln x\, dx+(\beta_{f}-\beta_{g})\int_{0}^{1}f(x)\ln(1-x)dx \end{eqnarray*}

In terms of expectations this is:

$\ln\frac{\Gamma(\alpha_{f}+\beta_{f})\Gamma(\alpha_{g})\Gamma(\beta_{g})}{\Gamma(\alpha_{g}+\beta_{g})\Gamma(\alpha_{f})\Gamma(\beta_{f})}+(\alpha_{f}-\alpha_{g})\mathrm{E}\left(\ln F\right)+(\beta_{f}-\beta_{g})\mathrm{E}\left(\ln(1-F)\right)$

From Wikipedia we have:

$\mathrm{E}(\ln F)=\psi(\alpha_{f})-\psi(\alpha_{f}+\beta_{f})$

where $\psi(x)=\frac{d}{dx}\ln\Gamma(x)=\frac{\Gamma'(x)}{\Gamma(x)}$ is the digamma function (the polygamma function of order zero; psi in Matlab). By swapping variables, it is easy to show that

$\mathrm{E}(\ln(1-F))=\psi(\beta_{f})-\psi(\alpha_{f}+\beta_{f})$

Therefore the final solution is

$\mathrm{KL}(F||G)=\ln\frac{\Gamma(\alpha_{f}+\beta_{f})\Gamma(\alpha_{g})\Gamma(\beta_{g})}{\Gamma(\alpha_{g}+\beta_{g})\Gamma(\alpha_{f})\Gamma(\beta_{f})}+(\alpha_{f}-\alpha_{g})\left(\psi(\alpha_{f})-\psi(\alpha_{f}+\beta_{f})\right)+(\beta_{f}-\beta_{g})\left(\psi(\beta_{f})-\psi(\alpha_{f}+\beta_{f})\right)$
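The same kind of numerical cross-check works here too (again a sketch in Python/scipy rather than Matlab; `gammaln` and `digamma` are in `scipy.special`, and the function names are mine):

```python
# Check the closed-form Beta KL against direct quadrature on (0, 1).
import numpy as np
from scipy.integrate import quad
from scipy.special import gammaln, digamma
from scipy.stats import beta

def kl_beta_closed(af, bf, ag, bg):
    """Closed-form KL(F||G) in nats, as derived above (log-Gamma for stability)."""
    log_ratio = (gammaln(af + bf) + gammaln(ag) + gammaln(bg)
                 - gammaln(ag + bg) - gammaln(af) - gammaln(bf))
    return (log_ratio
            + (af - ag) * (digamma(af) - digamma(af + bf))
            + (bf - bg) * (digamma(bf) - digamma(af + bf)))

def kl_beta_numeric(af, bf, ag, bg):
    lf = lambda x: beta.logpdf(x, af, bf)
    lg = lambda x: beta.logpdf(x, ag, bg)
    integrand = lambda x: np.exp(lf(x)) * (lf(x) - lg(x))
    # Stay just inside (0, 1) to avoid the -inf log-densities at the endpoints.
    return quad(integrand, 1e-12, 1 - 1e-12)[0]

print(kl_beta_closed(2.0, 5.0, 3.0, 4.0))
print(kl_beta_numeric(2.0, 5.0, 3.0, 4.0))  # should match to quad's tolerance
```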


I cannot comment (not enough reputation).

Vincent: You have the wrong pdf for $g(x)$; you have a normal distribution with mean $1$ and variance $1$, not mean $\mu$.

Hint: You don't need to solve any integrals. You should be able to write everything in terms of pdfs and their expected values, so you never need to integrate.

Outline: First, $\ln\left(\frac{f(x)}{g(x)}\right) = -\frac{1}{2}\left(x^{2} - (x-\mu)^{2}\right)$. Expand and simplify. You don't even need to write out the remaining $f(x)$ explicitly; see where that takes you.
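(For the reader who wants the outline carried through, this just fills in the algebra the hint leaves out: $x^{2}-(x-\mu)^{2}=2\mu x-\mu^{2}$, so $\ln\left(\frac{f(x)}{g(x)}\right)=\frac{\mu^{2}}{2}-\mu x$. Taking the expectation under $f$ gives $\mathrm{KL}(F;G)=\frac{\mu^{2}}{2}-\mu\,\mathrm{E}(X)=\frac{\mu^{2}}{2}$, since $\mathrm{E}(X)=0$ for the standard normal; the divergence is therefore minimized at $\mu=0$.)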

  • Thanks, can't believe I didn't see that! (2014-11-12)

The pdf of the standard normal distribution is $f(x) = \frac{1}{\sqrt{2\pi}}e^{-x^{2}/2}$. Similarly, $g(x) = \frac{1}{\sqrt{2\pi}}e^{-(x-\mu)^{2}/2}$. Therefore, $\ln\left(\frac{f(x)}{g(x)}\right)=\frac{\mu^{2}-2\mu x}{2}$, so $D_{KL}=\int_{-\infty}^{\infty} \frac{\mu^{2}-2\mu x}{2}\cdot\frac{1}{\sqrt{2\pi}}e^{-x^{2}/2}\,dx$. The $x$ term integrates to zero by symmetry, so this is equal to $\frac{\mu^{2}}{2}$.

This can be easily generalized to any two normal distributions with means $\mu_{1}, \mu_{2}$ and variances $\sigma_{1}^{2}, \sigma_{2}^{2}$.
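For reference, the general closed form (this matches the result derived in the first answer above, relabeling the subscripts $f,g$ as $1,2$) is:

$\mathrm{KL}(F||G)=\ln\left(\frac{\sigma_{2}}{\sigma_{1}}\right)+\frac{(\mu_{1}-\mu_{2})^{2}+\sigma_{1}^{2}-\sigma_{2}^{2}}{2\sigma_{2}^{2}}$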

The K-L divergence is obviously minimized at $\mu=0$, where the two distributions are the same!

EDIT: Sorry, I misread the mean as 1 originally - I have corrected it.

  • Yes, they cancel, but you get another $2\pi$ from the $f(x)$. I misread your original question as having mean and variance both equal to $1$, sorry; I've just corrected it. (2012-12-14)