How can one prove that $ \int_{\mathbb{R}^N} \mathcal{N}(\mathbf{y}| \boldsymbol{\mu}_1, K_1) \log \mathcal{N}(\mathbf{y}| \boldsymbol{\mu}_0, K_0) d \mathbf{y} = -\frac12 \left[ N \log 2 \pi + \log |K_0| + \mathrm{tr} (K_0^{-1} K_1) + (\boldsymbol{\mu}_0 - \boldsymbol{\mu}_1)^T K_0^{-1} (\boldsymbol{\mu}_0 - \boldsymbol{\mu}_1) \right]? $ Here $\mathbf{y} \in \mathbb{R}^N$, $|M|$ is the determinant of a matrix $M$, $\mathrm{tr}(M)$ is the trace of $M$, and $ \mathcal{N}(\mathbf{y}| \boldsymbol{\mu}, K) = \frac{1}{(2 \pi)^{\frac{N}{2}} |K|^{\frac12}} \exp \left(-\frac12 (\mathbf{y} - \boldsymbol{\mu})^T K^{-1} (\mathbf{y} - \boldsymbol{\mu})\right). $

In the one-dimensional case the problem is simple. In higher dimensions I can't prove the formula directly, although it seems to be correct (for example, it can be used to compute the Kullback-Leibler divergence between two Gaussian distributions).

A reference to a source containing the equation above would also be sufficient.

1 Answer

The only nontrivial part of this integral is the quadratic term. With the roles of the indices 0 and 1 exchanged relative to the question (which does not affect the argument), it reads $ \int_{\mathbb{R}^N} (\mathbf{y} - \boldsymbol{\mu}_1)^T K_1^{-1} (\mathbf{y} - \boldsymbol{\mu}_1) \frac{1}{(2 \pi)^{\frac{N}{2}} |K_0|^{\frac12}} \exp\left(-\frac12 (\mathbf{y} - \boldsymbol{\mu}_0)^T K_0^{-1} (\mathbf{y} - \boldsymbol{\mu}_0) \right) d\mathbf{y}. $ Substitute $\mathbf{x} = \mathbf{y} - \boldsymbol{\mu}_0$ to get $ \int_{\mathbb{R}^N} (\mathbf{x} - \boldsymbol{\mu}_1 + \boldsymbol{\mu}_0)^T K_1^{-1} (\mathbf{x} - \boldsymbol{\mu}_1 + \boldsymbol{\mu}_0) \frac{1}{(2 \pi)^{\frac{N}{2}} |K_0|^{\frac12}} \exp\left(-\frac12 \mathbf{x}^T K_0^{-1} \mathbf{x} \right) d\mathbf{x}. $ Denote $\boldsymbol{\mu} = \boldsymbol{\mu}_1 - \boldsymbol{\mu}_0$ and note that $ (\mathbf{x} - \boldsymbol{\mu})^T K_1^{-1} (\mathbf{x} - \boldsymbol{\mu}) = \mathbf{x}^T K_1^{-1} \mathbf{x} - 2 \boldsymbol{\mu}^T K_1^{-1} \mathbf{x} + \boldsymbol{\mu}^T K_1^{-1} \boldsymbol{\mu}. $

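The constant term $\boldsymbol{\mu}^T K_1^{-1} \boldsymbol{\mu}$ integrates to itself, since the density integrates to 1. The cross term integrates to zero because its integrand is odd in $\mathbf{x}$: $ \int_{\mathbb{R}^N} - 2\boldsymbol{\mu}^T K_1^{-1} \mathbf{x} \frac{1}{(2 \pi)^{\frac{N}{2}} |K_0|^{\frac12}} \exp\left(-\frac12 \mathbf{x}^T K_0^{-1} \mathbf{x} \right) d\mathbf{x} = 0. $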

Now we need to calculate $ \int_{\mathbb{R}^N} \mathbf{x}^T K_1^{-1} \mathbf{x} \frac{1}{(2 \pi)^{\frac{N}{2}} |K_0|^{\frac12}} \exp\left(-\frac12 \mathbf{x}^T K_0^{-1} \mathbf{x} \right) d\mathbf{x}. $

Write $K_1 = U^T U$, so that $K_1^{-1} = U^{-1} U^{-T}$. Such a factorization exists because $K_1$ must be symmetric positive definite for the distribution to be well defined; taking the Cholesky factor gives $|U| > 0$. Now make another substitution: $ \mathbf{z} = U^{-T} \mathbf{x}, \quad \mathbf{x} = U^T \mathbf{z}, \quad d \mathbf{x} = |U^T| \, d \mathbf{z} = |U| \, d \mathbf{z}. $ Then $\mathbf{x}^T K_1^{-1} \mathbf{x} = \mathbf{z}^T \mathbf{z}$, and the integral becomes $ \begin{aligned} &\int_{\mathbb{R}^N} \mathbf{z}^T \mathbf{z} \frac{1}{(2 \pi)^{\frac{N}{2}} |K_0|^{\frac12}} \exp\left(-\frac12 \mathbf{z}^T U K_0^{-1} U^T \mathbf{z} \right) |U| \, d\mathbf{z} \\ &= \int_{\mathbb{R}^N} \mathbf{z}^T \mathbf{z} \frac{1}{(2 \pi)^{\frac{N}{2}} |U^{-T} K_0 U^{-1}|^{\frac12}} \exp\left(-\frac12 \mathbf{z}^T (U^{-T} K_0 U^{-1})^{-1} \mathbf{z} \right) d\mathbf{z} \\ &= \int_{\mathbb{R}^N} \sum_{i = 1}^N z_i^2 \frac{1}{(2 \pi)^{\frac{N}{2}} |U^{-T} K_0 U^{-1}|^{\frac12}} \exp\left(-\frac12 \mathbf{z}^T (U^{-T} K_0 U^{-1})^{-1} \mathbf{z} \right) d\mathbf{z} \\ &= \mathrm{tr} (U^{-T} K_0 U^{-1}) = \mathrm{tr} (U^{-1} U^{-T} K_0) = \mathrm{tr} (K_1^{-1} K_0). \end{aligned} $ The step to the trace holds because the integrand is $\sum_i z_i^2$ times a zero-mean Gaussian density with covariance $U^{-T} K_0 U^{-1}$, so the integral is the sum of the variances of the $z_i$, which are the diagonal entries of that covariance.
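As a sanity check on the trace identity, here is a minimal numerical sketch in pure Python. It is not part of the proof. It compares $\mathrm{tr}(K_1^{-1} K_0)$ with a Monte Carlo estimate of the expectation of $\mathbf{x}^T K_1^{-1} \mathbf{x}$ for $\mathbf{x} \sim \mathcal{N}(\mathbf{0}, K_0)$. The function names, the example matrices, the sample count, and the seed are all illustrative assumptions. The code uses a lower-triangular Cholesky factor $L$ with $K = L L^T$, rather than the upper-triangular $U$ above.

```python
import math
import random


def cholesky(K):
    """Lower-triangular L with L L^T = K, for symmetric positive-definite K."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(K[i][i] - s)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    return L


def solve_with_cholesky(L, b):
    """Solve (L L^T) v = b by forward and then back substitution."""
    n = len(L)
    w = [0.0] * n
    for i in range(n):
        w[i] = (b[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i]
    v = [0.0] * n
    for i in reversed(range(n)):
        v[i] = (w[i] - sum(L[k][i] * v[k] for k in range(i + 1, n))) / L[i][i]
    return v


def trace_formula(K0, K1):
    """Closed form tr(K1^{-1} K0), computed one column of K0 at a time."""
    n = len(K0)
    L1 = cholesky(K1)
    total = 0.0
    for j in range(n):
        column = [K0[i][j] for i in range(n)]
        total += solve_with_cholesky(L1, column)[j]
    return total


def monte_carlo(K0, K1, n_samples=100_000, seed=0):
    """Estimate E[x^T K1^{-1} x] for x ~ N(0, K0) by sampling."""
    rng = random.Random(seed)
    n = len(K0)
    L0 = cholesky(K0)
    L1 = cholesky(K1)
    total = 0.0
    for _ in range(n_samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # x = L0 g has covariance L0 L0^T = K0.
        x = [sum(L0[i][k] * g[k] for k in range(i + 1)) for i in range(n)]
        v = solve_with_cholesky(L1, x)  # v = K1^{-1} x
        total += sum(xi * vi for xi, vi in zip(x, v))
    return total / n_samples


# Hypothetical example matrices (both symmetric positive definite).
K0 = [[2.0, 0.5], [0.5, 1.0]]
K1 = [[1.0, 0.2], [0.2, 3.0]]
print("closed form:", trace_formula(K0, K1))
print("monte carlo:", monte_carlo(K0, K1))
```

The two printed values should agree up to sampling error, which shrinks like the inverse square root of the number of samples.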

Combining the three terms gives $ \int_{\mathbb{R}^N} (\mathbf{y} - \boldsymbol{\mu}_1)^T K_1^{-1} (\mathbf{y} - \boldsymbol{\mu}_1) \frac{1}{(2 \pi)^{\frac{N}{2}} |K_0|^{\frac12}} \exp\left(-\frac12 (\mathbf{y} - \boldsymbol{\mu}_0)^T K_0^{-1} (\mathbf{y} - \boldsymbol{\mu}_0) \right) d\mathbf{y} = \mathrm{tr} (K_1^{-1} K_0) + (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0)^T K_1^{- 1} (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0). $ The result in question now follows easily.
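For completeness, here is a sketch of that last step, using only the definitions from the question. Taking the logarithm of the density gives $ \log \mathcal{N}(\mathbf{y}| \boldsymbol{\mu}_0, K_0) = -\frac12 \left[ N \log 2 \pi + \log |K_0| + (\mathbf{y} - \boldsymbol{\mu}_0)^T K_0^{-1} (\mathbf{y} - \boldsymbol{\mu}_0) \right]. $ The first two terms are constants, and $\mathcal{N}(\mathbf{y}| \boldsymbol{\mu}_1, K_1)$ integrates to 1. Applying the identity above with the indices 0 and 1 exchanged to the remaining term gives $ \begin{aligned} \int_{\mathbb{R}^N} \mathcal{N}(\mathbf{y}| \boldsymbol{\mu}_1, K_1) \log \mathcal{N}(\mathbf{y}| \boldsymbol{\mu}_0, K_0) \, d \mathbf{y} &= -\frac12 \left[ N \log 2 \pi + \log |K_0| + \int_{\mathbb{R}^N} (\mathbf{y} - \boldsymbol{\mu}_0)^T K_0^{-1} (\mathbf{y} - \boldsymbol{\mu}_0) \, \mathcal{N}(\mathbf{y}| \boldsymbol{\mu}_1, K_1) \, d \mathbf{y} \right] \\ &= -\frac12 \left[ N \log 2 \pi + \log |K_0| + \mathrm{tr} (K_0^{-1} K_1) + (\boldsymbol{\mu}_0 - \boldsymbol{\mu}_1)^T K_0^{-1} (\boldsymbol{\mu}_0 - \boldsymbol{\mu}_1) \right]. \end{aligned} $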