
I am working with sampling of multivariate normally distributed numbers. I have a very fundamental question regarding the eigendecomposition of the $k \times k$ covariance matrix $\Sigma = \mathbf{U}\mathbf{\Lambda}\mathbf{U}^{-1} = \mathbf{U}\mathbf{\Lambda}\mathbf{U}^{T}$, where $\mathbf{U} = [\mathbf{u}_1\ \mathbf{u}_2 \dots \mathbf{u}_k]$ is an orthogonal matrix whose columns are the orthonormal eigenvectors and $\mathbf{\Lambda}$ is a diagonal matrix with the corresponding eigenvalues on the diagonal.

The eigendecomposition can also be further decomposed into $\Sigma = \mathbf{U}\mathbf{\Lambda}\mathbf{U}^{T} = ({\mathbf{U}\Lambda}^{1/2}){({\mathbf{U}\Lambda}^{1/2})}^{T}$.

But there is also the Cholesky factorization: $\mathbf{\Sigma} = \mathbf{L}{\mathbf{L}}^{T}$.

For a real, symmetric, positive-definite matrix $\mathbf{\Sigma}$, what is the general relationship between $\mathbf{U}{\Lambda}^{1/2}$ and $\mathbf{L}$?

What I want to do is use $\mathbf{U}{\Lambda}^{1/2}$ (or $\mathbf{L}$, though I suspect that might be wrong) to generate normally distributed numbers. Say I have an $n \times k$ matrix $\mathbf{X}$ with $n$ points $\mathbf{x}_i \in \mathbb{R}^k$ as rows. These points are drawn from $N(\mathbf{0}, \mathbf{I})$ (with a pseudo-random multivariate generator).

The following relationship exists: $X \sim N(\mathbf{\mu}, \mathbf{\Sigma}) \iff X \sim \mathbf{\mu}+\mathbf{U}{\Lambda}^{1/2}N(\mathbf{0}, \mathbf{I}) \iff X \sim \mathbf{\mu}+\mathbf{U}N(\mathbf{0}, \mathbf{\Lambda})$ for random variable $X$ taking its value from a multivariate normal distribution with mean $\mathbf{\mu}$ and covariance $\mathbf{\Sigma}$.

Using this knowledge I then apply $\mathbf{U}{\Lambda}^{1/2}$ as a scaling and rotation operator on $\mathbf{X}^T$ and my data are then distributed as $N(\mathbf{\mu}, \mathbf{\Sigma})$ (as far as I understand).
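The transform described above can be sketched in NumPy as follows; the covariance, mean, and sample size here are illustrative values, not from the question:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative symmetric positive-definite covariance and mean (k = 2).
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
mu = np.array([1.0, -2.0])

# Eigendecomposition: Sigma = U diag(lam) U^T  (eigh is for symmetric matrices).
lam, U = np.linalg.eigh(Sigma)
A = U @ np.diag(np.sqrt(lam))          # A = U Lambda^{1/2}, so A A^T = Sigma

# n standard-normal points as rows of X (n x k), mapped to N(mu, Sigma):
# each row y_i = mu + A x_i, i.e. Y = mu + X A^T.
n, k = 100_000, 2
X = rng.standard_normal((n, k))
Y = mu + X @ A.T

print(np.cov(Y, rowvar=False))          # close to Sigma for large n
print(Y.mean(axis=0))                   # close to mu
```

Applying $\mathbf{U}{\Lambda}^{1/2}$ to each standard-normal point (here done row-wise via $\mathbf{X}\mathbf{A}^{T}$) scales along the eigenvector axes and rotates, exactly as described.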

Can I use $\mathbf{L}$ here instead as the operator? And what does it mean if I do?

Thanks.

2 Answers


You can use either $L$ or $U \Lambda^{1/2}$ to generate your multivariate normal. They are equivalent, but have different interpretations. $U \Lambda^{1/2}$ gives you the nice picture of an orthogonal rotation plus a scaling of the axes. The Cholesky factor $L$ can be related to a Gram–Schmidt decomposition (more so if seen as the reverse transformation, from the arbitrary Gaussian to the canonical one) or, if we regard the components as being indexed in time (like a stochastic process), it can be seen as a "causal" filter, in the sense that each $x_i$ is obtained as a linear combination of the "previous" values from the canonical normal, say:

$x_1 = L_{1,1} z_1$

$x_2 = L_{2,1} z_1 + L_{2,2} z_2$

$\cdots$

etc
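Both claims, that the two factors reproduce the same $\Sigma$ and that $L$ acts "causally" (row $i$ of $Lz$ uses only $z_1, \dots, z_i$), can be checked numerically. A small sketch with an illustrative covariance:

```python
import numpy as np

# Illustrative symmetric positive-definite covariance.
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

# Cholesky factor: lower-triangular L with Sigma = L L^T.
L = np.linalg.cholesky(Sigma)

# Eigendecomposition factor: A = U Lambda^{1/2}, also with A A^T = Sigma.
lam, U = np.linalg.eigh(Sigma)
A = U @ np.diag(np.sqrt(lam))

print(np.allclose(L @ L.T, Sigma))   # True
print(np.allclose(A @ A.T, Sigma))   # True

# Causality: x_1 depends only on z_1, since L is lower triangular.
z = np.array([1.0, -0.5])
x = L @ z
print(np.isclose(x[0], L[0, 0] * z[0]))   # True: L[0, 1] is zero
```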

  • Level curves of the Gaussian... That is actually the concept that I am working with. These levels exist at constant values of the Mahalanobis distance. The hyperellipsoid has the following semiaxes: $c\sqrt{{\lambda}_i}\mathbf{e}_i$, where $c^2={\chi}_{\nu}^{2}(\alpha)$ (the inverse CDF of the chi-square distribution). Finally I am now able to segment my images based on this distance measure - it works :) (2011-07-08)

Generally, $U\Lambda^{1/2}$ is the same as $LV$ for some orthogonal matrix $V$. Either of them can be used to generate the normal distribution with the given covariance matrix $\Sigma$ from independent standard normals.
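This relationship is easy to verify numerically: since $\Sigma = LL^{T} = (U\Lambda^{1/2})(U\Lambda^{1/2})^{T}$, the matrix $V = L^{-1}U\Lambda^{1/2}$ satisfies $VV^{T} = I$. A sketch with an illustrative $\Sigma$:

```python
import numpy as np

# Illustrative symmetric positive-definite covariance.
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

L = np.linalg.cholesky(Sigma)               # Sigma = L L^T
lam, U = np.linalg.eigh(Sigma)
A = U @ np.diag(np.sqrt(lam))               # Sigma = A A^T

# V = L^{-1} A should be orthogonal, and L V should recover A.
V = np.linalg.solve(L, A)
print(np.allclose(V @ V.T, np.eye(2)))      # True
print(np.allclose(L @ V, A))                # True
```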

  • Thanks for the answer. It was mentioned in a comment on the other answer that $L$ can be found in finitely many steps. That is very interesting. But what about $V$? Would $U$ work, since it is orthogonal? Can $V$ (and/or $U$) also be found in finitely many steps? (2011-07-08)