0
$\begingroup$

Suppose we are given a set of points $(x_i, y_i)\in\mathbb{R}^2$ and are told that they are drawn from a normal (Gaussian) distribution. It is a simple matter in that case to find the mean $(\mu_x,\mu_y)$ of the distribution: $(\langle x \rangle, \langle y \rangle)$, where $\langle x \rangle=\frac{1}{N}\sum_i x_i$ and similarly for $\langle y \rangle$, is an unbiased estimator.

Now suppose, instead, that each point is randomly presented as either $(x_i,y_i)$ or $(y_i, x_i)$, with equal probability. Is there still a comparably simple way to estimate the mean of the original distribution? (Assume $\mu_x \le \mu_y$.) This is equivalent to the case where the points are drawn from a mixture of two Gaussians, where one is constrained to be the reflection of the other across the line $y=x$. However, one might hope that the symmetry of the problem leads to some simplification from the general case of two Gaussians.

Note, by the way, that taking $\mu_x=\langle \min(x, y)\rangle$ and $\mu_y = \langle \max (x,y) \rangle$ does not work: this is adequate only when the individual Gaussian distributions are well-separated by the line $y=x$. Otherwise, it introduces a systematic bias away from that line.

  • 0
    I know a way but it is not as simple as the sample mean. Actually there are two ways.2012-08-17
  • 0
    Does EM algorithm apply to this situation?2012-08-17
  • 0
    @Tunococ: I'd be happy with an expectation-maximizing solution, provided it's explicit. But any kind of iterative search for an EM solution isn't really any simpler than the general case of two arbitrary Gaussians.2012-08-17

3 Answers 3

0

EM does apply to this situation. It is of course much complexer than the sample mean estimator. Other iterative algoritms as well. The second way is to make the estimation in a well seperated regions. I also think about using the sample mean estimator and another, a bit complexer estimator, for the bias term. Then The overall estimator will be simply the difference between these two estimators. I think if the variance for $x$ and $y$ is known, then one can seperate the estimation to be made into two distinct regions. Another way came to my mind is to use EM partially just to get some idea about the regions. Then to use the sample mean estimators only on these regions.

I hope it helps..

1

The problem can be solved by EM. Let the first Gaussian have mean $\mu_1 = [\mu_x;\mu_y]$ and covariance matrix $\Sigma_1 = \left( \begin{array}{cc} \sigma_{xx}^2 & \sigma_{xy} \\ \sigma_{yx} & \sigma_{yy}^2 \\ \end{array} \right)$ where $\sigma_{xx}^2$ is the variance of $x$, $\sigma_{yy}^2$ is the variance of $y$ and $\sigma_{xy}$ is the covariance between $x$ and $y$. It is easy to see that the second Gaussian will have mean $\mu_2 = [\mu_y;\mu_x]$ and covariance $\Sigma_2 = \left( \begin{array}{cc} \sigma_{yy}^2 & \sigma_{xy} \\ \sigma_{yx} & \sigma_{xx}^2 \\ \end{array} \right)$.

If you define : $R = \left(\begin{array}{cc} 0 & 1 \\ 1 & 0 \\ \end{array} \right)$ , we can write $\mu_2 = R \mu_1$ and $\Sigma_2 = R\Sigma_1R$. The mixture weights are fixed at $0.5$ each since you mentioned that the dimensions are flipped with equal probability. So you only have to estimate $\mu_1$ and $\Sigma_1$, which is a simplification from the generic case of a mixture model with two Gaussians. An EM algorithm can now be formulated for this problem using the standard recipe.

1

By moments:

Let $Z=[Z_1,Z_2]$ be the mixed variable, let $m_k$ be the $k$-th centered moment of the first components (both components are marginally equivalent, actually), $m_k = E[(Z_1 - m_1)^k]$

Let $\alpha = \frac{1}{2}(\mu_x+ \mu_y)$, $\delta=\frac{1}{2}(\mu_y -\mu_x)$, $s=\frac{1}{2}(\sigma_x^2 + \sigma^2_y)$, $d=\frac{1}{2}(\sigma_y^2 - \sigma^2_x)$

Then, we get the following nonlinear system of equations:

$$\begin{eqnarray} m_1 &=& \alpha\\ m_2 &=& s + \delta^2\\ m_3 &=& 3 \delta d\\ m_4 &=& 3 s^2 + 3 d^2 + 6 s \delta^2 + \delta^4 \end{eqnarray} $$

This can be solved numerically or analytically (Maxima...)

$$ \delta= \frac{\sqrt{{\left( \sqrt{2\,m_{4x}^3+3\,m_3^4}+\sqrt{3}\,m_3^2\right) }^{\frac{2}{3}}-{2}^{\frac{1}{3}}\,m_{4x}}}{{2}^{\frac{1}{3}}\,{3}^{\frac{1}{4}}\,{\left( \sqrt{2\,m_{4x}^3+3\,m_3^4}+\sqrt{3}\,m_3^2\right) }^{\frac{1}{6}}} $$

where $m_{4x} = m_4 - 3 m_2^2$

This can be simplified if both variables have same variance ($d=0$).

$$\delta=\left(\frac{|m_{4x}|}{{2}} \right)^{\frac{1}{4}}$$

Of course, $m_k$ is estimated by the standard estimators (though high order moments might require many samples to get good estimations) and then, computing $\alpha$ and $\delta$ we have the estimated $\mu_x$,$\mu_y$.

I've done some basic testing with Octave/Matlab, code attached

mu = [3,5]'  % (mean vector)
cov = [ 1 , 1.5 ; 1.5,4]  % covariance matrix
rho = cov(1,2)/sqrt(cov(1,1)*cov(2,2))  % compute correlation factor
N=2000;               % number of samples
X=mvnrnd(mu,cov,N);   % unmixed samples
W=unidrnd(2,N,1)-1;   % mixing flag
Y =  flipdim(X,2);    % reversed
Z=X;                  %
Z(W>0,:)=Y(W>0,:);    % mixed
delta=(mu(2)-mu(1))/2 % "true" values
alpha=(mu(2)+mu(1))/2
s=(cov(1,1)+cov(2,2))/2
d=(cov(2,2)-cov(1,1))/2

m1=mean(mean(Z));    % estimated moments
m2=mean(mean( (Z-m1).^2));
m3=mean(mean( (Z-m1).^3));
m4=mean(mean( (Z-m1).^4));
m4x = m4 - 3*m2^2;

alpha             
m1
s+delta^2
m2
3*delta*d
m3
sum([3*(s^2+d^2),6*s*delta^2,delta^4])
m4

alphaest= m1;
deltaest = deltaest = sqrt((sqrt(2*m4x^3+3*m3^4)+sqrt(3)*m3^2)^(2/3)-2^(1/3)*m4x)/(2^(1/3)*3^(1/4)*(sqrt(2*m4x^3+3*m3^4)+sqrt(3)*m3^2)^(1/6))

mu(2)             % compare true and estimated means
alphaest+deltaest
mu(1)
alphaest-deltaest