5
$\begingroup$

Given the Mahalanobis distance:

$D_M^2(x) = (x-\mu)^T S^{-1}(x-\mu)$

how can I obtain the probability of $x = ( x_1, x_2, x_3, \dots, x_N )^T$ belonging to the data set described by the covariance matrix $S$ and mean vector $\mu = ( \mu_1, \mu_2, \mu_3, \dots , \mu_N )^T$? If the sample count is needed, it is denoted $m$.

I would like something I can use in a computer algorithm.

Relatedly, how can I obtain the hyper-ellipsoid that defines the confidence region for, e.g., 95%?

  • 0
    I don't understand the question. You need the probability that a certain point belongs to a certain data set? Also, are you assuming some distribution, like a multivariate normal distribution? 2012-05-09
  • 1
    As for your second question, I believe the $\chi^2$-distribution will be helpful. 2012-05-09
  • 0
    I am assuming a multivariate normal distribution, yes. 2012-05-18
  • 0
    I think the question is very clear and I would also love to see a good answer. Let's say I have a GMM of some data and would like to know whether a random point x is represented by that model. Let's say I have the statistics for 20 Gaussian distributions that are mixed to represent my data. I can use the Mahalanobis distance to get a good idea of the distance between my point and each of the 20 centroids, but I'd prefer to have the probability of that point belonging to my model distribution. 2012-09-25
  • 1
    I agree the question is very clear and would also like a clear mathematical answer. Given that HarryMath is referencing the Mahalanobis distance, it follows that he is using multivariate data with a Gaussian assumption. The Mahalanobis distance is the number of standard deviations that a point is from the center of a cluster. Therefore the question is: given cov(cluster) and the Mahalanobis distance to a point, what is the probability that the point is in the cluster? I think p(x=C) is simply 1-cdf(MahalD). I would like to have this verified. 2013-11-15
  • 0
    @JerryGregoire This is correct, with the center of the cluster being the mean $\mu$ of the cluster. 2013-11-17
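
The $\chi^2$ idea from the comments can be sketched in code. This is a minimal sketch and not a definitive answer. It assumes that $x$ is multivariate normal and that $\mu$ and $S$ are the true parameters; in that case $D_M^2(x)$ follows a $\chi^2$ distribution with $N$ degrees of freedom, so `1 - cdf` has to be evaluated at the squared distance. The function names `mahalanobis_sq`, `chi2_sf`, `tail_probability` and `in_confidence_region` are hypothetical. The returned value is a tail probability (the chance of seeing a point at least this far from $\mu$), not literally a membership probability. If $\mu$ and $S$ are estimated from a small number of samples $m$, an F-distribution-based correction is usually more appropriate and is not shown here.

```python
import math

def mahalanobis_sq(x, mu, S):
    """Squared Mahalanobis distance (x - mu)^T S^{-1} (x - mu)."""
    n = len(mu)
    d = [float(x[i]) - float(mu[i]) for i in range(n)]
    # Solve S y = d by Gaussian elimination on the augmented matrix [S | d].
    A = [[float(v) for v in S[i]] + [d[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    y = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(A[i][k] * y[k] for k in range(i + 1, n))
        y[i] = (A[i][n] - s) / A[i][i]
    return sum(d[i] * y[i] for i in range(n))

def chi2_sf(d2, n, eps=1e-12, max_iter=1000):
    """Survival function 1 - cdf of chi-square with n degrees of freedom,
    computed as the regularized upper incomplete gamma Q(n/2, d2/2)."""
    a, x = n / 2.0, d2 / 2.0
    if x <= 0.0:
        return 1.0
    log_pref = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series expansion of the lower incomplete gamma P(a, x).
        term = total = 1.0 / a
        for k in range(1, max_iter):
            term *= x / (a + k)
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * math.exp(log_pref)
    # Continued fraction for Q(a, x), evaluated with the modified Lentz method.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(log_pref)

def tail_probability(x, mu, S):
    """P(D^2 >= observed D^2), assuming x ~ N(mu, S) with known mu and S."""
    return chi2_sf(mahalanobis_sq(x, mu, S), len(mu))

def in_confidence_region(x, mu, S, level=0.95):
    """True if x lies inside the hyper-ellipsoid that holds `level` of the mass."""
    return tail_probability(x, mu, S) >= 1.0 - level
```

The 95% hyper-ellipsoid is the set of points with $D_M^2(x) \le c$, where $c$ is the 0.95 quantile of the $\chi^2$ distribution with $N$ degrees of freedom; checking `tail_probability(x, mu, S) >= 0.05` is the same membership test without computing $c$. If SciPy is available, `scipy.stats.chi2.sf` can replace the hand-written `chi2_sf`.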

2 Answers