
Assume I have a Gaussian distribution $\mathcal{N}(\mu, C)$ with mean $\mu$ and covariance $C$. I draw $n$ random numbers from this distribution and let $m$ be their mean. Is there a formula for the probability that the distance $d = \|\mu - m\|$ is at least $x$, i.e. $P(d \ge x)$?

The background here is that in a recent simulation, the results seemed to cluster around a very slightly different point than expected, and I'd like to calculate the probability of this happening by chance.

  • "Random vectors" rather than "random numbers" is what I'd have expected you to say, given the context. I'm inclined to doubt there's a closed form for this. (2012-06-08)
  • Do you mean covariance, that is, is your distribution multivariate normal? Or is it univariate, like weights of randomly chosen people? (2012-06-09)
  • It's actually multivariate and random vectors, but the anomaly only occurs in one dimension and I thought the univariate case would be easier to answer, so I stated the question in terms of a univariate distribution. (As far as I know, covariance and variance are the same for univariate distributions.) (2012-06-09)
  • Would the following make sense to you? It is the probability that $|Z|\ge \frac{x\sqrt{n}}{\sigma}$, where $Z$ is standard normal, and $\sigma$ is your standard deviation. The relevant probabilities for the standard normal are available in tables. Many pieces of software also do the calculation. If you have particular numbers I can walk you through the calculation. (2012-06-09)

1 Answer


The difference $x=m-\mu$ follows a normal distribution $\mathcal N(0,C/n)$. (Note that $x$ here denotes the difference itself, not the threshold from the question.) I don't think there is a closed form for the probability that its norm is at least a given value, but I don't think this is the right statistical test either. You should probably transform your data so that the covariance $C/n$ of $x$ becomes the identity matrix; then $\sum_i x_i^2$ follows the well-known $\chi^2$ distribution, with as many degrees of freedom as there are dimensions, and you can easily find a $p$-value to test your hypothesis.
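
To spell out the transformation, here is a sketch in my own notation (the Cholesky factor $L$, the whitened vector $y$ and the dimension $k$ are not in the question), assuming $C$ is known and positive definite. Write $C = LL^{\mathsf T}$ and set

$$y = \sqrt{n}\,L^{-1}(m-\mu) \sim \mathcal N(0, I_k), \qquad \|y\|^2 = n\,(m-\mu)^{\mathsf T} C^{-1} (m-\mu) \sim \chi^2_k .$$

If $t$ is the value of $\|y\|^2$ computed from the simulation, the $p$-value is $P(\chi^2_k \ge t)$, which any statistics package or $\chi^2$ table will provide. If $C$ is only estimated from the same samples, this should be treated as an approximation.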

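Edit: In the univariate case, where $C$ is simply the variance, this is of course much easier: $$P(\|x\|\le a)=\mathrm{erf}\left(\frac{a}{\sqrt{2C/n}}\right),$$ so the probability of a deviation of at least $a$ is $1$ minus this value.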
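
For concrete numbers in the univariate case, a minimal Python sketch could look like this. The function name `prob_mean_deviates` and its parameters are my own, and it assumes the standard deviation $\sigma = \sqrt{C}$ is known rather than estimated from the data.

```python
import math


def prob_mean_deviates(x, sigma, n):
    """Return P(|m - mu| >= x) for the mean m of n i.i.d. N(mu, sigma^2) draws.

    Hypothetical helper: assumes sigma (the square root of the variance C)
    is known exactly, not estimated from the sample.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if n < 1:
        raise ValueError("n must be at least 1")
    if x <= 0:
        # A distance is always >= 0, so the event is certain.
        return 1.0
    # m - mu ~ N(0, sigma^2 / n), so standardise the threshold.
    z = x * math.sqrt(n) / sigma
    # Two-sided tail of the standard normal: 1 - erf(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # 400 draws with sigma = 1; how likely is a deviation of at least 0.1?
    print(prob_mean_deviates(0.1, 1.0, 400))  # about 0.0455
```

Using `math.erfc` instead of `1 - math.erf(...)` avoids losing precision when the probability is very small, which is exactly the regime of interest when checking whether an anomaly could be chance.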