
Let $X$ be a chi-squared variable with $121$ degrees of freedom, so that the density $f_X$ of $X$ is given by

$ f_X(x)=\frac{1}{2\,\Gamma\left(\frac{121}{2}\right)}\left(\frac{x}{2}\right)^{\frac{121}{2}-1}e^{-\frac{x}{2}}, \qquad x>0. $

I would like to compute $P(X>126)$ with an accuracy of $10^{-2}$. I know that a standard method is to approximate the distribution of $X$ by a normal distribution (here ${\cal N}(121,\sqrt{242})$, with $\sqrt{242}$ the standard deviation), but I do not know of any bound on the error made in this approximation. In theory this is just a matter of computing a definite integral to sufficient precision, but it seems to exceed the capacity of my symbolic calculator (indeed, the value $\Gamma(\frac{121}{2})$ is very large: its integer part has 82 digits).

Is there a rigorous (and working!) method to solve this?

1 Answer


You could use the Cornish-Fisher asymptotic expansion to take higher moments into account, specifically the skewness $\mathcal{s}$ and the kurtosis $\mathcal{k}$: $ \mathcal{s}\left(\chi^2_{121}\right) = \frac{2\sqrt{2}}{11}, \quad \mathcal{k}\left(\chi^2_{121}\right) = 3 + \frac{12}{121} $
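
For reference, these values agree with the standard moment formulas for a $\chi^2_k$ variable, here with $k = 121$ (stated as known facts, not derived in this answer):

$ \mu = k = 121, \quad \sigma^2 = 2k = 242, \quad \mathcal{s} = \sqrt{\frac{8}{k}} = \frac{2\sqrt{2}}{11}, \quad \mathcal{k} = 3 + \frac{12}{k} = 3 + \frac{12}{121} $

Writing $z$ for the standard normal quantile, the usual third-order form of the expansion is the following. That this is the exact variant used here is an assumption, although substituting the values above does reproduce the coefficients quoted below:

$ x \approx \mu + \sigma\left( z + \frac{\mathcal{s}}{6}\,(z^2 - 1) + \frac{\mathcal{k} - 3}{24}\,(z^3 - 3z) - \frac{\mathcal{s}^2}{36}\,(2z^3 - 5z) \right) $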

The Cornish-Fisher expansion approximates the inverse CDF $Q(q)$ of $\chi^2_{121}$ as a polynomial in the inverse CDF $\tilde{Q}(q)$ of the standard normal distribution. It reads: $ Q(q) \approx 121 + \left( -\frac{2}{3} + \frac{2171}{99 \sqrt{2}} \tilde{Q}(q) + \frac{2}{3} \tilde{Q}(q)^2 + \frac{1}{99 \sqrt{2}} \tilde{Q}(q)^3\right) $ Using bisection on the calculator, it is not hard to find that $Q(q) > 126$ corresponds to $\tilde{Q}(q) > 0.359853$. The requested probability is therefore approximately $P(Z > 0.359853) = 1 - \Phi(0.359853) \approx 0.3595$, where $Z$ is a standard normal variable and $\Phi$ is its CDF. (The threshold and the probability happen to be numerically close here; they are different quantities.)
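
The bisection step can be sketched in a few lines of Python, using only the standard library. This is a minimal illustration, not the answerer's actual calculator procedure: the function names, the bracket $[0, 1]$ and the tolerance are assumptions chosen for this example, and the cubic is the one quoted above.

```python
import math

SQRT2 = math.sqrt(2.0)


def cornish_fisher_quantile(z: float) -> float:
    """Cubic approximation of the chi-squared(121) quantile in terms of
    the standard normal quantile z (coefficients from the formula above)."""
    return (121.0 - 2.0 / 3.0
            + 2171.0 / (99.0 * SQRT2) * z
            + 2.0 / 3.0 * z ** 2
            + z ** 3 / (99.0 * SQRT2))


def solve_z(target: float, lo: float = 0.0, hi: float = 1.0,
            tol: float = 1e-10) -> float:
    """Bisection for cornish_fisher_quantile(z) == target.

    Assumes the cubic is increasing on [lo, hi] and that the bracket
    contains the root (both hold for target = 126 on [0, 1])."""
    if not cornish_fisher_quantile(lo) <= target <= cornish_fisher_quantile(hi):
        raise ValueError("target is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cornish_fisher_quantile(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def normal_upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / SQRT2)


z_star = solve_z(126.0)           # normal quantile matching the value 126
tail = normal_upper_tail(z_star)  # approximate P(X > 126)
print(f"z = {z_star:.6f}, P(X > 126) ~ {tail:.4f}")
```

Bisection is used here only because the answer mentions it; any root finder would do, since the cubic is monotone on the bracket.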

Indeed, checking with Mathematica, you see that this comes pretty close:

    In[27]:= N[Probability[x > 126, x \[Distributed] ChiSquareDistribution[121]]]
    Out[27]= 0.359493

Compare with the pure normal approximation:

    In[29]:= Block[{ch2d = ChiSquareDistribution[121]},
      N[Probability[x > 126,
        x \[Distributed] NormalDistribution[Mean[ch2d], StandardDeviation[ch2d]]]]]
    Out[29]= 0.373949
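
Without Mathematica, the same two numbers can be cross-checked with the Python standard library. The sketch below is an alternative route, not the method used in this answer: it evaluates the chi-squared CDF through the standard power series for the regularized lower incomplete gamma function, and it works with `math.lgamma` so that the huge value of $\Gamma(\frac{121}{2})$ never has to be formed explicitly. The function names, the stopping tolerance and the term limit are assumptions made for illustration.

```python
import math


def chi2_upper_tail(x: float, k: int, tol: float = 1e-15,
                    max_terms: int = 10_000) -> float:
    """P(X > x) for X ~ chi-squared(k), with a = k/2 and t = x/2, using
    P(a, t) = t^a e^(-t) / Gamma(a + 1) * sum_{n>=0} t^n / ((a+1)...(a+n))."""
    a = 0.5 * k
    t = 0.5 * x
    term = 1.0
    total = 1.0
    for n in range(1, max_terms):
        term *= t / (a + n)       # ratio of consecutive series terms
        total += term
        if term < tol * total:    # remaining terms are negligible
            break
    # Work in logarithms: Gamma(a + 1) itself is astronomically large.
    log_prefactor = a * math.log(t) - t - math.lgamma(a + 1.0)
    return 1.0 - math.exp(log_prefactor) * total


def normal_upper_tail(x: float, mean: float, sd: float) -> float:
    """P(Y > x) for Y ~ Normal(mean, sd)."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))


exact = chi2_upper_tail(126.0, 121)
normal = normal_upper_tail(126.0, 121.0, math.sqrt(242.0))
print(f"series: {exact:.6f}, normal approximation: {normal:.6f}")
```

Once the ratio $t/(a+n)$ drops below $1$, the terms decay at least geometrically, so the truncation error of the series can be bounded explicitly. That makes this route a candidate for the rigorous computation asked for in the question.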
  • Let $X$ be a $\chi^2$ random variable and let $Z$ be a standard normal random variable. What is happening is that we approximate $F^{-1}_X( F_Z(z) )$ as a polynomial in $z$. Assuming $X$ is "nearly" normal, it is reasonable to expect such an approximation to be accurate, at least when the probability level $q = F_Z(z)$ is sufficiently far from the extremes $q=0$ and $q=1$. Alternatively, you could use a [Gram-Charlier A series or Edgeworth series](http://en.wikipedia.org/wiki/Edgeworth_series). 2012-10-23