
This is a two-part question.

Suppose I draw random variables $X_i\sim A$, $1\leq i \leq n$, where $A$ is a zero-mean, symmetric probability distribution on the real line with finite variance $\sigma_A^2$ and finite fourth moment $\mathbb{E}(X^4)$. Assume that $X_i$ and $X_j$ are independent for all $i \neq j$. I am interested in approximating the distribution of the sum of squares $\sum_{i=1}^nX_i^2$ by a normal distribution for very large $n$. In the second part of the question I relax the assumption that the $X_i$ are identically distributed, but keep the same conditions on each $A_i$.

By the Central Limit Theorem (CLT), the normalized sum of these i.i.d. random variables satisfies $\frac{\sum_{i=1}^nX_i}{\sqrt{n\sigma_A^2}}\xrightarrow{D}\mathcal{N}(0,1),$

where $\xrightarrow{D}$ denotes convergence in distribution. Thus, I can approximate the distribution of the sum by $\mathcal{N}(0,n\sigma_A^2)$ for large enough $n$.

The first part of my question is: what distribution can I use to approximate the sum of the squares of a large number of these i.i.d. random variables, $\sum_{i=1}^n X^2_i$? Does some function of it converge to a standard Gaussian in distribution (i.e., for large enough, possibly infinite $n$)? I understand that if $A$ is Gaussian, then $\frac{1}{\sigma_A^2}\sum_{i=1}^n X^2_i\sim\chi^2(n)$, so $\sum_{i=1}^n X^2_i$ can be approximated by $\mathcal{N}(n\sigma_A^2,2n\sigma_A^4)$ for very large $n$, using the asymptotic properties of the chi-squared distribution (where $\sigma_A^4=(\sigma_A^2)^2$ denotes the squared variance). But what happens when $A$ has the nice properties described above but is not necessarily Gaussian?
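As a sanity check on the Gaussian case, here is a minimal stdlib-only Monte Carlo sketch: the sum of squares of $n$ i.i.d. $\mathcal{N}(0,\sigma_A^2)$ draws should have mean $n\sigma_A^2$ and variance $2n\sigma_A^4$. The sample sizes and seed are arbitrary choices for illustration.

```python
import random
import statistics
import math

random.seed(1)

n, sigma2, trials = 500, 2.0, 4000
sigma = math.sqrt(sigma2)

# Sum of squares of n i.i.d. N(0, sigma^2) draws is sigma^2 * chi^2(n),
# with mean n*sigma^2 and variance 2*n*sigma^4.
sums = [sum(random.gauss(0, sigma) ** 2 for _ in range(n)) for _ in range(trials)]

print(statistics.mean(sums) / (n * sigma2))               # near 1
print(statistics.variance(sums) / (2 * n * sigma2 ** 2))  # near 1
```

Both ratios come out close to 1, consistent with the $\mathcal{N}(n\sigma_A^2, 2n\sigma_A^4)$ approximation.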

My intuition tells me that it should converge to a Gaussian, since we are still dealing with a sum of independent random variables with finite mean and variance. But I'm not sure how to prove that, or how to characterize the distribution in terms of $n$ and $\sigma_A^2$.

The second part of the question is a further generalization. Now suppose $X_i\sim A_i$ are non-identically distributed. They are all still independent, and each $A_i$ is still zero-mean and symmetric, but they have different finite variances $\sigma_i^2$ and may have different forms. Provided Lindeberg's condition is met (finite means and variances alone do not guarantee this), the CLT still holds: $\frac{\sum_{i=1}^n X_i}{\sqrt{\sum_{i=1}^n\sigma_i^2}}\xrightarrow{D}\mathcal{N}(0,1)$. Again, however, I am wondering what happens with the sum of squares $\sum_{i=1}^n X_i^2$. Is there a function of it that converges in distribution to a nice random variable such as a Gaussian (i.e., for appropriately large, possibly infinite $n$, does it look Gaussian)? If so, to what distribution does it converge, and how can one characterize both the distribution and the function of $\sum_{i=1}^n X_i^2$ in terms of $n$ and $\sigma_i^2$, and possibly $\mathbb{E}(X_i^4)$? Is the result more attainable if each $A_i$ is a zero-mean Gaussian with variance $\sigma_i^2$?

Again, my intuition tells me that a function of $\sum_{i=1}^n X_i^2$ should converge to a Gaussian, since we are again dealing with a sum of independent random variables with finite means and variances, which should meet Lindeberg's condition... but is there a proof, and how can one characterize this distribution in terms of $n$ and $\sigma_i^2$?
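For the special case where each $A_i$ is Gaussian, one candidate standardization can be checked numerically: since $\operatorname{var}(X_i^2)=2\sigma_i^4$ for $X_i\sim\mathcal{N}(0,\sigma_i^2)$, the quantity $\bigl(\sum_i X_i^2 - \sum_i \sigma_i^2\bigr)/\sqrt{2\sum_i \sigma_i^4}$ should look standard normal. The following stdlib-only sketch uses an arbitrary bounded sequence of $\sigma_i$ for illustration; it is a plausibility check, not a proof.

```python
import random
import statistics
import math

random.seed(2)

n, trials = 1000, 3000
sigmas = [1.0 + 0.5 * math.sin(i) for i in range(n)]  # bounded, varying scales

mean_total = sum(s ** 2 for s in sigmas)     # sum of sigma_i^2
var_total = sum(2 * s ** 4 for s in sigmas)  # var(X_i^2) = 2*sigma_i^4 for Gaussians

samples = []
for _ in range(trials):
    ss = sum(random.gauss(0, s) ** 2 for s in sigmas)
    samples.append((ss - mean_total) / math.sqrt(var_total))

print(statistics.mean(samples))   # near 0
print(statistics.stdev(samples))  # near 1
```

The empirical mean and standard deviation of the standardized sums land near 0 and 1, as a CLT for the non-identically distributed squares would predict.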

EDITS: I have changed the question after @Michael Hardy answered the first part for me. The second part is still open...

  • Hall's paper "_Chi Squared Approximations to the Distribution of a Sum of Independent Random Variables_" may be of interest: https://projecteuclid.org/euclid.aop/1176993451 (2015-12-21)

2 Answers


I have a certain degree of discomfort with the expression $\sum_{i=1}^n X^2_i\xrightarrow{D}\mathcal{N}(n,2n\sigma_A^4)$ since $n$ appears on both sides. If one takes a limit as $n\to\infty$, one gets something that does not depend on $n$.

When one says $\sum_{i=1}^n X_i\sim\mathcal{N}(0,n\sigma_A^2),$ it has to mean that $ \frac{1}{\sigma_A\sqrt{n}}\sum_{i=1}^n X_i $ converges in distribution to $\mathcal{N}(0,1)$ as $n\to\infty$, and no $n$ appears in the expression "$\mathcal{N}(0,1)$", which is the limit.

Since $\mathbb{E}(X_i) = 0$, we have $\sigma_A^2=\operatorname{var}(X_i)=\mathbb{E}(X_i^2)$, and therefore $ \operatorname{var}(X^2) = \mathbb{E}(X^4) - (\mathbb{E}(X^2))^2 = \mathbb{E}(X^4) - \sigma_A^4. $

So if this last quantity happens to be finite then the central limit theorem tells us that $ \frac{\sum_{i=1}^n (X_i^2 - \sigma_A^2) }{\sqrt{n}\sqrt{\mathbb{E}(X^4) - \sigma_A^4}} $ converges in distribution to $\mathcal{N}(0,1)$ as $n\to\infty$.
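This standardization can be sketched numerically for a non-Gaussian example. Below, a stdlib-only Monte Carlo draws from $A = \mathrm{Uniform}(-1,1)$, for which $\sigma_A^2 = 1/3$ and $\mathbb{E}(X^4) = 1/5$, so $\operatorname{var}(X^2) = 1/5 - 1/9 = 4/45$; the sample sizes and seed are arbitrary choices for illustration.

```python
import random
import statistics
import math

random.seed(3)

n, trials = 2000, 3000

# Uniform(-1, 1): sigma_A^2 = E[X^2] = 1/3, E[X^4] = 1/5,
# so var(X^2) = E[X^4] - sigma_A^4 = 1/5 - 1/9 = 4/45.
sigma2, var_sq = 1 / 3, 1 / 5 - 1 / 9

samples = []
for _ in range(trials):
    s = sum(random.uniform(-1, 1) ** 2 for _ in range(n))
    samples.append((s - n * sigma2) / math.sqrt(n * var_sq))

print(statistics.mean(samples))   # near 0
print(statistics.stdev(samples))  # near 1
```

The standardized sums have empirical mean near 0 and standard deviation near 1, in line with the convergence stated above.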

  • If the variance of each random variable $X_i$ satisfies $\operatorname{var}(X_i)<\infty$, then it could happen that $\sum\limits_{i=1}^\infty \operatorname{var}(X_i)<\infty$, or it could happen that $\sum\limits_{i=1}^\infty \operatorname{var}(X_i)=\infty$. In the former case I don't think any reasonable generalization of the central limit theorem holds (but I'm not sure of that). (2011-11-06)

The central limit theorem states that, for an i.i.d. sequence of random variables $X_i$ with mean $\mathbb{E}(X)$ and finite variance $\operatorname{Var}(X)$, the sequence $Z_n = \frac{1}{\sqrt{n \operatorname{Var}(X)}} \left( \sum_{i=1}^n X_i - n \mathbb{E}(X) \right)$ converges in distribution to the standard normal distribution.

The CLT makes no statement about the distribution of $Z_n$ for any finite $n$, though.

  • Could you please expand on the last statement about $Z_n$? How can one prove that, or find the proof? Also, would $\frac{\sum_{i=1}^nX_i^2}{n}$ diverge? (2011-11-05)