4

Suppose $X_1, \dots, X_n$ are standard normal variables truncated to $X_i \geq 0$ (that is, each $X_i$ is drawn as a standard normal, conditional on $X_i \geq 0$).

Let $c_1, \dots, c_n$ be non-negative coefficients.

What does the distribution of $\sum_i c_i X_i$ look like? Does it have, or approximately have, a standard distribution, such as a truncated normal distribution?


Original question:
Suppose $X_1, \dots, X_n$ are independent Normal random variables, with mean 0 and variances $\sigma_1^2, \dots, \sigma_n^2$.

Let $Y_i = \max(0,X_i)$. (So $Y_i$ is a truncated normal random variable).

What does the distribution of $\sum_i Y_i$ look like? Does it have, or approximately have, a standard distribution?

  • 4
    You edited the question, based on my answer, in a way that makes my answer look irrelevant. This is bad practice, because it wasted my effort. The accepted way is to leave the original question in place, for example at the bottom of the question. (2012-07-25)

1 Answer

5

This post answers the original question.

Please be careful: $Y_k$ is not a truncated normal random variable; it is a censored normal random variable. In particular,

$$ \mathbb{P}(Y_k = 0) = \mathbb{P}\left(X_k \leqslant 0 \right) = \frac{1}{2} \not= 0, $$

meaning that $Y_k$ is not an absolutely continuous random variable. Rather, $Y_k$ can be thought of as a mixture of a degenerate random variable, concentrated at $x=0$, and a normal random variable truncated to the positive semi-axis, each with weight $\frac{1}{2}$.

With this said, $Z = \sum\limits_{k=1}^n Y_k$ is not absolutely continuous either, since

$$ \mathbb{P}\left(Z=0\right) = \mathbb{P}\left(X_1 \leqslant 0, \ldots, X_n \leqslant 0\right) = \mathbb{P}\left(X_1 \leqslant 0\right) \cdots \mathbb{P}\left(X_n \leqslant 0\right) = \frac{1}{2^n}. $$

The absolutely continuous part of $Z$ is not equal in distribution to any standard distribution. One can, however, compute the characteristic function of $Z$ rather easily:

$$ \begin{aligned} \phi_{Y_k}(t) &= \mathbb{E}\left(\exp\left(i \max(0,X_k)t\right)\right) = \mathbb{P}\left(X_k \leqslant 0\right) + \mathbb{E}\left(\exp\left(i X_k t\right): X_k > 0\right)\\ &= \frac{1}{2} + \frac{1}{2}\exp\left(-\frac{t^2}{2} \sigma_k^2\right) + \frac{i}{\sqrt{\pi}} D_F\left( \frac{t \sigma_k}{\sqrt{2}} \right), \end{aligned} $$

where $D_F(x) = e^{-x^2} \int_0^x e^{s^2} \, \mathrm{d}s$ denotes Dawson's F-function. As a check, $\phi_{Y_k}(0) = \frac{1}{2} + \frac{1}{2} + 0 = 1$, as required of any characteristic function.

Since the random variables $Y_k$ are independent: $ \phi_Z(t) = \phi_{Y_1}(t) \cdots \phi_{Y_n}(t) $
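A numerical sanity check of the characteristic function may be useful here. The sketch below is not part of the original answer. It compares direct quadrature of $\frac{1}{2} + \mathbb{E}\left(\exp(i X_k t): X_k > 0\right)$ with the closed form $\frac{1}{2} + \frac{1}{2}\exp\left(-\frac{t^2 \sigma_k^2}{2}\right) + \frac{i}{\sqrt{\pi}} D_F\left(\frac{t \sigma_k}{\sqrt{2}}\right)$, which equals 1 at $t = 0$. The function names, the Simpson-rule integrator and the cutoff at $12\sigma$ are assumptions made for illustration. Dawson's function is evaluated from its integral definition so that only the Python standard library is needed.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3


def dawson(x):
    """Dawson's function F(x) = exp(-x^2) * integral_0^x exp(s^2) ds."""
    if x == 0:
        return 0.0
    # Folding exp(-x^2) into the integrand keeps every term <= 1.
    return simpson(lambda s: math.exp(s * s - x * x), 0.0, x)


def cf_closed_form(t, sigma=1.0):
    """Closed form being checked for the characteristic function of max(0, X)."""
    return (0.5
            + 0.5 * math.exp(-0.5 * (t * sigma) ** 2)
            + 1j / math.sqrt(math.pi) * dawson(t * sigma / math.sqrt(2)))


def cf_numeric(t, sigma=1.0):
    """1/2 + E[exp(i t X); X > 0], integrated directly on [0, 12 * sigma]."""
    def density(x):
        return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

    upper = 12 * sigma  # assumption: the tail beyond 12 sigma is negligible
    re = simpson(lambda x: math.cos(t * x) * density(x), 0.0, upper)
    im = simpson(lambda x: math.sin(t * x) * density(x), 0.0, upper)
    return complex(0.5 + re, im)


def cf_sum(t, sigmas):
    """Characteristic function of Z = sum_k Y_k for independent Y_k."""
    result = 1.0 + 0.0j
    for sigma in sigmas:
        result *= cf_closed_form(t, sigma)
    return result


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 3.0):
        print(t, cf_closed_form(t), cf_numeric(t))
```

If the two columns agree to within the quadrature error, the closed form is consistent with the definition. Where SciPy is available, `scipy.special.dawsn` could replace the hand-rolled `dawson`.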

Here is a histogram for several small values of $n$, assuming equal unit variance:

[Figure not preserved in this copy.]

You can explicitly see how the cumulative distribution function is not continuous at $x=0$, and how the size of the discontinuity jump decreases as $n$ grows.
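The size of the jump at $x=0$ can also be estimated by simulation. The following is a minimal Monte Carlo sketch, assuming unit variances. The function names, the number of draws and the seed are illustrative choices, not taken from the answer. It estimates $\mathbb{P}(Z = 0)$ for small $n$ so that it can be compared with $2^{-n}$.

```python
import random


def sample_z(n, rng, sigma=1.0):
    """One draw of Z = sum_k max(0, X_k), X_k ~ N(0, sigma^2) independent."""
    return sum(max(0.0, rng.gauss(0.0, sigma)) for _ in range(n))


def atom_at_zero(n, draws=100_000, seed=0):
    """Monte Carlo estimate of P(Z = 0), the jump of the CDF at x = 0."""
    rng = random.Random(seed)
    # Z is exactly 0.0 only when every X_k was censored to zero.
    zeros = sum(1 for _ in range(draws) if sample_z(n, rng) == 0.0)
    return zeros / draws


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, atom_at_zero(n), 0.5 ** n)
```

The exact comparison `== 0.0` is safe here because censored terms contribute exactly `0.0`, so the sum is exactly zero precisely when all $n$ draws are non-positive.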

  • 0
    +1. Replaced some indices $i$ by $k$ to avoid confusion with the imaginary unit $i$. (But you might want to check the formula for $\phi_{Y_k}$ which involves $D_F$.) (2012-07-25)