14

I want to estimate the probability $\Pr(X \leq a)$, where $X$ is a continuous random variable and $a$ is given, based only on some moments of $X$ (e.g., the first four moments, without knowing the type of its distribution).

  • 0
    The question as it stands is too vague. For instance, what does 'some' mean when you say some moments of $X$? Is $X$ a discrete or absolutely continuous random variable, or ...? 2012-01-12
  • 0
    @user21436 : I would take the question to mean: what range of values of $\Pr(X\le a)$ is possible given specified values of the first four moments? 2017-07-05

4 Answers

9

The entire sequence of moments of a random variable $m_k = \mathbb{E}(X^k)$ determines the distribution function of $X$ uniquely, provided that $\sum_{k=0}^\infty \frac{m_k}{k!} t^k$ converges for all $t$ in an open neighborhood of $t=0$. See this.

If you have two such sequences, which coincide up to order $r$, but differ afterwards, these sequences correspond to different distributions.

You may, however, ask to approximate the distribution function $F_X(x) = \mathbb{P}(X \leq x)$ given the values of the low-order moments, if some assumptions on the nature of the distribution are made. See method of moments estimation, for example.

Knowledge of the moments determines upper bounds on the tails of the distribution function. See the Chebyshev inequality and, when the moment generating function is available, the Chernoff bound.
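As a rough illustration of such a tail bound, here is a minimal sketch using the one-sided Chebyshev inequality (often called Cantelli's inequality), which only needs the first two moments: for $a < \mu$, $\Pr(X \le a) \le \sigma^2 / (\sigma^2 + (\mu - a)^2)$. The function name `cantelli_upper_bound` and the sample numbers are my own choices for illustration, not something from the question.

```python
def cantelli_upper_bound(m1, m2, a):
    """Upper bound on P(X <= a) from the raw moments m1 = E[X], m2 = E[X^2].

    Uses the one-sided Chebyshev (Cantelli) inequality, which is only
    informative when a lies strictly below the mean.
    """
    var = m2 - m1 ** 2
    if var < 0:
        raise ValueError("m2 - m1**2 must be non-negative for a valid distribution")
    if a >= m1:
        # At or above the mean the inequality says nothing better than 1.
        return 1.0
    gap = m1 - a
    return var / (var + gap ** 2)


# Hypothetical example: mean 0, variance 1, threshold two standard deviations below the mean.
bound = cantelli_upper_bound(m1=0.0, m2=1.0, a=-2.0)
print(bound)
```

With only two moments the bound is crude; using higher moments can only tighten it.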

You may also find the Pearson distribution family, which is determined by the first four moments, useful.

6

Let the first four moments be $E(X^j) = m_j$, $j=1,\ldots,4$. Suppose $g(x)$ is a polynomial of degree $d$ with $g(x) \ge I_{x \le a}$, i.e. $g(x) \ge 1$ for $x \le a$ and $g(x) \ge 0$ for all real $x$. Then for any random variable $X$ such that $E[X^d]$ exists, $P(X \le a) = E[I_{X \le a}] \le E[g(X)]$, and $E[g(X)]$ can be calculated from the first $d$ moments of $X$: if $g(x) = \sum_{j=0}^d c_j x^j$, then $E[g(X)] = c_0 + \sum_{j=1}^d c_j E[X^j]$.

Moreover, suppose $g(x) = I_{x\le a}$ at some points. Then this upper bound is optimal in the sense that it gives the exact value of $P(X \le a)$ for any probability distribution concentrated on those points.

In the case $d=4$, for any $b_1$ and $b_2$ with $b_1 < a < b_2$, there is a unique polynomial $g(x)$ of degree 4 with $g(b_1) = g(a) = 1$ and $g'(b_1) = g(b_2) = g'(b_2) = 0$; this will satisfy $g(x) \ge I_{x \le a}$, and the corresponding estimate is tight for distributions concentrated on $\{b_1, a, b_2\}$. For example, with $b_1 = -1$, $a=0$ and $b_2 = 1$, we get $g(x) = \frac{x^4}{2} + \frac{x^3}{4} - x^2 - \frac{3x}{4} + 1$, leading to the estimate $P(X \le 0) \le \frac{1}{2} E[X^4] + \frac{1}{4} E[X^3] - E[X^2] - \frac{3}{4} E[X] + 1$.
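The construction above can be checked mechanically. The following is a minimal sketch, not part of the original answer: it solves the five interpolation conditions exactly with rational arithmetic and then evaluates the bound for a three-point distribution on $\{-1, 0, 1\}$. All function names (`solve_linear`, `dominating_quartic`, `moment_bound`) and the sample probabilities are hypothetical choices made for the illustration.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over Fractions."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def value_row(x, d=4):
    """Row of the linear system encoding g(x) = sum_j c_j x^j."""
    return [Fraction(x) ** j for j in range(d + 1)]


def deriv_row(x, d=4):
    """Row of the linear system encoding g'(x) = sum_j j c_j x^(j-1)."""
    return [Fraction(0)] + [j * Fraction(x) ** (j - 1) for j in range(1, d + 1)]


def dominating_quartic(b1, a, b2):
    """Coefficients [c0..c4] with g(b1)=g(a)=1 and g'(b1)=g(b2)=g'(b2)=0."""
    A = [value_row(b1), value_row(a), deriv_row(b1), value_row(b2), deriv_row(b2)]
    return solve_linear(A, [1, 1, 0, 0, 0])


def moment_bound(coeffs, moments):
    """E[g(X)] = c0 + sum_j c_j E[X^j], with moments = [E[X], ..., E[X^4]]."""
    return coeffs[0] + sum(c * Fraction(m) for c, m in zip(coeffs[1:], moments))


coeffs = dominating_quartic(-1, 0, 1)

# Hypothetical distribution on {-1, 0, 1}; here P(X <= 0) = 1/5 + 3/10 = 1/2.
probs = {-1: Fraction(1, 5), 0: Fraction(3, 10), 1: Fraction(1, 2)}
moments = [sum(p * x ** j for x, p in probs.items()) for j in range(1, 5)]
bound = moment_bound(coeffs, moments)
print(coeffs, bound)
```

For this distribution the bound equals $P(X \le 0)$ exactly, which is the tightness claim in the answer. For a general distribution one would still have to search over $b_1$ and $b_2$ to get the best bound of this form.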

4

This will be an incomplete answer based on things I thought about several years ago, and I can't remember all the details. The first $n$ moments determine the first $n$ cumulants and vice versa. Given the cumulants up to the $(2n-1)$th one, there is a constraint on the set of possible values of the $2n$th cumulant, namely that it is greater than or equal to a particular number. If it's less than that number, then there is no probability distribution with that sequence of the first $2n$ cumulants; otherwise there is one. And right at the boundary, it's a discrete distribution that can take only finitely many possible values. And I think every distribution with finite support is realized in that way. Peter McCullagh's book Tensor Methods in Statistics has some material on this. If I wanted to work out from scratch the answer to the original question above, that's where I'd start thinking about it. What you'd probably want as an answer is inequalities that $\Pr(X\le a)$ would have to satisfy.
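To make the moment-to-cumulant correspondence concrete, here is a small sketch using the standard recursion $\kappa_n = m_n - \sum_{k=1}^{n-1} \binom{n-1}{k-1} \kappa_k m_{n-k}$. The function name `moments_to_cumulants` is just a label chosen for this illustration, and the two test inputs are the well-known raw moments of the standard normal and the unit-rate exponential distributions.

```python
from math import comb


def moments_to_cumulants(moments):
    """Convert raw moments [E[X], E[X^2], ...] into cumulants [kappa_1, kappa_2, ...]."""
    m = [1] + list(moments)  # m[0] = E[X^0] = 1
    kappa = [0] * len(m)     # kappa[0] is unused padding
    for n in range(1, len(m)):
        correction = sum(
            comb(n - 1, k - 1) * kappa[k] * m[n - k] for k in range(1, n)
        )
        kappa[n] = m[n] - correction
    return kappa[1:]


# Standard normal: raw moments 0, 1, 0, 3; only the second cumulant is nonzero.
normal_cumulants = moments_to_cumulants([0, 1, 0, 3])

# Exponential with rate 1: raw moments k!, cumulants (k-1)!.
exponential_cumulants = moments_to_cumulants([1, 2, 6, 24])

print(normal_cumulants, exponential_cumulants)
```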

If the distribution is supported on a not-necessarily proper subset of $[0,1]$, then (if I recall correctly) the way in which the values of the cumulative distribution function depend on the sequence of all of the moments is explicitly worked out somewhere in Feller's famous book. Maybe I'll find it.....

  • 0
    This is very helpful. 2017-05-10
  • 0
    @AlexanderGiles : I'm glad it helps. 2017-05-10
2

As I had pointed out in my comments, it's hard to answer this question in generality. So, I'll just point you to a resource online.

That said, the magic words are generating functions: probability generating functions and moment generating functions.

The probability generating function $\Phi_X$ exists only for non-negative integer-valued random variables. The moment generating function $M_X$ is related to it [whenever and wherever both exist] by the following: $$M_X(t)=\Phi_X(e^t)$$
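A quick numerical sanity check of this identity, as a sketch only: the probability mass function `pmf` below is a made-up example on a few non-negative integers, and `pgf` and `mgf` are illustrative helper names.

```python
import math

# Hypothetical distribution on non-negative integers (probabilities sum to 1).
pmf = {0: 0.2, 1: 0.5, 3: 0.3}


def pgf(s):
    """Probability generating function: E[s^X]."""
    return sum(p * s ** k for k, p in pmf.items())


def mgf(t):
    """Moment generating function: E[exp(t X)]."""
    return sum(p * math.exp(t * k) for k, p in pmf.items())


# Compare M_X(t) with Phi_X(e^t) at a few arbitrary points.
checks = [math.isclose(mgf(t), pgf(math.exp(t))) for t in (-1.0, 0.0, 0.5)]
print(checks)
```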

Other inputs are sometimes required and sometimes not. So, please go through the material I have pointed you to.

EDITED TO ADD: I'll get a little more specific now. If the random variable at hand has finite range and you have *all* the moments, then the distribution of $X$ can be recovered {Theorem 10.2, pp 5, 369 in the typeset}. If you have just the first two moments, you get only the mean and variance.

I'd love to hear from you in case you have specific queries. [Just add a comment below, I'll be notified!]

  • 1
    *the magic words are generating functions*... No. 2012-01-12
  • 0
    @DidierPiau If you have the moment generating function, you have all the moments and hence you can recover the density, no? 2012-01-12
  • 2
    Take a log-normal variable. Moments of all orders exist and are finite, but they do not determine the distribution uniquely, because the moment generating function is not analytic at the origin. 2012-01-12
  • 0
    @Sasha Yeah, you're right. But in the EDIT section, I have written that $X$ must have a finite range. And, as I said, this question is in general a hard one to answer. So, evidently, the case of discrete random variables is done in some sense. 2012-01-12