I have a black box called $F(t)$ ($~$($P~(X\le t)~$, $X$ is random variable) with me where I don't have any information on the exact expression of $F(t)$. But if I supply a $t\ge 0$ I will get a value of $F(t)$ from the black box as output. I want to calculate $E[X]$. I want to fit $F(t)$ values against a polynomial of the form $\sum a_{i}t^{i}$ and then integrate over a suitable range to get the expectation. Is this a good approach to calculate $E[X]$? Also, how should I choose $t$ values. Suppose I want to choose hundred $t$ values. Should I choose them as equally-spaced.
Calculate $E[X]$ using polynomial approximation of CDF
3
$\begingroup$
probability
polynomials
approximation
-
0It depends on the behavior of F(t). Do keep in mind that polynomials don't have asymptotes... – 2011-11-18
-
0Structure of CDF is usually more close to [sigmoid](http://en.wikipedia.org/wiki/Sigmoid_function) than to polynomials so I wouldn't say it's the best idea. Anyway, if you're working with black-box, why not to approximate $\mathsf E[X]$ directly? – 2011-11-18
-
0Presumably $F(0) = 0$ because if not, getting a good estimate of $E[X]$ might be difficult. Also, you say "Suppose I want to choose hundred $t$ values. Should I choose them as equally-spaced?" You might want to use an _adaptive_ strategy. No point in deciding on the $100$ values ahead of time and asking for, say, $F(0)$, $F(1), \ldots, F(99)$ and getting response $0$ in all cases because $X$ takes on values only in $[200, 800]$, e.g. a GRE score. So, use the first few calls to the black box to learn a little about $F(t)$, and then decide on the strategy. – 2011-11-18
1 Answers
2
It seems to me that this is just a standard quadrature problem and there's no need to invoke any specific approximations. You have
$$\int_{-\infty}^\infty xp(x)\mathrm dx=\lim_{L\to\infty}\left(\left[xF(x)\right]_{-L}^L-\int_{-L}^L F(x)\mathrm dx\right)\;,$$
and you want to approximate that last integral, given the ability to sample values of $F$. You can apply any quadrature methods, e.g. Gaussian quadrature, that seem suitable for the problem; this will implicity approximate $F$ by polynomials, but in a particularly efficient way. You may be able to use any knowledge about the structure of $F$ that you have in choosing the quadrature method or perhaps transforming the integral before evaluating it.
-
0@Dilip: What does that mean? $\infty-\infty$ isn't defined, whereas the expression in parentheses and its limit for $L\to\infty$ are well-defined. If you mean that the individual terms tend to infinity, that's true, but that's not a problem. It might be a problem numerically if you choose $L$ too big and then get cancellation, but since the OP already mentioned integrating over a "suitable range" in the question, I assume that $p$ has approximately bounded support and one can choose $L$ so as to avoid such cancellation problems. – 2011-11-18
-
0@Dilip: Sorry, I still don't understand what you're saying. I agree with everything in your first sentence, I just don't understand why you think this is a problem or why you described this as "tending to $\infty-\infty$". As far as I can tell, your suggestion is equivalent to mine if you replace $-L$ by $0$. – 2011-11-18
-
0I regret not being able to make myself clear. Since you know a lot more about quadrature than I do and feel that there is no issue or problem, I am deleting my comments. – 2011-11-18
-
0@Dilip: I'm not sure I do :-). I was genuinely trying to understand what you were saying. Your suggestion using $1-F(x)$ seemed elegant, and I'd like to incorporate it into the answer; my only problem was that I don't understand why you think there are principal issues with things tending to infinity in my formulation but not in that formulation. – 2011-11-18
-
0For a positive random variable $X$, $$E[X]=\int_0^\infty [1-F(x)]\mathrm dx$$ is a standard result that has been discussed many times on math.SE as well (see, e.g. [here](http://math.stackexchange.com/questions/64186/intuition-behind-using-complementary-cdf-to-compute-expectation-for-nonnegative/64219#64219)) and so you can incorporate it into your answer without necessarily referencing me. In this formulation, writing this integral for $E[X]$ as a difference of integrals and then computing each separately numerically (as a neophyte (I don't mean you!) might try to do) is a cause for concern. – 2011-11-18
-
0joriki: Since $LF(-L)\to0$ and $L(1-F(L))\to0$ (see [here](http://math.stackexchange.com/q/82820/6179)), one may replace $[xF(x)]_{-L}^L$ by $L$ in the RHS of the displayed equation. This might be what @Dilip had in mind. – 2011-11-27