2

I need to evaluate $$\sum_{i=k}^n {n \choose i} \gamma^i (1-\gamma)^{(n-i)}. $$

But since my $n$ often reaches $10^7$ and $k$ is usually around $0.7n$, it takes significant time to evaluate the sum about $10^3$ times. Perhaps it can be simplified to avoid the sum? The binomial coefficient itself is probably not the main problem, as I used the built-in function of Wolfram Mathematica.

I care only about a few leading digits (and the exponent!), so double precision is more than I need.

EDIT

I made a mistake when working out the values $\gamma$ can take. The real range is from $0.5$ to $0.75$, which makes much more sense. Please ignore the wrong statement in the comments and the related statements in some answers.

At the moment brute forcing it on 22 cores took me 1 hour. I would be happy if I could speed it up in case I need to reevaluate.

  • 0
    Do you want an approximation (like double precision)? And what is known about $\gamma$? (Also, the binomial should probably read $n$ over $i$.) 2017-02-16
  • 0
    Double is more than I need, $\gamma$ between $3.5$ and $+\infty$. Typo is fixed. 2017-02-16
  • 0
    Sorry, I made a mistake about $\gamma$. See the text. 2017-02-16
  • 0
    See updated answer. 2017-02-20

2 Answers

1

Your sum is a partial sum of a hypergeometric function, which is known to have no closed form. If you need the exact value, nothing can help you. But if you only need an approximation, you can look for the asymptotics of this sum.

EDIT. As far as I can see, you compute each $\binom{n}{i}$ independently. That is not a good idea. Note that $\binom{n}{i} = \binom{n}{i - 1}\cdot \frac{n - i + 1}{i}$. So you can compute the next summand $a_i = \binom{n}{i} \gamma^i (1 - \gamma)^{n - i}$ from the previous one: $a_i = \binom{n}{i - 1} \gamma^{i - 1} (1 - \gamma)^{n - i + 1}\cdot \frac{(n - i + 1)\gamma}{i(1 - \gamma)} = a_{i - 1} \cdot \frac{(n - i + 1)\gamma}{i(1 - \gamma)}$. This should significantly speed up your computations.
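The recurrence can be sketched in Python as follows (an illustration only, not the asker's Mathematica code). The function names `log_term`, `log_upper_tail` and `to_mantissa_exponent` are made up for this example. Two assumptions go beyond the answer: the first term is evaluated through `math.lgamma` and the result is kept as a logarithm, because for $n \sim 10^7$ the terms can underflow double precision; and $k$ is assumed to lie at or above $n\gamma$, so the terms only decrease and the loop can stop early.

```python
import math

def log_term(n, i, gamma):
    """Natural log of C(n, i) * gamma^i * (1 - gamma)^(n - i), for 0 < gamma < 1."""
    return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(gamma) + (n - i) * math.log1p(-gamma))

def log_upper_tail(n, k, gamma, rel_tol=1e-17):
    """Natural log of sum_{i=k}^{n} a_i.

    Assumption (not from the answer): k >= n * gamma, so the terms decrease
    from a_k onwards and the scaled terms below never exceed 1.
    """
    odds = gamma / (1.0 - gamma)
    ratio = 1.0   # a_i / a_k, updated with the recurrence from the answer
    total = 1.0   # running sum of a_i / a_k
    for i in range(k + 1, n + 1):
        ratio *= (n - i + 1) * odds / i
        total += ratio
        if ratio < rel_tol * total:   # remaining terms no longer matter
            break
    return log_term(n, k, gamma) + math.log(total)

def to_mantissa_exponent(log_value):
    """Split exp(log_value) into (mantissa, base-10 exponent)."""
    log10_value = log_value / math.log(10.0)
    exponent = math.floor(log10_value)
    return 10.0 ** (log10_value - exponent), exponent
```

For example, `to_mantissa_exponent(log_upper_tail(10**7, 7 * 10**6, 0.5))` gives the leading digits and the decimal exponent without ever forming the (underflowing) value itself.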

(If it is still relevant, I'll find some time to investigate the asymptotics of your function.)

  • 0
    Thank you for the updated answer! I will need to write some code to parallelize this nicely. I will let you know if this helped. 2017-02-21
  • 0
    I guess there is no need for parallelization: you should be able to compute one such sum within 0.1 second. There are some technical issues, though. E.g. it would be a good idea to start from the largest summand and go in both directions, because the starting summand at either end can underflow and yield a zero sum. However, this is not a perfect solution in terms of precision. 2017-02-21
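The suggestion in the last comment, starting at the largest summand and walking in both directions, might look like the sketch below. It is an illustration under stated assumptions, not a definitive implementation. The names `log_term` and `log_upper_tail_from_peak` are hypothetical, the starting term is obtained through `math.lgamma`, and every term is scaled by the starting term so that nothing can overflow. Returning the logarithm of the sum is a choice made here so that the exponent survives even when the sum itself would underflow.

```python
import math

def log_term(n, i, gamma):
    """Natural log of C(n, i) * gamma^i * (1 - gamma)^(n - i), for 0 < gamma < 1."""
    return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(gamma) + (n - i) * math.log1p(-gamma))

def log_upper_tail_from_peak(n, k, gamma, rel_tol=1e-17):
    """Natural log of sum_{i=k}^{n} a_i, summed outwards from the largest term."""
    odds = gamma / (1.0 - gamma)
    mode = min(n, int((n + 1) * gamma))   # index of the largest term overall
    start = max(k, mode)                  # largest term inside [k, n]
    total = 1.0                           # a_start / a_start

    # Walk upwards: a_i = a_{i-1} * (n - i + 1) * gamma / (i * (1 - gamma)).
    ratio = 1.0
    for i in range(start + 1, n + 1):
        ratio *= (n - i + 1) * odds / i
        total += ratio
        if ratio < rel_tol * total:
            break

    # Walk downwards to k using the inverse of the same recurrence.
    ratio = 1.0
    for i in range(start, k, -1):
        ratio *= i / ((n - i + 1) * odds)
        total += ratio
        if ratio < rel_tol * total:
            break

    return log_term(n, start, gamma) + math.log(total)
```

Because every scaled term is at most 1, terms that underflow to zero are ones that would not have changed the sum anyway. The precision caveat from the comment still applies: the three `lgamma` values are large and nearly cancel, so some absolute accuracy in the logarithm is lost for very large $n$.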
1

Update: Consider the $i$'th term. For $0<\gamma<1$, all terms are positive. The term will be maximal when $ \frac{n-i}{i} \sim \frac{1-\gamma}{\gamma} $ or $$i_{\rm max}\sim {n\gamma}$$
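
To spell out this step (writing $a_i = \binom{n}{i}\gamma^i(1-\gamma)^{n-i}$ for the $i$'th term, a label introduced here for convenience), compare consecutive terms: $$\frac{a_{i+1}}{a_i} = \frac{n-i}{i+1}\cdot\frac{\gamma}{1-\gamma} \ge 1 \iff i \le (n+1)\gamma - 1 .$$ The terms therefore grow up to $i = \lfloor (n+1)\gamma \rfloor$ and shrink afterwards, which for large $n$ gives the estimate $i_{\rm max} \sim n\gamma$.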

When $k \gg i_{\rm max}$, the main contributions come from the terms close to $k$, while for $k \ll i_{\rm max}$ you get 1 minus the sum from $0$ to $k-1$, and the main contributions to that sum come from its last terms, those just below $k$.

When $k$ is close to $i_{\rm max}$, you are probably well served by approximating with a Gaussian (normal) distribution. But this depends on the accuracy you want: what absolute or relative error can you accept?
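
A minimal sketch of the Gaussian approximation mentioned above, in Python. The function name `normal_upper_tail` and the continuity correction of $0.5$ are choices made for this illustration, not part of the answer. It uses the binomial mean $n\gamma$ and variance $n\gamma(1-\gamma)$ and should only be trusted when $k$ is within a few standard deviations of $n\gamma$; far out in the tail the relative error grows, so it is not suitable for recovering the exponent there.

```python
import math

def normal_upper_tail(n, k, gamma):
    """Gaussian approximation to sum_{i=k}^{n} C(n,i) gamma^i (1-gamma)^(n-i)."""
    mean = n * gamma
    std_dev = math.sqrt(n * gamma * (1.0 - gamma))
    # Continuity correction: P(X >= k) is approximated by P(Z >= k - 0.5).
    z = (k - 0.5 - mean) / std_dev
    # Upper tail of the standard normal via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

Comparing this value with an exact summation for a few moderate $n$ is a quick way to judge whether the error is acceptable for the range of $k$ and $\gamma$ in question.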

  • 0
    Thank you for a thoughtful reply. I am revisiting my reasoning about $\gamma$. Perhaps I have made a mistake somewhere. 2017-02-16
  • 0
    Ok, another thought: I think my idea should also work if $i_{\rm max}$ is much smaller than $k$, i.e. it suffices to calculate the values in the vicinity of $k$. The difficult part is when $k$ and $i_{\rm max}$ are close. Numerical errors will probably dominate. 2017-02-16
  • 0
    Please see the updated question. 2017-02-16