
Stirling approximation to a factorial is $ n! \sim \sqrt{2 \pi n} \left(\frac{n}{e}\right)^n. $

I wonder what benefit can be got from it?

From a computational perspective (I admit I don't know much about how each arithmetic operation is implemented or which is cheaper than which), a factorial $n!$ takes $n-1$ multiplications. In Stirling's approximation, one also has to compute one division and $n$ multiplications for $\left(\frac{n}{e}\right)^n$, no? Plus two multiplications and one square root for $\sqrt{2 \pi n}$. How does the approximation reduce computation?

There may be considerations from other perspectives. I also would like to know. Please point out your perspective if you can.

Added: Regarding the use of Stirling's approximation to simplify analysis (as in the reply by user1729, for example), my concern is that it is an approximation after all. Even if the approximating expression converges, don't we need to show that the original expression also converges, and to the same limit as its approximation?

Thanks and regards!

  • 0
    This triggered a subsequent question and several exhaustive answers on [SciComp](http://scicomp.stackexchange.com/questions/679/which-is-computed-faster-ab-log-a-c-or-sqrtbc) (come visit us!). For the purposes of this discussion, a logarithm in floating-point precision is on the order of 10 times more expensive than the equivalent multiply (both are vectorizable, though the logarithm will take more effort).2012-01-13

10 Answers

26

The purpose (as for all asymptotic expressions) is to replace a "complicated" function (in this case the factorial) with some expression which is "simpler". You might object that $\sqrt{2\pi n} (n/e)^n$ hardly looks simpler than $n!$. But if I ask you whether $e^n$ or $n!$ grows faster as $n \to \infty$, you might appreciate Stirling's result. Or try to answer the question of how many digits $n!$ has when $n$ is large. Or ...
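For instance, the digit-count question can be answered directly from the formula. Here is a small Python sketch (my own illustration, not part of the answer), using $\log_{10} n! \approx \big(n\ln(n/e)+\tfrac12\ln(2\pi n)\big)/\ln 10$:

```python
import math

def stirling_digit_count(n):
    """Approximate decimal digit count of n!: floor(log10(n!)) + 1,
    with log10(n!) taken from Stirling's formula."""
    log10_fact = (n * math.log(n / math.e) + 0.5 * math.log(2 * math.pi * n)) / math.log(10)
    return math.floor(log10_fact) + 1

def exact_digit_count(n):
    """Exact digit count, by computing n! outright."""
    return len(str(math.factorial(n)))

for n in (10, 100, 1000):
    print(n, stirling_digit_count(n), exact_digit_count(n))
```

Even without the correction terms, the leading formula already pins down the digit count for moderate $n$, while the exact computation has to build an enormous integer first.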

23

One result from Computer Science:

The minimum number of comparisons needed to sort $n$ items using a comparison sort is at least

$\log_2(n!)$

(this can be seen by taking every possible ordering of the $n$ items, and forming a binary decision tree of comparisons of minimum height).

By Stirling's approximation, we have

$\log_2(n!)$ $> \log_2\left(\sqrt{2\pi n}\left(\frac{n}{e}\right)^n\right)$ $= n \log_2\left(\frac{n}{e}\right) + \log_2\left(\sqrt{2\pi n}\right)$

Or, as you will see it written in every Computer Science book ever,

The number of comparisons necessary for any comparison-sort is lower-bounded by $\Omega(n \log_2n)$


Also, a sidenote: the OP stated that both $n!$ and $n^n$ require $O(n)$ multiplications to compute. This is false: $n^n$ can be computed in $O(\log_2 n)$ multiplications by repeated squaring.
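For reference, a minimal sketch (in Python, names are my own) of exponentiation by squaring, which computes $b^e$ in $O(\log_2 e)$ multiplications:

```python
def pow_by_squaring(base, exp):
    """Compute base**exp using O(log2(exp)) multiplications."""
    result = 1
    while exp > 0:
        if exp & 1:          # lowest bit of exp is set: fold the current power in
            result *= base
        base *= base         # square for the next binary digit of exp
        exp >>= 1
    return result

print(pow_by_squaring(5, 5))  # 3125
```

So $n^n$ costs roughly $2\log_2 n$ multiplications, versus $n-1$ for $n!$ computed naively.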

17

Abraham de Moivre was the person who first introduced Stirling's formula. His friend James Stirling is the one who found that the constant is $\sqrt{2\pi}$; de Moivre only knew it numerically.

De Moivre used it to approximate the probability that the number of heads you get when you toss a coin 1800 times is $x$, for $x$ not too many standard deviations away from 900. He wrote about this in his book titled The Doctrine of Chances (google the title!). The title of the book is in effect 18th-century English for "the theory of probability". The phrase appears again in Thomas Bayes' famous posthumous paper "An essay towards solving a problem in the doctrine of chances" (google that title too). De Moivre derived the bell-shaped curve $ x\mapsto \text{constant} \cdot e^{-x^2/2} $ from this formula.

9

Stirling's approximation is used extensively in Physics, in Boltzmann's description of entropy:

$ S = k \log_e W$

where $W$ is the number of possible permutations of a group of materials, given by

$W = \frac{N!}{\prod_i N_i!}$

Taking the approximation makes the maths much easier.
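Concretely (a standard derivation, using only the weak form $\ln n! \approx n\ln n - n$): with $\sum_i N_i = N$ and $p_i = N_i/N$,

$$\ln W = \ln N! - \sum_i \ln N_i! \approx (N\ln N - N) - \sum_i (N_i \ln N_i - N_i) = -N\sum_i p_i \ln p_i,$$

so $S \approx -kN\sum_i p_i \ln p_i$, the familiar Gibbs/Shannon form of the entropy.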

  • 5
    As far as I am aware, thermodynamics only needs the weaker estimate $\log(n!) \approx n\log n - n$, or even $\log(n!) \approx n\log n$. The $\sqrt{2\pi}$ part of Stirling's formula is irrelevant, and the estimate $\log(n!) \approx n\log n - n$ can be derived in a much simpler way than the methods used to prove Stirling's formula down to the $\sqrt{2\pi}$.2013-11-10
8

You can use it in Polya's recurrence theorem (I came across it in the first chapter of "Topics in Geometric Group Theory" by de la Harpe). This states,

A simple random walk on $\mathbb{Z}^d$ is recurrent if $d=1$ or $d=2$, and transient if $d\geq 3$.

That is, a walker has probability $1$ of returning to his starting point infinitely often if and only if $d=1$ or $d=2$.

I will give the proof that $d=1$ is recurrent. Look up de la Harpe for the other cases (I'm sure there is a better reference, but you can find these pages of de la Harpe on google books, so it is an available reference...). As you may be able to guess, what comes below is basically just lifted from de la Harpe.

So, there are $2^{2n}$ walks of length $2n$ on $\mathbb{Z}$, and the number of those ending at the origin is $\binom{2n}{n}$ (you need precisely $n$ steps to the right and $n$ steps to the left, and you can go either right or left).

This means that the probability, $u_{2n}$, of being at the origin after $2n$ steps is $u_{2n}=\frac{1}{2^{2n}}\binom{2n}{n}$. Now, we use Stirling's formula in $\binom{2n}{n}$ to get $u_{2n}\sim \frac{1}{2^{2n}}\frac{(2n)^{2n}e^{-2n}\sqrt{2\pi\cdot 2n}}{n^{2n}e^{-2n}\cdot 2\pi n}=\frac{1}{\sqrt{\pi n}}$ (notice how everything just...cancels...).

Noting that $u_{2k+1}=0$ we have that $\sum^{\infty}_{k=1}u_k=\sum^{\infty}_{n=1}u_{2n}=\frac{1}{\sqrt{\pi}}\sum^{\infty}_{n=1}\frac{1}{\sqrt{n}}=\infty$ because $\sum^{\infty}_{n=1}\frac{1}{\sqrt{n}}=\infty$, as required.
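A quick numerical sanity check of the asymptotic $u_{2n}\sim 1/\sqrt{\pi n}$ (a Python sketch; the function names are my own, not from the answer):

```python
import math

def return_prob(n):
    """Exact probability u_{2n} = C(2n, n) / 2^(2n) that a simple 1-D
    random walk is back at the origin after 2n steps."""
    return math.comb(2 * n, n) / 4 ** n

def stirling_estimate(n):
    """Stirling asymptotic u_{2n} ~ 1 / sqrt(pi * n)."""
    return 1 / math.sqrt(math.pi * n)

for n in (10, 100, 1000):
    print(n, return_prob(n), stirling_estimate(n))
```

The ratio of the two tends to $1$ (the relative error shrinks like $1/(8n)$), which is exactly what the divergence argument for $\sum u_{2n}$ needs.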

  • 1
    Typographical error: you say there are $2^{2n}$ walks of length $n$ when you meant walks of length $2n$.2013-11-10
8

I'd like to add several points, which do not seem to have been covered by the existing answers.

  • Stirling's formula reduces the question of computing a "special" function (factorial) to an explicit expression involving only "elementary" functions. This definition of "elementary" may sound pretty arbitrary to you. However, all of the functions occurring in Stirling's formula happen to be implemented in hardware, with guaranteed precision and highly optimized implementations that you are not likely to beat. Of course, no one is going to compute $(n/e)^n$ by $n$ multiplications: both $\ln$ and $\exp$ are available as "elementary" functions.

You can argue that direct multiplication may happen to work faster for very small values of $n$. However, the programmer of a general purpose mathematical function would normally prefer a fixed time implementation, if it is at all possible. This has the benefit of being able to estimate run times of computations that rely on computing $n!$. Of course, you can always imagine a situation when $n$ is small "most of the time". A good programmer is then likely to come up with a solution involving look-up table of pre-calculated values for small $n$. Once again, one tends to prefer fixed time implementations.

  • You also seem to be concerned about the accuracy. First, more terms are likely to be needed to achieve the required accuracy at moderately small $n$. For example, a common floating point data type has about 15 significant digits; one needs about 5-7 terms in Stirling's expansion to get the necessary accuracy for $n\ge12$. For smaller values of $n$ a look-up table is going to be the fastest solution. So, one would design with a compromise in mind, choosing between allowed storage and worst-case speed. Second, the remainder term, omitted in your question, can actually be used to establish an upper bound on the approximation error; see e.g. the book "Special Functions" by Temme (1996). Hence, Stirling's approximation can be (and was) proven to be a sound solution for the problem of computing values of the factorial function.

  • Last, but not least. Stirling's formula happens to work just as well in the case when $n$ is not integer, i.e. for computing the gamma function (usually defined with a trivial shift of the argument). It also works for complex arguments (although nowadays this property is not used very often due to the availability of better approximations). It is, therefore, relevant in a much wider context than just as an approximation for values of the factorial function.
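To make the accuracy point above concrete, here is a quick Python sketch (names and the choice of five correction terms are mine, not from the answer) comparing a truncated Stirling series for $\ln n!$ against the library routine `math.lgamma`:

```python
import math

def stirling_ln_factorial(n, terms=5):
    """ln(n!) from Stirling's series:
    ln n! ~ n ln n - n + (1/2) ln(2 pi n) + sum_k B_{2k} / (2k(2k-1) n^(2k-1)).
    The first few correction coefficients B_{2k}/(2k(2k-1)) are:
    1/12, -1/360, 1/1260, -1/1680, 1/1188."""
    coeffs = [1/12, -1/360, 1/1260, -1/1680, 1/1188]
    s = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
    for k, c in enumerate(coeffs[:terms], start=1):
        s += c / n ** (2 * k - 1)
    return s

n = 12
exact = math.lgamma(n + 1)          # ln(12!) to machine precision
approx = stirling_ln_factorial(n)
print(exact, approx, abs(exact - approx))
```

Already at $n=12$ the truncated series agrees with `lgamma` to essentially full double precision, which matches the "5-7 terms for $n\ge 12$" rule of thumb in the answer.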

6

As others have noted, Stirling's formula replaces a factorial with a combination of exponents and multiplications. This has the benefit of being easier to work with analytically. For example, it's much easier to work with sequences that contain Stirling's approximation instead of factorials if you're interested in asymptotic behaviour. Taking derivatives of Stirling's formula is fairly easy; factorials, not so much. Also notice that if you're considering asymptotics, then expressions like $\lim_{n \rightarrow \infty} \frac{a_n}{b_n}$ where $a_n, b_n$ contain some crazy factorial expressions simplify really nicely.

5

I don't have a definite answer to your question, so I offer one example instead. Consider the following problem:

What is the probability of obtaining between 45 and 55 heads in 100 tosses of a fair coin?

Clearly, the exact answer is $P=\frac{1}{2^{100}}\sum_{k=45}^{55}\binom{100}{k}$ following the binomial distribution, for example. But how large or small is this number? Could we answer if we had only a simple pocket calculator?

First, we note that

$\binom{100}{49}=\binom{100}{51}=\binom{100}{50}\frac{50}{51}$

$\binom{100}{48}=\binom{100}{52}=\binom{100}{50}\frac{50\cdot49}{51\cdot52}$

$\cdots$

$\binom{100}{45}=\binom{100}{55}=\binom{100}{50}\frac{50\cdots46}{51\cdots55}$

hence the computation of $P$ is reduced to the computation of $\dfrac{\binom{100}{50}}{2^{100}}$ (actually, $P=\dfrac{\binom{100}{50}}{2^{100}}\cdot9.15635\ldots$ after some multiplications, divisions and additions with the pocket calculator, which took me about 1 minute).

Stirling's formula is used for approximating $\dfrac{\binom{100}{50}}{2^{100}}$.

$\dfrac{\binom{100}{50}}{2^{100}}=\frac{100!}{2^{100}\cdot\left( 50!\right) ^{2}}\approx\frac{\sqrt{200\pi}\left( \frac{100}{\mathrm{e}}\right) ^{100}}{2^{100}\cdot100\pi\left( \frac{50}{\mathrm{e}}\right) ^{100}}=\frac{1}{\sqrt{50\pi}}\approx0.079788$, hence $P\approx\frac{9.15635\ldots}{\sqrt{50\pi}}\approx 0.7306$.

This is a very good approximation, indeed. The value computed with Mathematica is $0.7287\ldots$.
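As a quick sanity check (a Python sketch, not part of the original answer), one can compute the exact sum and compare it with the Stirling-based estimate, reusing the factor $9.15635\ldots$ obtained above:

```python
import math

# Exact probability of between 45 and 55 heads in 100 fair-coin tosses.
exact = sum(math.comb(100, k) for k in range(45, 56)) / 2 ** 100

# Stirling-based estimate: P ~ 9.15635 / sqrt(50*pi), as derived above.
approx = 9.15635 / math.sqrt(50 * math.pi)

print(exact, approx)
```

The two agree to about two decimal places, consistent with the Mathematica value quoted in the answer.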

4

Just mentioning another important application...

Stirling's Formula is an integral part of proving the Prime Number Theorem, specifically used in counting zeros in the critical strip. The Riemann zeta function is modified by multiplying it by a few functions, one of which is the gamma function (specifically, $\Gamma(s/2+1)$); this effectively gets rid of the trivial zeros at the negative even integers. A contour integral of the logarithmic derivative of this modified function then gives the number of zeros inside the contour. Noting that $\Gamma(n+1) = n!$, the piece of the logarithmic derivative coming from the gamma function is approximated by... Stirling.
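Concretely, Stirling's formula for $\log\Gamma$ supplies the main term of the Riemann-von Mangoldt formula: if $N(T)$ denotes the number of zeros of $\zeta(s)$ in the critical strip with $0<\operatorname{Im} s\le T$, then

$$N(T)=\frac{T}{2\pi}\log\frac{T}{2\pi}-\frac{T}{2\pi}+\frac{7}{8}+O(\log T).$$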

  • 0
    You don't need to count zeros of $\zeta(s)$ in the critical strip to prove the Prime Number Theorem. Knowing that there are no zeros on the line ${\rm Re}(s) = 1$ is enough information about the zeros.2014-02-14
1

The Stirling-type inequality $(n!)^{\frac{1}{n}} \ge \frac{n+1}{e}$, equivalently $\frac{1}{(n!)^{1/n}} \le \frac{e}{n+1}$, can be used to derive Carleman's inequality from the AM-GM inequality.
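For completeness, here is a sketch of one standard route (not spelled out in the answer) to Carleman's inequality $\sum_{n\ge1}(a_1\cdots a_n)^{1/n}\le e\sum_{n\ge1}a_n$ for nonnegative $a_n$. By AM-GM applied to the numbers $k\,a_k$,

$$(a_1\cdots a_n)^{1/n}=\frac{\big(1a_1\cdot 2a_2\cdots n\,a_n\big)^{1/n}}{(n!)^{1/n}}\le \frac{e}{n+1}\cdot\frac{1}{n}\sum_{k=1}^{n}k\,a_k.$$

Summing over $n$, swapping the order of summation, and using $\sum_{n\ge k}\frac{1}{n(n+1)}=\frac{1}{k}$ gives the bound $e\sum_{k}a_k$.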