17

In the analysis of an algorithm the statement $\sum_{k = 1}^n\log(k) \in \Theta(n\log(n))$ has come up, and I am having trouble justifying it. I wrote $\sum_{k = 1}^n\log(k) = \log(n!)$ and $n\log(n) = \log(n^n)$, and it is easy to see that $\log(n!) \in O(\log(n^n))$, but the "$\Theta$" part is still perplexing. I tried calculating the limit $\lim_{n \to\infty} \frac{\log(n!)}{\log(n^n)}$, but the only way I could think of doing this was to substitute the Stirling approximation in the numerator. That works, but it would only be justified if $\lim_{n \to\infty}\frac{\log(n!)}{\log(\sigma(n))} = 1$ (with $\sigma(n) = \sqrt{2\pi} \cdot n^{\frac{1}{2} + n} \cdot e^{-n}$), and is that the case? It is certainly wrong that $a_n \sim b_n \ \Rightarrow \ f(a_n) \in \Theta(f(b_n))$ for a general (continuous) $f : \mathbb{R} \rightarrow \mathbb{R}$: for example, $a_n = n^2+n$, $b_n = n^2$ and $f(x) = e^x$ is a counterexample, since $\frac{n^2 + n}{n^2} \rightarrow 1$ but $\frac{e^{n^2+n}}{e^{n^2}} \rightarrow +\infty$. So here are my questions:

  1. Is the statement true under certain hypotheses on $f$ (for example, $f \in O(x)$ seems plausible), and in particular in the case $f(x) = \log(x)$?
  2. If not, how can I complete the initial proof, using Stirling or otherwise?

I do not know (and don't want to use) anything advanced about the Stirling approximation, such as error estimates; I know that $n! = \sigma(n)\left(1+ O\left(\frac{1}{n}\right)\right)$, and not much more.
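
For what it's worth, here is a quick numerical sanity check (not a proof, just to convince myself the claim is plausible) that the ratio $\frac{\log(n!)}{n\log(n)}$ does seem to creep towards $1$, although quite slowly:

```python
import math

# Sanity check: compare log(n!) = sum_{k=1}^n log(k) with n*log(n).
# This only illustrates the claimed limit; it proves nothing.
for n in [10, 100, 1_000, 10_000, 100_000]:
    log_factorial = math.lgamma(n + 1)          # log(n!) without overflow
    ratio = log_factorial / (n * math.log(n))
    print(f"n = {n:>6}:  log(n!) / (n log n) = {ratio:.6f}")
```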


Any help is appreciated. Thank you.

  • 1
    @user135520 see here https://en.wikipedia.org/wiki/Big_O_notation#Family_of_Bachmann.E2.80.93Landau_notations (2015-11-25)

6 Answers

11

Here's another answer to question 2. By the Stolz–Cesàro theorem,

$\lim_{n\to\infty}\frac{\log(n!)}{\log(n^n)}=\lim_{n\to\infty}\frac{\log((n+1)!)-\log(n!)}{\log\left((n+1)^{n+1}\right)-\log(n^n)}=\lim_{n\to\infty}\frac{\log(n+1)}{\log(n+1)+\log\left(\left(1+\frac{1}{n}\right)^n\right)}=1,$ where the last equality holds because $\log\left(\left(1+\frac{1}{n}\right)^n\right)$ stays bounded while $\log(n+1)\to\infty$.


For a partial answer to question 1, here's a way to see that $a_n\sim b_n$ implies $\log(a_n)\sim\log(b_n)$ (under reasonable hypotheses such as $|\log(b_n)|>\varepsilon$ for some fixed $\varepsilon>0$ and sufficiently large $n$). Note that

$\frac{\log(a_n)}{\log(b_n)}=\frac{\log(b_n)+\log\left(\frac{a_n}{b_n}\right)}{\log(b_n)}.$

Since $\frac{a_n}{b_n}\to1$, the second term in the numerator, $\log\left(\frac{a_n}{b_n}\right)$, tends to $0$, while $|\log(b_n)|$ stays bounded away from $0$, so the ratio tends to $1$. This also implies that if $a_n\in\Theta(b_n)$ and $|\log(b_n)|\to\infty$, then $\log(a_n)\sim \log(b_n)$: in that case $\log\left(\frac{a_n}{b_n}\right)$ is merely bounded, but it is divided by a quantity tending to infinity.
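
As a purely numerical illustration (not needed for the argument), here is a quick check with the sequences $a_n=n^2+n$ and $b_n=n^2$ from the question: the ratio $a_n/b_n$ tends to $1$ and, as claimed, so does $\log(a_n)/\log(b_n)$.

```python
import math

# a_n ~ b_n with a_n = n^2 + n and b_n = n^2; the log-ratio also tends to 1.
for n in [10, 100, 1_000, 10_000]:
    a, b = n * n + n, n * n
    print(f"n = {n:>5}:  a/b = {a / b:.6f},  "
          f"log(a)/log(b) = {math.log(a) / math.log(b):.6f}")
```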

  • 2
    Very nice! I hadn't heard about the Stolz-Cesaro Theorem; but it makes a lot of sense thinking of it as the discrete analog of l'Hopital's rule. (2012-01-28)
5

Okay, let's work out the maths.

By the trapezoid rule, $\ln n! \approx \int_1^n \ln(x)\,{\rm d}x = n \ln n - n + 1$. Now we find the error in this approximation, which can be computed via the Euler–Maclaurin formula. After some crude arithmetic, one finds that

$ \ln (n!) - \frac{\ln n}{2}= n \ln n - n + 1 + \sum_{k=2}^{m} \frac{\mathcal{B}_k\,(-1)^k}{k(k-1)} \left( \frac{1}{n^{k-1}} - 1 \right) + S_{m,n}, $ where $\mathcal{B}_k$ denotes the $k$-th Bernoulli number.

Now taking limits, you can compute $\lim_{n \to \infty} \left( \ln n! - n \ln n + n - \frac{\ln n}{2} \right) = 1 - \sum_{k=2}^{m} \frac{\mathcal{B}_k(-1)^k}{k(k-1)} + \lim_{n \to \infty} S_{m,n}$.

Now, since $S_{m,n}$ satisfies $S_{m,n} = \lim_{n \to \infty} S_{m,n} + O \left( \frac{1}{n^m} \right),$ we get the approximation in logarithmic form. Taking exponentials of both sides and setting $m=1$, we get the result: $n! = \sqrt{2 \pi n}~{\left( \frac{n}{e} \right)}^n \left( 1 + O \left( \frac{1}{n} \right) \right).$

If you don't care about the $\sqrt{2 \pi n}$ term, you can simply use $\ln n! = \sum_{j=1}^n \ln j$ together with the integral approximation:

$\sum_{j=1}^n \ln j \approx \int_1^n \ln x \,{\rm d}x = n\ln n - n + 1 = \Theta(n \ln n)$.
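
As a sanity check (purely illustrative, not part of the derivation), here is a quick numerical comparison of $\ln n!$ with the integral approximation $n\ln n - n + 1$ and with the logarithm of the Stirling formula above:

```python
import math

# Compare ln(n!) with the integral approximation n*ln(n) - n + 1
# and with ln( sqrt(2*pi*n) * (n/e)^n ), the Stirling formula in log form.
for n in [10, 100, 1_000]:
    ln_fact = math.lgamma(n + 1)                 # ln(n!)
    integral = n * math.log(n) - n + 1           # trapezoid/integral estimate
    ln_stirling = 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n
    print(f"n = {n:>4}:  ln(n!) = {ln_fact:10.4f},  "
          f"integral = {integral:10.4f},  ln(Stirling) = {ln_stirling:10.4f}")
```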

  • 0
    Yes, thanks; I was about to ask you to expand on the comment but you beat me to it by answering! I haven't quite figured out the details yet, but I will look at this more closely when I'm done with my upcoming exam. (2012-01-28)
3

Put $s_n:=\sum_{k=1}^n\ln k$. Then $s_n\leq n\ln n$. For a lower bound, write $s_n=\sum_{k=1}^n\ln (n-k+1)=n\ln n+\sum_{k=1}^n\ln\left(1-\frac{k-1}n\right)=n\ln n+\sum_{j=1}^{n}\ln\frac jn.$ Since $x\mapsto\ln\frac xn$ is increasing, $\ln\frac jn\geq\int_{j-1}^{j}\ln\frac xn\,{\rm d}x$ for every $1\leq j\leq n$, so
\begin{align*} s_n&\geq n\ln n+\int_0^n\ln\frac xn\,{\rm d}x\\ &=n\ln n+n\int_0^1\ln u\,{\rm d}u\\ &=n\ln n-n, \end{align*} hence $\frac{s_n}{n\ln n}\geq 1-\frac 1{\ln n},$ which is bounded below by a positive constant, for example $\frac 12$ for $n$ large enough (here $n\geq 8$ suffices, since $\ln 8>2$).
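
A quick numerical check of the resulting lower bound (purely illustrative, not part of the proof):

```python
import math

# Check s_n / (n ln n) >= 1 - 1/ln(n) for a few values of n >= 8.
for n in [8, 100, 1_000, 10_000]:
    s = sum(math.log(k) for k in range(1, n + 1))
    lhs = s / (n * math.log(n))
    rhs = 1 - 1 / math.log(n)
    print(f"n = {n:>5}:  s_n/(n ln n) = {lhs:.4f} >= {rhs:.4f}  ({lhs >= rhs})")
```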

  • 0
    @DavideGiraudo What I meant in the counterexample was: $\frac{n^2+n}{n^2} \rightarrow 1$, but $\frac{e^{n^2+n}}{e^{n^2}} = \frac{e^{n^2} \cdot e^n}{e^{n^2}} = e^n \rightarrow \infty$ (2012-01-28)
2

An answer to question 1. involves slowly/regularly varying functions, whose study is sometimes summarized as Karamata's theory. Slowly varying functions are positive functions $L$ such that, for every positive $\lambda$, $L(\lambda x)/L(x)\to1$ when $x\to+\infty$.

To see that such functions make the implication in question 1. true, assume that $f$ is slowly varying and furthermore, to simplify things, that $f$ is nondecreasing. Then, if $a_n\sim b_n$ and $a_n,b_n\to+\infty$, $\frac12a_n\leqslant b_n\leqslant 2a_n$ for every $n$ large enough, hence $ \frac{f(\frac12a_n)}{f(a_n)}\leqslant \frac{f(b_n)}{f(a_n)}\leqslant \frac{f(2a_n)}{f(a_n)}. $ Since the lower bound and the upper bound both converge to $1$, this proves that $f(a_n)\sim f(b_n)$.

Thanks to Karamata's characterization theorem, the same kind of argument may be applied to the wider class of regularly varying functions. These are positive functions $L$ such that, for every positive $\lambda$, $L(\lambda x)/L(x)$ has a finite positive limit when $x\to+\infty$.

Hence the implication "$a_n\sim b_n$ implies $f(a_n)\sim f(b_n)$, provided $a_n,b_n\to+\infty$" holds for every regularly varying function $f$.

Examples of slowly varying functions are the powers of $\log(x)$. Examples of regularly varying functions are the powers of $x$ (possibly, times a slowly varying function). Examples of non regularly varying functions are the exponentials of powers of $x$ (possibly, times a regularly varying function).
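
As a concrete, purely numerical illustration of these classes (with $\lambda=2$): for $L=\log$ the ratio $L(\lambda x)/L(x)$ drifts to $1$, for $L(x)=x^2$ it stabilizes at $\lambda^2$, and for $L=\exp$ it does not converge at all.

```python
import math

# Illustrate slow / regular / non-regular variation with lambda = 2.
lam = 2.0
for x in [1e2, 1e4, 1e6, 1e8]:
    log_ratio = math.log(lam * x) / math.log(x)   # -> 1      (slowly varying)
    pow_ratio = (lam * x) ** 2 / x ** 2           # -> lam^2  (regularly varying)
    # exp(lam*x)/exp(x) = exp((lam-1)*x) blows up; report its logarithm instead.
    log_of_exp_ratio = (lam - 1.0) * x
    print(f"x = {x:>11.0f}:  log: {log_ratio:.4f}   "
          f"x^2: {pow_ratio:.1f}   log of exp-ratio: {log_of_exp_ratio:.0f}")
```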

  • 0
    Thank you, this is very interesting; I hadn't heard of these definitions. Actually regularly varying functions are more what I was after, since I was only asking that $f(a_n) \in \Theta(f(b_n))$ (not $f(a_n) \sim f(b_n)$). So polynomials do work (as one would expect), since they are regularly varying. (2012-02-06)
2

All $n$ terms are smaller than or equal to $\log(n)$ and at least half are greater than or equal to $\log\left(\frac{n}{2}\right)$. So the sum is between $\frac{n}{2} \log\left(\frac{n}{2}\right)$ and $n\log(n)$. This gives the answer immediately.
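
If it helps, here is a short numerical check of these two bounds (purely illustrative):

```python
import math

# Check (n/2)*log(n/2) <= sum_{k=1}^n log(k) <= n*log(n).
for n in [10, 100, 1_000, 10_000]:
    s = sum(math.log(k) for k in range(1, n + 1))
    lower = (n / 2) * math.log(n / 2)
    upper = n * math.log(n)
    print(f"n = {n:>5}:  {lower:11.2f} <= {s:11.2f} <= {upper:11.2f}  "
          f"({lower <= s <= upper})")
```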

  • 0
    @did I just found the same argument at http://math.stackexchange.com/questions/93403/proof-that-sum-limits-i-1k-logi-belongs-to-ok , thanks. It is a completely standard method in undergrad algorithms 101 classes as well. (2013-01-14)
1

Ok, I came up with a (partial) answer to 1. It actually involves proving something stronger, namely $\log(a_n) \sim \log(b_n)$. First, $\lim_{n \to \infty}(\log(b_n) - \log(a_n)) = \lim_{n \to \infty} \log \left(\frac{b_n}{a_n} \right) = \log \left(\lim_{n \to \infty} \frac{b_n}{a_n} \right) = \log(1) = 0$ by continuity of $\log$. Second, if two sequences $x_n, y_n$ with $y_n \rightarrow +\infty$ (being eventually bounded away from $0$ would probably be enough) satisfy $y_n - x_n \rightarrow 0$, then $\frac{x_n}{y_n} \rightarrow 1$: indeed $|y_n| \left|1 - \frac{x_n}{y_n} \right| = |y_n - x_n| \rightarrow 0$, and if there were an $\epsilon > 0$ with $\left|\frac{x_n}{y_n} - 1 \right| > \epsilon$ for infinitely many $n$, then along those $n$ we would have $|y_n| \left|1 - \frac{x_n}{y_n} \right| > \epsilon\, |y_n| \rightarrow \infty$, a contradiction. Applying this with $x_n = \log(a_n)$ and $y_n = \log(b_n)$ gives $\frac{\log(a_n)}{\log(b_n)} \rightarrow 1$.
I think what made this proof possible is either the concavity of $\log$ or the fact that $\log \in O(x)$, but I get stuck when dealing with a general function $f$ with one of these properties. Anyway, the question is very much answered and I think Jonas' answer definitely deserves to be accepted :-)