23

I looked it up on many sites online, but none give a clear answer. They all give a lot of complicated mathematics that is not only hard for me to grasp but also irrelevant, as I simply want to know the upper bound (worst-case complexity), lower bound, and average time complexity of Euclid's algorithm. Here is what I have collated from a myriad of sites; none of them say the same thing:

To find $\gcd(a,b)$, with $a \ge b$ and $b$ having $h$ digits:

  • Some say the time complexity is $O(h^2)$

  • Some say the time complexity is $O(\log a+\log b)$ (assuming $\log_2$)

  • Others say the time complexity is $O(\log a\log b)$

  • One even says: "By Lamé's theorem, you find the first Fibonacci number larger than $b$. If it is $F_{k+1}$, there are fewer than $k$ recursive calls. Thus it is $O(\log b)$."

  • All say the worst case is when $a$ and $b$ are consecutive Fibonacci numbers.

I would be very much obliged if you could settle the whole thing conclusively by giving me straight and clear answers to the following:

  1. What is the worst-case time complexity (upper bound) of Euclid's algorithm?

  2. What is the average-case time complexity of Euclid's algorithm?

  3. What is the lower bound (best case) of Euclid's algorithm, and when does it occur?

You have no idea how much your answer will help me. I am very dejected as I am bogged down by what can be considered a rather simple algorithm. Thanks.

  • There is a paper by Jonassen and Knuth, ["A Trivial Algorithm whose Analysis Isn't"](http://www.sciencedirect.com/science/article/pii/002200007890020X/pdf?md5=3de8ce2a9a195f46df7bdf4e2f061e28&pid=1-s2.0-002200007890020X-main.pdf), Journal of Computer and System Sciences 16:3 (June 1978), pp. 301–322. Don't be too hard on yourself. (2015-08-17)

2 Answers

32

To address some preliminaries, let $T(a,b)$ be the number of steps taken by the Euclidean algorithm, which repeatedly evaluates $\gcd(a,b)=\gcd(b,a\bmod b)$ until $b=0$, assuming $a\geq b$. Also, let $h=\log_{10}b$ be the number of digits in $b$ (give or take). (Note that in these calculations, by counting steps, we ignore the question of the time complexity of the $\bmod$ function. If we assume it is $O(1)$, then all of the following also applies to the time complexity of the algorithm.)

  1. In the worst case, as you have stated, $a=F_{n+1}$ and $b=F_n$, where $F_n$ is the Fibonacci sequence, since the algorithm computes $\gcd(F_{n+1},F_n)=\gcd(F_n,F_{n-1})$ until it gets to $n=0$, so $T(F_{n+1},F_n)=\Theta(n)$ and $T(a,F_n)=O(n)$. Since $F_n=\Theta(\varphi^n)$, this implies that $T(a,b)=O(\log_\varphi b)$. Note that $h\approx\log_{10}b$, and since $\log_bx={\log x\over\log b}$ implies $\log_bx=O(\log x)$ for any fixed base $b$, the worst case for Euclid's algorithm is $O(\log_\varphi b)=O(h)=O(\log b)$. (A quick numerical check appears after this list.)

  2. The average case requires a bit more care, as it depends on the probabilistic model used. In order to calculate it precisely, we need a probability distribution on the inputs. If $a$ is fixed and $b$ is chosen uniformly from $\mathbb Z\cap[0,a)$, then the average number of steps $T(a)$ is

    $T(a)=-\frac12+\frac{6\log2}{\pi^2}\left(4\gamma-\frac{24}{\pi^2}\zeta'(2)+3\log2-2\right)+\frac{12\log2}{\pi^2}\log a+O(a^{-1/6+\epsilon}),$

    or, for less accuracy, $T(a)={12\over\pi^2}\log2\log a+O(1)$. (Source: Wikipedia)

  3. In the best case, $a=b$, or $b=0$, or some similarly convenient case occurs, so the algorithm terminates in a single step. Thus $T(a,a)=O(1)$.

If we are working on a computer using 32-bit or 64-bit arithmetic, as is common, then the individual $\bmod$ operations are in fact constant-time, so these bounds are correct. If, however, we are doing arbitrary-precision calculations, then in order to estimate the actual time complexity of the algorithm, we need to use the fact that $\bmod$ has time complexity $O(\log a\log b)$. In this case, all of the "work" is done in the first step, and the rest of the computation is also $O(\log a\log b)$, so the total is $O(\log a\log b)$. I want to stress, though, that this only applies if the numbers are so big that you need arbitrary precision to handle them.

(This underscores the difference between the mathematician's big-O and the programmer's big-O notation: in the first case, you want the bound to be true $\forall n$, even those $n$ which are so absurdly large that no one can ever write them down or store them in memory, whereas in the second case, programmers are primarily interested in $n\in[0,2^{64}]$, and that's a liberal estimate. For them, it's more important to see the "leading contribution" to the time complexity, and for the Euclidean algorithm, the smaller number drives the difficulty of the calculation by and large.)

  • @MarioCarneiro My bad, you are of course right. I always think in terms of "soft" O, ignoring all logarithmic factors. Then, using the Schönhage–Strassen algorithm, division is only $\tilde O(\log a)$. And also thanks for the reference; this book looks really nice. (2012-12-15)
1

The runtime is linear in the size of the larger number's representation: if $a\ge b$, then the runtime is $O(\log a)$.

  • This assumes the mod and div operations can be performed in $O(1)$; if mod and div take $O(\log a)$ time, then the runtime of the algorithm is $O(\log^2 a)$. (2017-12-21)