22

I looked it up on many sites, but none give a clear answer. They all present a lot of complicated mathematics that is not only hard for me to grasp but also beside the point, as I simply want to know the upper bound (worst-case complexity), the lower bound, and the average-case time complexity of Euclid's algorithm. Here is what I have collated from a myriad of sites; no two say the same thing:

To find $\gcd(a,b)$, with $b\le a$ and $b$ having $h$ digits:

  • Some say the time complexity is $O(h^2)$

  • Some say the time complexity is $O(\log a+\log b)$ (assuming $\log_2$)

  • Others say the time complexity is $O(\log a\log b)$

  • One even says: "By Lamé's theorem you find the first Fibonacci number larger than $b$. If it is $F_{k+1}$, there are fewer than $k$ recursive calls. Thus it is $O(\log b)$."

  • All say the worst case is when $a$ and $b$ are consecutive Fibonacci numbers.

I would be much obliged if you could settle the whole thing conclusively by giving me straight and clear answers to the following:

  1. What is the worst-case time complexity (upper bound) of Euclid's algorithm?

  2. What is the average case time complexity of Euclid's algorithm?

  3. What is the lower bound (best case) of Euclid's algorithm, and when does it occur?

You have no idea how much your answer will help me. I am very dejected as I am bogged down by what can be considered a rather simple algorithm. Thanks.

  • What do you mean by average? Do you mean $a,b$ random below some bound $n$, $a$ fixed and $b$ below $a-1$, or something different? This is not clear from context, and there is no standard definition. I think the best case is $a$ or $b$ equal to $0$. (2012-12-14)
  • Well, I used the term "average case" as it is generally used for algorithms like sorting and searching, you know. It would be equally nice if you could tell me in a nutshell about each of the two cases you mentioned. Otherwise, leave the average case and tell me about the upper bound and lower bound only; even that would suffice. Tell me whatever you can about it, Hans; this thing has made me miserable for half of the day. (2012-12-14)
  • @Ivy: Hans asked a very precise question and correctly pointed out that there is no standard definition, so referring him to how the term "is generally used" isn't helpful. In the examples you cited (sorting, searching), there is usually an obvious well-defined interpretation of "average", namely that the items to be sorted/inserted/searched appear in an order uniformly randomly chosen from all possible permutations. There is no such obvious interpretation here. (2012-12-14)
  • My lack of knowledge prohibits me from answering this question. But I know complexity analysis is a delicate topic where one has to be very careful when asking and answering questions. (2012-12-14)
  • If a guy like you says "my lack of knowledge...", then I wonder where a guy like me stands! LOL. I am sure I don't know even a fraction of what you know about these things. Well, I'm glad you said "complexity analysis is a delicate topic"; it helps me feel better as I struggle and lag behind in this topic. Now I can assure myself that it's OK to lag behind initially, as it's a delicate and difficult topic after all. But why worry: with forums like these, no one is alone; smart people are out there to help strugglers like me! (2012-12-14)
  • To expand on the information here, a great resource is http://en.wikipedia.org/wiki/Euclidean_algorithm#Algorithmic_efficiency (2013-09-23)
  • For future readers: the best resource I have ever seen on this is found in Knuth's AOCP, Volume 2 (Sections 4.5.1–4.5.3). (2014-07-20)
  • There is a paper by Jonassen and Knuth, ["A Trivial Algorithm whose Analysis Isn't"](http://www.sciencedirect.com/science/article/pii/002200007890020X/pdf?md5=3de8ce2a9a195f46df7bdf4e2f061e28&pid=1-s2.0-002200007890020X-main.pdf), Journal of Computer and System Sciences 16:3 (June 1978), pp. 301–322. Don't be too hard on yourself. (2015-08-17)

2 Answers

30

To address some preliminaries, let $T(a,b)$ be the number of steps taken by the Euclidean algorithm, which repeatedly evaluates $\gcd(a,b)=\gcd(b,a\bmod b)$ until $b=0$, assuming $a\geq b$. Also, let $h\approx\log_{10}b$ be the number of digits of $b$ (give or take). (Note that by counting steps we ignore the question of the time complexity of the $\bmod$ operation; if we assume it is $O(1)$, then all of the following also applies to the time complexity of the algorithm.)
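
To make the step-counting convention concrete, here is a minimal Python sketch; the function name `gcd_steps` and its structure are mine, not part of the original answer:

```python
def gcd_steps(a, b):
    """Return (gcd(a, b), T(a, b)): the gcd together with the
    number of mod-steps taken, assuming a >= b >= 0."""
    steps = 0
    while b != 0:
        a, b = b, a % b  # one step: gcd(a, b) = gcd(b, a mod b)
        steps += 1
    return a, steps
```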

  1. In the worst case, as you have stated, $a=F_{n+1}$ and $b=F_n$, where $F_n$ is the Fibonacci sequence, since the algorithm then computes $\gcd(F_{n+1},F_n)=\gcd(F_n,F_{n-1})$ all the way down to $n=0$, so $T(F_{n+1},F_n)=\Theta(n)$, and $T(a,b)=O(n)$ whenever $b\leq F_n$. Since $F_n=\Theta(\varphi^n)$, this implies that $T(a,b)=O(\log_\varphi b)$. Note that $h\approx\log_{10}b$, and $\log_cx={\log x\over\log c}$ implies $\log_cx=O(\log x)$ for any fixed base $c$, so the worst case of Euclid's algorithm is $O(\log_\varphi b)=O(h)=O(\log b)$. (This worst case is checked numerically in the sketch after this list.)

  2. The average case requires a bit more care, as it depends on the probability model of the inputs. To calculate it precisely, we need a probability distribution. If $a$ is fixed and $b$ is chosen uniformly from $\mathbb Z\cap[0,a)$, then the average number of steps $T(a)$ is

    $$T(a)=-\frac12+\frac{6\ln2}{\pi^2}\left(4\gamma-\frac{24}{\pi^2}\zeta'(2)+3\ln2-2\right)+\frac{12}{\pi^2}\ln2\,\ln a+O(a^{-1/6+\epsilon}),$$

    or, for less accuracy, $T(a)=\frac{12}{\pi^2}\ln2\,\ln a+O(1)$. (Source: Wikipedia.) This estimate, too, is checked in the sketch after this list.

  3. In the best case, $b=0$, or $b$ divides $a$ (for instance $a=b$), and the algorithm terminates in a single step. Thus $T(a,a)=O(1)$.
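
A quick numerical check of cases 1 and 2, reusing `gcd_steps` from the sketch above (the constants come from the statements above; the test harness itself is mine):

```python
import math

# Case 1: consecutive Fibonacci numbers realize the worst case.
# With F_1 = F_2 = 1, the pair (F_{n+1}, F_n) takes exactly n - 1 steps.
f_prev, f_curr = 1, 1  # F_1, F_2
for n in range(2, 30):
    f_prev, f_curr = f_curr, f_prev + f_curr  # (f_curr, f_prev) = (F_{n+1}, F_n)
    assert gcd_steps(f_curr, f_prev)[1] == n - 1  # T(F_{n+1}, F_n) = Theta(n)

# Case 2: for fixed a, the average of T(a, b) over b in [0, a) should be
# (12 ln 2 / pi^2) ln a + O(1), i.e. roughly 0.843 ln a plus a constant.
a = 10_000
avg = sum(gcd_steps(a, b)[1] for b in range(a)) / a
print(avg, 12 * math.log(2) / math.pi ** 2 * math.log(a))
# The two printed values differ by roughly the constant term (about 1.47).
```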

If we are working on a computer using 32-bit or 64-bit arithmetic, as is common, then the individual $\bmod$ operations are in fact constant-time, so these bounds are correct. If, however, we are doing arbitrary-precision calculations, then to estimate the actual time complexity of the algorithm, we need to use the fact that $\bmod$ has time complexity $O(\log a\log b)$. In this case, all of the "work" is done in the first step, and the rest of the computation is also $O(\log a\log b)$, so the total is $O(\log a\log b)$. I want to stress, though, that this only applies if the numbers are so big that you need arbitrary precision to compute with them.
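
To fill in the step this glosses over, here is a sketch of why the per-step division costs sum to $O(\log a\log b)$ rather than the naive $O(\log a\log^2b)$; it follows the telescoping argument in Shoup's book (linked in the comments below) and assumes schoolbook arithmetic, where dividing $u$ by $v$ to obtain quotient $q$ costs $O((\log q+1)\log v)$. Write $r_0=a$, $r_1=b$, and $r_{i-1}=q_ir_i+r_{i+1}$ for the $k$ division steps. Since every $r_i\leq b$ for $i\geq1$, the total cost is at most

$$\sum_{i=1}^k(\log q_i+1)\log r_i\leq\log b\left(k+\sum_{i=1}^k\log q_i\right)\leq\log b\,(k+\log a)=O(\log a\log b),$$

using that $q_1q_2\cdots q_k\leq a$ (each $r_{i-1}\geq q_ir_i$, so $a=r_0\geq q_1\cdots q_kr_k\geq q_1\cdots q_k$) and $k=O(\log b)=O(\log a)$.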

(This underscores the difference between the mathematician's big-O and the programmer's big-O: in the first case, you want the bound to be true for all $n$, even those $n$ which are so absurdly large that no one could ever write them down or store them in memory, whereas in the second case, programmers are primarily interested in $n\in[0,2^{64}]$, and that's a liberal estimate. For them, it's more important to see the "leading contribution" to the time complexity, and for the Euclidean algorithm it is, by and large, the smaller number that drives the cost of the calculation.)

  • You are counting steps? I think he is asking for "overall" complexity, whatever this means. (2012-12-14)
  • Thanks, thanks, thanks, Mario! You have answered what I wanted to know, in a numbered format. But I have one slight confusion about your answer. In 1), you say that $F_n$ is the Fibonacci sequence and then you say $F_n=\Theta(\varphi^n)$. How can a sequence be equated to a complexity? Do you mean that some particular operation on the Fibonacci sequence is $\Theta(\varphi^n)$? If yes, what operation do you mean? Please clarify that much. (And never mind my formatting; if possible, give me a link where I can learn the markup used to format questions and answers on this forum. I don't know anything about it.) (2012-12-14)
  • He estimates the size of $F_n$ with $\varphi^n$. (2012-12-14)
  • Oh, thanks, Hans. By the way, what were you thinking about "overall" complexity? What exactly do you mean by that? In the context of Euclid's algorithm, we generally mean the number of steps, don't we? (2012-12-14)
  • I mean: how expensive is one step? The number of steps only depends on $b$. But the overall complexity must involve $a$, since you have to do something with $a$. If you allow arbitrarily large $a$ and $b$, you cannot treat the mod operation as constant. Hence you will get an overall complexity involving $\log a$ and $\log b$. But don't get me wrong: I think Mario's answer is superb. (2012-12-14)
  • Well, please elaborate with a full-fledged answer then, Hans, because now I am really confused. Some sites have indeed given the complexity in terms of $a$ and $b$: some say $O(\log a\log b)$ while others say $O(\log a+\log b)$. Which one is correct? Let's hope Mario says something about it. Further, can you please explain why, "for arbitrarily large $a$ and $b$, you don't take the mod operation as constant"? I repeat, please explain why. And Hans, as for the expense of one step, Wikipedia says it's equal to $h$, the number of digits of $b$, the smaller number; hence the upper bound $O(h^2)$, as per Wikipedia. (2012-12-14)
  • The mathematical definition of big-O notation is in terms of the growth rate of functions, not the actual time taken to perform operations. When I say $F_n=\Omega(\varphi^n)$, I mean that $F_n\geq A\varphi^n$ for some $A$ and large enough $n$. @HansGiebenrath, it is true that I ignore the time complexity of mod, and I say so at the beginning. If you don't, and $a\gg b$, then since the mod operation depends on every digit of $a$ in the worst case, you get $O(\log a)$ for the first step and $O(1)$ for the others, so you *do* get $O(\log a+\log b)$, but the constant is much smaller on the $a$. (2012-12-14)
  • Mario, so shall I conclude that all those sites which said things like $O(\log a\log b)$ are absurd? Is something like $O(\log a\log b)$ totally absurd in this context? And finally, since you have touched upon the time complexity of mod here, shall I hammer it into my mind for my exam that, if both the number of steps and the time complexity of the mod operation (each step) are considered, then the complexity is $O(\log a+\log b)$? Would that be more appropriate than the $O(\log b)$ in 1) if we take the time complexity of mod into account as well? Please answer this much, mate. (2012-12-14)
  • Sorry, my mistake: you have to divide $a$ by $b$ on that first step, so its contribution is actually $O(\log a\log b)$, and you get $O(\log a\log b)$, not $O(\log a+\log b)$, for the full calculation. To summarize: the time complexity is $O(\log a\log b)$ with mod taken into account, and $O(\log b)$ if you are counting steps. (2012-12-14)
  • Thanks for clarifying the whole thing, Mario. I am glad I joined this maths forum despite the initial apprehension that my rudimentary questions would be mocked and remain unanswered... Wait, wait, I missed something again. Which first step do you mean when you say "you have to divide $a$ by $b$ on that first step"? Please spell it out for me, as I don't want to learn this formula by rote after all this analysis. How do we get $O(\log a\log b)$ from $O(\log a+\log b)$? (2012-12-14)
  • What I mean, Mario, is this: dividing $a$ by $b$ in $O(\log a+\log b)$, won't we get $O(\log a-\log b+\log b)$, i.e. $O(\log a)$? How do we get $O(\log a\log b)$? (2012-12-14)
  • Thanks for the effort, Mario. @IvyMike: At each step you divide numbers which are bounded by $a$. Each division is therefore in $O(\log a)$. Since there are $O(\log b)$ divisions, we have $O(\log a\log b)$. (2012-12-14)
  • @HansGiebenrath Actually, the time complexity of division or modulo is $O(\log a\log b)$ (or $O(\log^2a)$, since $a\geq b$), so the naive estimate is actually $O(\log a\log^2b)$. There is a slightly involved reason, to do with the rate at which the numbers decrease, which explains why the total time complexity is $O(\log a\log b)$ over all steps. See page 94 of http://shoup.net/ntb/ntb-v2.pdf for the proof. (2012-12-14)
  • @MarioCarneiro My bad, you are of course right. I always think in terms of "soft" O, ignoring all logarithmic factors; using the algorithm of Schönhage–Strassen, division is only $\tilde O(\log a)$. And thanks also for the reference; this book looks really nice. (2012-12-15)
1

The runtime is linear in the length of the representation of the larger number: if $a\ge b$, then the number of steps is $O(\log a)$.

  • This assumes the mod and div operations can be performed in $O(1)$; if mod and div take $O(\log a)$ time, then the runtime of the algorithm is $O(\log^2a)$. (2017-12-21)