5
$\begingroup$

Good evening,

I have a doubt concerning the worst case scenario of the quicksort algorithm, based on the number of comparisons made by the algorithm, for a given number of elements. This is part of self-study.

By quicksort, I'm referring to an algorithm that begins by taking the first element of the initial list (the pivot; let's call it $a_1$) and forms two sublists, the first containing those elements that are less than $a_1$, in the order they arise, and the second containing those elements greater than $a_1$, in the order they arise. Then, it puts $a_1$ at the end of the first list. This procedure is successively repeated recursively for each sublist, until each sublist contains one item.

My question is: Every source about this subject seems to agree that the worst case scenario is when the list to be sorted is already sorted and the pivot is the smallest element on the list. In this case, the total number of comparisons is $n(n - 1)/2$, where $n$ is the number of items in the list (for example, see here: http://www.daniweb.com/software-development/cpp/threads/171145).

But it seems to me that the worst case happens when the list is sorted in decreasing order and the pivot is the first element in the list, which, in this case, would be the greatest element on the list.

For example, consider the list $\{4, 3, 2, 1\}$. I will chose the pivot as the first element. Each step of the quicksort will divide the original list as follows, according to the description I gave above:

In the first step, $\{4, 3, 2, 1\}$ is divided into two lists: $\{3, 2, 1, 4\}$ (elements smaller than the pivot plus the pivot appended in the end) and $\{\}$ (elements greater than the pivot: empty list). The first step requires 3 comparisons to find that 3, 2, 1 are smaller than 4.

Similarly, in the second step, the list $\{3, 2, 1, 4\}$ is divided into $\{2, 1, 3\}$ and $\{4\}$, requiring 3 comparisons (to find that 2, 1 are smaller than 3 and 4 is greater than 3). The third step divides $\{2, 1, 3\}$ into $\{1, 2\}$ and $\{3\}$, requiring 2 comparisons. The last step divides $\{1, 2\}$ into $\{1\}$ and $\{2\}$, requiring 1 comparison.

If I sum the number of comparisons for each step, I find $3 + 3 + 2 + 1 = 9$. In fact, if I generalize the above situation to $n$ elements sorted in decreasing order, with the first element as the pivot, I get $n(n - 1)/2 + n - 1$ comparisons.

If the list was sorted in increasing order $\{1, 2, 3, 4\}$ and I chose the pivot to be the smaller element on the list (in this case, the first element), the number of comparisons would be $n(n - 1)/2 = 4(3)/2 = 6$.

9 comparisons is greater than 6 comparisons. Why isn't it the worst case?

Thank you in advance.

3 Answers 3

2

Actually, you and your "source" are both right. The worst case is any case where you happen to select the biggest or smallest element as pivot. Only the already sorted case may be thought of as more of an "epic fail" than others, but otherwise, it doesn't matter.

The performance of quicksort depends on the fact that you can bisect the array into two proportional in size to the original array. If you only put a constant number of elements in one of the parts, you get quadratic complexity.

  • 0
    Do you mean to say that the worst case is when you select the largest or smallest element as the pivot *on every iteration*?2012-01-05
  • 0
    Yes, in every iteration. See the example in Mike Spivey's answer.2012-01-05
7

Typical implementations of quicksort don't append the pivot to one of the lists. Instead, they are implemented (in Haskell notation) as

qsort [] = []
qsort xs = qsort left ++ [pivot] ++ qsort right
    where pivot = head xs
          left  = filter (=x) xs

which keeps the symmetry between the left and right lists (if the list elements are distinct) and results in the same number of comparisons for an initial list that is sorted increasing or sorted decreasing.

Note that the 'worst case' isn't just found for the lists that are sorted in increasing or decreasing order. For example, the following lists all require six comparisons to sort:

[1,2,3,4]
[1,2,4,3]
[1,4,2,3]
[1,4,3,2]   
[4,3,2,1]
[4,3,1,2]
[4,1,2,3]
[4,1,3,2]

In general, any ordering which results in the lest or greatest element being selected as the pivot at every stage of the recursion will give you worst case performance.

  • 0
    This is true, that is what was leading me to the wrong analysis. The pivot shouldn't be appended to the end of the first sublist.2012-01-05
  • 1
    Out of curiosity, Chris, do you know if there is a nice characterization - or perhaps an enumeration - of the permutations on $n$ elements that result in the least or greatest element being selected as the pivot at every stage of the recursion?2012-01-05
  • 0
    I added an answer to this question to my answer to the original question.2012-01-06
  • 0
    @Chris why is this called tail recursive algorithm ?2012-08-08
  • 0
    @Geek This isn't a tail recursive algorithm. It's *recursive*, but not *tail recursive*. The function in tail position is the append function `(++)`. For this to be tail recursive, the function `qsort` would need to be in tail position.2012-08-08
  • 0
    @ChrisTaylor I thought it will be difficult to explain in a comment . So I actually posted a [separate question on SO](http://stackoverflow.com/questions/11864006/why-is-quick-sort-called-a-tail-recursive-algorithm) regarding the same . Can you put an answer there ? Particularly the stack building part of it . The answers that I got so far are not clear enough for me .2012-08-08
5

I think you're misreading the quicksort algorithm. The first pass sorts $\{4, 3, 2,1\}$ into $$\{3, 2, 1\}, \{4\}, \{\},$$ with $\{3, 2, 1\}$ and $\{\}$ to be passed back (recursively) to the quicksort algorithm to be sorted again. The $\{4\}$ doesn't need to be sorted; the point of the first pass of the algorithm was to find the right place for the first element $4$! In the second iteration, then, you're only comparing $3$ with $2$ and $1$, for two comparisons - not three. Similarly, there's only one comparison in the third iteration ($2$ with $1$), for a total of six comparisons.

Generalizing, if the list is in reverse order then it will take $1 + 2 + \cdots + n-1 = \frac{n(n-1)}{2}$ comparisons, the same number as if the list was already sorted. So they are both worst cases.


Added (not part of the original answer): Chris Taylor's answer points out that the worst case of $\frac{n(n-1)}{2}$ comparisons occurs when the least or greatest element remaining is selected as the pivot at every stage of the recursion, and he lists the eight such permutations on four elements for which this happens.

Question: For a given $n$, how many permutations produce the worst case?

Answer: $2^{n-1}$.

There are two choices for $\sigma(1)$ - the least or the greatest of the elements in the permutation. After $\sigma(1)$ is chosen there are two choices for $\sigma(2)$ - the least or the greatest of the remaining elements. After $\sigma(1)$ and $\sigma(2)$ are chosen there are two choices for $\sigma(3)$, and so forth, until we reach $\sigma(n)$, which has to be the one element left. This gives $2^{n-1}$ permutations that produce the worst case.

  • 0
    Thank you. I really was misreading it. But the book I'm using says: "Then a1 is put at the end of the first sublist.". Maybe it was not well written. Now I understand it correctly.2012-01-05
  • 2
    @user1131904: Glad this was helpful. The pivot $a_1$ is put at the end of the first sublist; it's just not passed back to the quicksort algorithm along with the rest of the first sublist. If your book implies otherwise, then it is misleading, and I can see where you were confused.2012-01-05
  • 0
    @MikeSpivey Why this algorithm is called tail recursive ? Aren't we doing the computation at each step of the recursion ?2012-08-08
  • 0
    @MikeSpivey Why is this called a tail recursive algorithm ?2012-08-08
  • 0
    @Geek: I have never heard quicksort called a "tail recursive algorithm," but, then again, I'm not familiar with that term.2012-08-08