
I've always been taught that the fastest way to sort an array of numbers has complexity $O(n \log n)$. However, radix sort has complexity $O(kn)$, where $k$ is the number of bits. There are even questions on the internet asking to prove that a sorting algorithm cannot be faster than $O(n \log n)$.

I'd like some clarification on this. Does radix sort have any limitations? If not, is the lower bound on sorting linear in the number of elements?

  • 2
Maybe I misunderstand, but isn't the number of bits needed to store $n$ about $\log n$? – 2012-05-24
  • 3
$n \log n$ is the bound on comparison sorts, where the only operation available is a relation $R(x,y)$ that compares $x$ and $y$. If you have additional information about the elements (which is the case with radix sort), you can sort faster. – 2012-05-24
  • 0
I'm not sure if this helps, but there are (randomised) algorithms that are certainly faster than $\mathcal{O}(n \log n)$. For example: M. Thorup, "Randomized Sorting in $O(n \log \log n)$ Time and Linear Space Using Addition, Shift, and Bit-wise Boolean Operations", Journal of Algorithms, Volume 42, Number 2, February 2002, pp. 205–230 [reference from Wikipedia]. Also, I remember in one of our classes the professor talked about "bucket sort", which has $O(n)$ *expected* time! – 2012-05-24
  • 0
@froggie: If you're counting such things, don't forget that a comparison would also take $O(\log n)$ time rather than $O(1)$ time in the worst case! – 2012-05-24
  • 0
@froggie Almost: take $256_{(10)}$ as an example; it requires $9$ bits, not the $\log_2(256) = 8$ one might expect. That's because in an $n$-bit binary value the highest power is $2^{n-1}$, which for 8 bits is $128$; the highest expressible value is therefore $2^n - 1$ (all bits set to $1$, evaluating to $255$). So the number of bits required to represent $n$ is $\lfloor\log_2(n)\rfloor + 1$, and the largest value expressible in that many bits is $2^{\lfloor\log_2(n)\rfloor + 1} - 1$. – 2012-05-24
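The bit-count claim in the last comment can be checked directly with Python's built-in `int.bit_length`, which returns $\lfloor\log_2(n)\rfloor + 1$ for positive $n$ (a quick illustration, not part of the original discussion):

```python
import math

# int.bit_length() gives the number of bits needed to represent n,
# i.e. floor(log2(n)) + 1 for n > 0 -- so 256 needs 9 bits, not 8.
for n in [255, 256, 1000]:
    bits = n.bit_length()
    assert bits == math.floor(math.log2(n)) + 1
    print(n, bits)  # 255 -> 8, 256 -> 9, 1000 -> 10
```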

1 Answer


The number of bits $k$ cannot be considered constant in general. In fact, if all $n$ numbers are distinct, then $k = \Omega(\log n)$, so radix sort's $O(kn)$ running time is $\Omega(n \log n)$ in this case: asymptotically no better than other fast sorting algorithms.

More generally, no comparison-based deterministic sorting algorithm can beat $\mathcal{O}(n \log n)$ complexity: to sort $n$ numbers it must distinguish all $n!$ possible orderings, and since each comparison has two outcomes, at least $\log_2(n!) = \Omega(n \log n)$ comparisons are needed.
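To make the $O(kn)$ bound concrete, here is a minimal LSD radix sort sketch (base 2 for clarity; practical implementations use a larger base). The function name and the stable two-list partition are illustrative choices, not from the original post:

```python
def radix_sort(nums, k):
    """Sort non-negative integers that each fit in k bits.

    Each of the k passes is a stable O(n) partition on one bit,
    giving O(kn) total work with no comparisons between elements.
    """
    for bit in range(k):
        zeros = [x for x in nums if not (x >> bit) & 1]
        ones = [x for x in nums if (x >> bit) & 1]
        nums = zeros + ones  # stable: preserves order within each group
    return nums

print(radix_sort([5, 3, 8, 1, 2], 4))  # [1, 2, 3, 5, 8]
```

Note that when all elements are distinct, $k$ must be at least about $\log_2 n$, which is exactly why $O(kn)$ does not contradict the $\Omega(n \log n)$ comparison lower bound.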

  • 0
    Shouldn't that last term be $\log_2(n!)$? – 2012-05-24
  • 0
    @GerryMyerson Yes. Edited it. Thanks. – 2012-05-24
  • 0
    If all the $n$ numbers are distinct, isn't $k = \Omega(\log n)$ rather than $O(\log n)$? – 2012-05-24
  • 0
    @RahulNarain Yes, it should be $\Omega(\log n)$ and not $\mathcal{O}(\log n)$. – 2012-05-24
  • 0
@Marvis: When we're counting the $\log n$ cost for radix sort, we really should also be counting the $\log n$ cost to read/write numbers for a comparison-based sort, as well as the $\log n$ cost to do a compare. Despite your statement, radix sort doesn't do any comparisons -- instead, it can be viewed as breaking a comparison down into its component operations and working with those more efficiently than a comparison-based sort could. – 2012-05-24
  • 0
In practice, $k$ is fixed and given by the limitations of the computer architecture, while $n$ can be much bigger than $2^k$ (in case of repetitions). Perhaps, in practice, sorting takes $O(n)$ for radix sort? – 2012-05-24
  • 2
The $n \log n$ lower bound is applicable only to comparison-based sorts. There are other sorts with almost linear time complexity (of course they have other bizarre shortcomings making them impractical). For example, [Counting Sort](http://en.wikipedia.org/wiki/Counting_sort). – 2012-05-24
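The counting sort mentioned in the last comment can be sketched in a few lines; this is a minimal illustration (function name and interface are my own), assuming the keys are integers in a known range $[0, r)$, which makes the running time $O(n + r)$:

```python
def counting_sort(nums, r):
    """Sort integers in [0, r) in O(n + r) time, with no comparisons."""
    counts = [0] * r
    for x in nums:
        counts[x] += 1  # tally occurrences of each key
    out = []
    for value, c in enumerate(counts):
        out.extend([value] * c)  # emit each key as many times as it appeared
    return out

print(counting_sort([4, 1, 3, 1, 0], 5))  # [0, 1, 1, 3, 4]
```

This is linear in $n$ only when the key range $r$ is $O(n)$; for large or unbounded key ranges the $O(r)$ term dominates, which is one of the "shortcomings" the comment alludes to.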