1
$\begingroup$

I want to extract useful info from some data and this makes me think how to do it efficiently. I will try to explain the problem with math terms.

If we have a sequence of numbers $A=(a_{1}\space a_{2}\dots a_{n})$ and we want to count the number of inversions, ie the number of pairs $(a_{i},a_{j}), 1\leq i < j\leq n$ and $a_{i}>a_{j}$.

I want to find a good algorithm that will solve this problem.

I am looking for an algorithm that is better from the trivial one that solves the problem in polynomial time.

  • 2
    The obvious algorithm that compares all pairs is already quadratic, so asking for a polynomial time algorithm doesn't make much sense. The question is whether you can do better than quadratic. "Better than polynomial" also doesn't make any sense, since even a constant run time is polynomial.2011-11-12
  • 0
    Sorry about that. I am not really into algorithms but i am trying. I am reading books and i try to apply what i learn.2011-11-12
  • 0
    It would be interesting to describe the real life problem that you are trying to solve.2011-11-13
  • 1
    Why did you edit the question without correcting the errors I pointed out? Also, I believe you mean $1\le i\lt j\le n$ where it says $1\le i\le n$?2011-11-13
  • 0
    I think the question title is misleading and not specific enough. Maybe "Counting inversions in lists algorithmically" is more appropriate. Also, tags "statistics" and "optimization" should be removed in favor of "computer-science".2011-11-13

2 Answers 2

3

As stated by @joriki, the trivial algorithm that compares all values pairwise has runtime in $\Theta(n^2)$.

A better runtime can be achieved by leveraging balanced binary search trees, e.g. AVL trees. Assume you have AVL trees that store in every node $v$ the number of nodes $t(v)$ in the subtree rooted in $v$; this does not change the runtime characteristics of AVL trees.

Now you take your list and an empty tree and insert the elements one by one. If you have no reversals, every element will end up as the right-most node after inserting. Conversly, the number of reversals an element causes can be read off while inserting; whenever you descend to a left child, add $t(v)$ of the right child you did not go to, and add $1$ if the father is properly larger than the element you insert.

The proposed alorithm has runtime in $\cal{O}(n\log(n))$ as insertion in AVL trees is in $\cal{O}(\log(n))$.

Edit: Apparently you can do better for permutations.

  • 0
    That's really cool. I was thinking that i can use mergesort but didn't figure out how. Now i have to read about trees. Thanks for the answer.2011-11-13
  • 0
    @practic_pom: Several of the answers to the questions I linked to in my answer use mergesort.2011-11-13
  • 1
    @practic_pom, if you like this answer, maybe you want to given it a 1 and mark it as an answered question.2011-11-13
  • 0
    @practic_pom You might want to look into "Introduction to Algorithms" by Cormen et al (if you have not done so already).2011-11-13
  • 0
    @Raphael I will, i already ordered that book.2011-11-13
2

The reason you didn't find a solution to this by searching seems to be that the more usual term for what you call a "reversal" is "inversion". Searching for algorithms counting the number of inversions generates lots of hits, including these two questions on stackoverflow with lots of answers:

Counting inversions in an array and How to find the number of inversions in an array.

  • 0
    Very interesting links. I think i am gonna use some code from them.2011-11-13
  • 0
    +1; I knew there was another name for it. Damn you, brain!2011-11-13