0
$\begingroup$

I'm not very good at math but I need to know how to get a set of numbers that are closest to each other.

It means that if I have 5 numbers [-1, 2, 7, 11, 3222]. Now it is obvious (to me) without any rules to see that correct numbers would be -1, 2, 7, 11 or even -1, 2, 7. However I have no idea how to implement some kind of rule that will allow programatically filter a set of numbers and get viable ones.

You can clearly see that simple solutions like calculating median from all numbers and then getting closest numbers to it would be wrong.
Maybe if I would take median of all numbers except 3222, calculate median, get all numbers from set that are +10/-10 it would work well, however, how do I make computer know that 3222 does not belong here...

So basically I'm pretty stuck here. As I said I'm not very good with math so don't judge me by asking this silly question.

  • 0
    Would something like [bucket sort](http://en.wikipedi$a$.org/wiki/Bucket_sort) help here? Perhaps not directly. But might give some useful ideas?2012-03-20

1 Answers 1

3

You might one to sort your data in an increasing (or decreasing, I have no problem with that) order and compute a coarse discrete derivative. Then you can split your data in a certain number of groups (let's call them clusters) which are "dense".

For instance, if you have a serie as follows: $[1, 2, 9, 7, 260, 6, 5, -8, 255, -300, -2, 0, 15, -302]$ you can sort them using a quicksort algorithm (you can find some fast implementation) and you get: $[-302, -300, -8, -2, 0, 1, 2, 5, 6, 7, 9, 15, 255, 260]$ (hope I didn't forget any) and apply a simple finite difference filter $[-1, 1]$ on you vector which gives you $[2, 242, 6, 2, 1, 1, 3, 1, 1, 2, 6, 240, 5]$. Having a look at this serie clearly tells you that you have three groups of number and the high peaks in the derivative tells you how to split your numbers.

This is a really naive approach. If you have really large, evolving data, you might want to have a look at (online) clustering algorithm.

Hope it helped