6
$\begingroup$

I have some sets of numbers, and I'd like to have a way to talk about how close these sets are to each other. I'm not sure what properties it should have (e.g. does it need to be a metric?). But I hope there are examples of better defined problems like this out there with some good solutions.

Some considerations. {1} should be closer to {1.1} than it is to {2}. And {.9,1.1} should be closer to {1} than {0,2} is. But is {1,1.1} closer to {1} than {2}? I suspect any distance function will have at least one parameter to control the tradeoff between nearness of points and cardinality of the sets.

  • 5
    http://en.wikipedia.org/wiki/Hausdorff_distance2011-09-28
  • 0
    You could consider each set as a uniform probability distribution on a discrete set and consider the [earth mover's distance](http://en.wikipedia.org/wiki/Earth_mover%27s_distance) between them. That article also describes an extension that would be useful if you want to take differences in cardinalities into account.2011-09-28
  • 0
    @RahulNarain Does this really capture the differences in cardinalities well? If I have the sets {1} and {1,1+epsilon}, the EMD will be very small.2011-09-28
  • 0
    @pythonic: I suppose I should have said "uniform *measures*" instead of "uniform *probability distributions*", as they shouldn't sum to unity. If you consider $\mu_1(1) = 1$ and $\mu_2(1) = 1$, $\mu_2(1+\epsilon) = 1$, then the ([modified](http://en.wikipedia.org/wiki/Earth_mover%27s_distance#Extensions)) EMD is $0$ (move $1$ unit of mass from position $1$ to position $1$) plus $\sigma$ (delete $1$ unit of mass from position $1+\epsilon$). This $\sigma$ acts like a tradeoff parameter.2011-09-29

2 Answers 2