
I need to calculate ~1 billion distances between points with ~100 dimensions each. I expect that computing these distances exactly (or even the squared distances) will be very expensive. Is there a faster algorithm for approximating them?

The algorithms I've found online mostly only work in two dimensions.

Thanks!

  • 1
    The factor of a billion will play a bigger role than the factor of a hundred in the computational cost. What do you want these measurements for? If it is ultimately to compute statistics, then you may get good enough accuracy by taking a random sample (e.g. $10^6$) of the distances. (2012-11-25)
  • 0
    @JohnBentin: I'm trying to run k-means clustering on a set of 100 million points, where k = 5. (2012-11-25)
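
For what it's worth, here is a minimal sketch of the two ideas in the comments: estimating a distance statistic from a random sample of pairs, and assigning points to one of k = 5 centers. It assumes Euclidean distance, which the question does not state. The function names `sample_pair_distances` and `nearest_center`, the toy data set, and the choice of the first five points as centers are all made up for illustration. The assignment step compares squared distances only, since the square root does not change which center is closest.

```python
import math
import random


def sample_pair_distances(points, n_samples, seed=0):
    """Return Euclidean distances for n_samples randomly chosen pairs of points."""
    rng = random.Random(seed)
    n = len(points)
    distances = []
    for _ in range(n_samples):
        i, j = rng.sample(range(n), 2)  # two distinct indices
        distances.append(math.dist(points[i], points[j]))
    return distances


def nearest_center(point, centers):
    """Index of the closest center, compared by squared distance (no sqrt)."""
    best_index, best_sq = 0, float("inf")
    for idx, center in enumerate(centers):
        sq = sum((a - b) ** 2 for a, b in zip(point, center))
        if sq < best_sq:
            best_index, best_sq = idx, sq
    return best_index


# Toy stand-in for the real data: 1,000 points in 100 dimensions (assumption).
rng = random.Random(42)
points = [[rng.random() for _ in range(100)] for _ in range(1000)]

# Estimate the mean pairwise distance from 500 sampled pairs
# instead of computing all ~500,000 pairs.
sampled = sample_pair_distances(points, n_samples=500)
mean_distance = sum(sampled) / len(sampled)

# One k-means assignment step with k = 5; the centers are simply the
# first five points here, purely as a placeholder initialization.
centers = points[:5]
labels = [nearest_center(p, centers) for p in points]
```

With k = 5, one assignment pass over N points costs 5N distance evaluations, so running the loop on a random subsample rather than all 100 million points reduces the work proportionally. Whether a subsample is accurate enough depends on the data.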
