
I need to calculate the distance between two points, but these points have more than 100 dimensions. The usual two- or three-dimensional distance formula extends naturally to n dimensions, but the computation takes quite a bit of time at this scale. Since I have a number of huge data sets, I would like to find a way to cut down the computation.

My constraints are:

  • Within one data set, all points have exactly n dimensions, and n is large, always more than 100.
  • The numerical value in each dimension lies between 0 and 255 only.

Given these constraints, is there a faster way to compute the n-dimensional distance between points?
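For reference, here is a minimal baseline of the n-dimensional Euclidean distance the question describes, sketched in Python with NumPy (the data and the dimension n = 128 here are made up for illustration):

```python
import numpy as np

# Hypothetical example data: two points with n = 128 dimensions,
# integer values in 0..255, matching the question's constraints.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=128)
b = rng.integers(0, 256, size=128)

# n-dimensional Euclidean distance: square root of the sum of
# squared per-dimension differences, computed without a Python loop.
diff = a.astype(np.int64) - b
dist = np.sqrt(np.sum(diff * diff))
```

This vectorized form is already much faster than a per-dimension Python loop; the answers below discuss ways to trim the remaining cost.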

  • @Hagen: could you elaborate on what the ∞-norm is? (2012-11-22)

2 Answers


Squaring is generally quite fast. If the square root itself is the bottleneck, there are well-known approximations such as the fast inverse square root: http://en.wikipedia.org/wiki/Fast_inverse_square_root

You could also offload the computation onto the GPU if you have the freedom to do that; I'm not entirely sure whether that is an option in your situation.

You could also try building a lookup table of square numbers (for values 0–255), if they are whole numbers only, and keep it in memory.
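The lookup-table idea above can be sketched as follows. Since per-dimension values are integers in 0–255, the absolute difference of any two values is in 0–255 as well, so a single 256-entry table of squares suffices (the function name and data here are illustrative, not from the original post):

```python
# Precompute the squares of 0..255 once; absolute differences of
# byte-valued coordinates always fall in this range.
SQUARES = [i * i for i in range(256)]

def squared_distance(p, q):
    """Sum of squared per-dimension differences via table lookup."""
    return sum(SQUARES[abs(x - y)] for x, y in zip(p, q))
```

For example, `squared_distance([0, 3], [4, 0])` is 16 + 9 = 25. Whether the table actually beats a plain multiply depends on the CPU; on modern hardware an integer multiply is often as fast as the memory lookup, which is the point the next answer makes about SIMD.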


Your constraint that the values lie between 0 and 255 indicates that they are actually one-byte integers, not floating-point numbers. In that case, squaring might be sped up by a lookup table, but parallel SSE squaring might well be faster CPU-wise. The sum of 100 such squares fits into 3 bytes (100 · 255² = 6,502,500 < 2²⁴), and taking a single square root is quite fast. Often, however, you can simply drop the square root altogether. For example, if you want to sort points by distance, you can just as well sort by squared distance, since the square root is monotonic. Or, to see whether a point is closer than 100 units, check whether the squared distance is less than 10000.
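The "drop the square root" trick described above can be sketched like this (the points and the 100-unit threshold are illustrative):

```python
def squared_distance(p, q):
    """Sum of squared per-dimension differences; no square root taken."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

origin = [0, 0, 0]
points = [[100, 0, 0], [3, 4, 0], [1, 2, 2]]

# Sorting by squared distance gives the same order as sorting by true
# distance, because sqrt is monotonically increasing.
points.sort(key=lambda p: squared_distance(origin, p))

# "Closer than 100 units" without a sqrt: compare against 100**2.
near = [p for p in points if squared_distance(origin, p) < 100 ** 2]
```

Here `points` ends up ordered by true distance (3, 5, 100 units), and `near` excludes the point exactly 100 units away, all without ever calling `sqrt`.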