1

I'd like to compute a distance between two vectors of $[0, 1]^n$ ($n$ numbers between 0 and 1).

I'm looking for a distance generalizing the Hamming distance which I would have used if the numbers were bits. Actually, I don't need a real distance, only a pseudo-distance, a measure of similarity / divergence.

I'm comparing the spectra of audio samples: I take the Discrete Fourier Transform of both, normalize the spectra, and would like to compute how similar these spectra are.

  • 0
    How can you use the Hamming distance when the vectors are made of real numbers between 0 and 1? (2012-08-31)

2 Answers

1

There are many choices, particularly if you do not ask that the "distance" function satisfy the triangle inequality.

For example, if the vectors are $(x_1,x_2,\dots,x_n)$ and $(y_1,y_2,\dots,y_n)$ we can use $\sum (x_i-y_i)^2$ (the square of the Euclidean distance). This has some very nice properties that have made it the general favourite ever since the time of Gauss.

Or else you can use the "taxicab" distance $\sum|x_i-y_i|$.

You might also wish to experiment with expressions like $\sum |x_i-y_i|^p$ for various $p$, to see whether a $p$ other than $2$ gives behaviour that you prefer.
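To make that experiment concrete, here is a minimal Python sketch of the expression $\sum |x_i-y_i|^p$. The function name `power_sum_divergence` and the sample vectors are illustrative assumptions, not part of the question; they merely stand in for two normalized spectra.

```python
def power_sum_divergence(x, y, p=2.0):
    """Return sum(|x_i - y_i| ** p) for two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y))


# Hypothetical normalized spectra with entries in [0, 1].
x = [0.0, 0.5, 1.0, 0.25]
y = [0.0, 1.0, 0.5, 0.25]

# Try several exponents: p = 1 is the taxicab sum,
# p = 2 is the squared Euclidean distance.
for p in (0.5, 1, 2, 4):
    print(p, power_sum_divergence(x, y, p))
```

Since every $|x_i-y_i|$ is at most $1$ here, a larger $p$ shrinks small per-coordinate differences faster than large ones, so the biggest mismatches dominate the sum. If the triangle inequality matters, take the $p$-th root of the sum with $p \ge 1$.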

2

What if you used $d(a,b) = \sum |a_i - b_i|$? It is continuous, has the usual metric properties such as the triangle inequality, and for $a,b\in \{0, 1\}^n$ it is identical to the Hamming distance.
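The agreement with the Hamming distance can be checked on bit vectors with a small Python sketch. The helper names `l1_distance` and `hamming_distance` and the sample data are made up for illustration.

```python
def l1_distance(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def hamming_distance(a, b):
    """Number of positions where the two sequences differ."""
    return sum(1 for ai, bi in zip(a, b) if ai != bi)


# On bit vectors each |a_i - b_i| is 0 or 1, so the two measures coincide.
a_bits = [1, 0, 1, 1, 0]
b_bits = [0, 0, 1, 0, 0]
assert l1_distance(a_bits, b_bits) == hamming_distance(a_bits, b_bits)

# On real values in [0, 1], the same function gives a graded result
# instead of just counting mismatched positions.
a = [0.9, 0.1, 0.5]
b = [0.8, 0.1, 0.0]
print(l1_distance(a, b))
```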

  • 0
    Why not. I'm not sure it's relevant for me; I'll clarify my question. (2012-08-31)