
The average distance between 2 points with normally distributed coordinates has already been derived on this site, and I found a white paper with the generalized result for 2 points in $N$ dimensions.

But I am at a loss as to how the average distance changes as the number of points increases to some $M > 2$. The ultimate goal is to determine the ratio of the average distance among $M_1$ points to the average distance among $M_2$ different points (all drawn from the same normal distribution).

Ideally the result would be generalized to $N$ dimensions, but if I'm asking for too much, I'd be more than happy to learn of the 2D result.

Thanks for any help...

EDIT: To clarify my request: if $M = 3$, the average distance between the 3 points with (random) normally distributed coordinates would be $[\textrm{dist}(p_1, p_2) + \textrm{dist}(p_1, p_3) + \textrm{dist}(p_2, p_3)] / 3$. However, if an approximation to the ratio mentioned in paragraph 2 can be found using a statistical approach (perhaps the ratio of two confidence intervals, for $M_1$ and $M_2$ respectively, that the centroid is at the origin), that would work for me. Thanks, Sasha, for this idea.

  • 0
    I was aiming for the mean of pair-wise distances. Forgive my ignorance, but wouldn't the centroids of the two data sets be the same on average, regardless of the number of points? 2012-11-02

1 Answer


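By linearity of expectation, the expected value of the mean of the pairwise distances is just the expected distance between two points: each of the $\binom{M}{2}$ pairs has the same expected distance, and averaging equal expectations leaves that value unchanged. The expected average therefore does not depend on $M$, so the ratio of the expected averages for $M_1$ and $M_2$ points is 1.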
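The linearity argument can be checked numerically. The following is a minimal Monte Carlo sketch, assuming independent standard normal coordinates (zero mean, unit variance); the function names, trial count, and seed are illustrative choices, not anything from the question. It compares the simulated mean pairwise distance for several values of $M$ against the closed form $2\,\Gamma((N+1)/2)/\Gamma(N/2)$ for two points, which equals $\sqrt{\pi}$ in 2D.

```python
import math
import numpy as np


def mean_pairwise_distance(points):
    """Average Euclidean distance over all unordered pairs of rows in `points`."""
    m = len(points)
    # Pairwise difference vectors via broadcasting: shape (m, m, n_dims).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Keep each unordered pair once (strict upper triangle).
    upper = np.triu_indices(m, k=1)
    return dists[upper].mean()


def simulated_mean(m, n_dims=2, trials=20000, seed=0):
    """Monte Carlo estimate of the expected mean pairwise distance for m points.

    Assumption: every coordinate is an independent standard normal draw.
    """
    rng = np.random.default_rng(seed)
    samples = [
        mean_pairwise_distance(rng.standard_normal((m, n_dims)))
        for _ in range(trials)
    ]
    return float(np.mean(samples))


def exact_two_point_mean(n_dims):
    """Expected distance between two standard normal points in n_dims dimensions."""
    return 2.0 * math.gamma((n_dims + 1) / 2) / math.gamma(n_dims / 2)


if __name__ == "__main__":
    print("exact, 2 points, 2D:", exact_two_point_mean(2))
    for m in (2, 3, 10):
        print("M =", m, "simulated:", simulated_mean(m))
```

Up to sampling noise, every simulated value should agree with the two-point result, which is what makes the ratio of the expected averages equal to 1. The spread of the average around that value does shrink as $M$ grows, but its expectation does not move.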

  • 0
    Embarrassingly for me, you are correct. Thanks. 2012-11-02