$\begingroup$

Given:

  1. (latitude, longitude) points $P_1, P_2,\ldots, P_n$.
  2. Presumably, all the points should form a dense cloud. However, noise is possible.

Needed:

  • The virtual center of the points.

For instance, 99% of the points may lie within a circle of radius 1 km, while the remaining 1% are scattered outside that circle, each more than 1 km from any point inside it. Then this 1% is noise.

Unfortunately, I do not know how to define the noise properly. But the virtual center I am looking for should be close enough to most of the points. If most of the points are close to it, then I do not mind if some are far away.

If it is not too hard, I would like to be able to recognize more than one dense cloud amongst the points. In that case, each cloud could be reduced to its virtual center, and I would then have to find the virtual super center of the cloud formed by those virtual centers. That super center is the final result.
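
To make the "several clouds, then a super center" idea concrete, here is a minimal Python sketch. It is only one possible reading of the description above, not an established answer: it groups points by a simple distance threshold (connected components of the "closer than `eps`" relation), drops groups smaller than `min_size` as noise, and averages the remaining cloud centers with equal weight. The names `find_clouds`, `super_center`, `eps` and `min_size` are hypothetical, and the points are assumed to already be planar (x, y) pairs.

```python
import math

def find_clouds(points, eps, min_size):
    """Group points that chain together through neighbours within eps.

    Groups with fewer than min_size points are treated as noise and dropped.
    Uses O(n^2) distance checks, which is fine for a sketch.
    """
    unvisited = set(range(len(points)))
    clouds = []
    while unvisited:
        seed = unvisited.pop()
        group, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            # Collect every not-yet-assigned point close to point i.
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps]
            for j in near:
                unvisited.remove(j)
            group.extend(near)
            frontier.extend(near)
        if len(group) >= min_size:
            clouds.append([points[k] for k in group])
    return clouds

def centroid(pts):
    """Plain average of a non-empty list of (x, y) pairs."""
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def super_center(points, eps, min_size):
    """Average of the cloud centers, or None if no cloud was found."""
    centers = [centroid(c) for c in find_clouds(points, eps, min_size)]
    return centroid(centers) if centers else None
```

Each cloud counts once in the final average regardless of how many points it holds, which matches the wording above; weighting the centers by cloud size would be a different design choice. Both `eps` and `min_size` have to be chosen from knowledge of the data.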

I am not a mathematician, so my descriptions are vague. But I am pretty sure that this is a well known problem and it probably has a trivial solution.

Thanks.

P.S.

This question is similar to Detect Abnormal Points in Point Cloud. However, my space is two-dimensional, although that probably does not matter.

EDIT

The points are indeed on the surface of a sphere, a spheroid actually, Earth more precisely. However, the distances between them are not large enough to require taking the Earth's curvature into account, so it may be safely assumed that the surface is flat, with longitude as X and latitude as Y.
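
As an illustration of the flat-surface assumption, the sketch below maps (latitude, longitude) pairs in degrees to local (x, y) coordinates in kilometres. It goes one step beyond using raw degrees as X and Y: a degree of longitude covers less ground away from the equator, so longitude differences are scaled by the cosine of the mean latitude. The function name `to_local_xy` and the constant `EARTH_RADIUS_KM` are assumptions made for this example, and the sketch does not handle points that straddle the ±180° meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def to_local_xy(points):
    """Map (lat, lon) in degrees to (x, y) in km on a local flat plane.

    The plane is anchored at the mean latitude and longitude of the input,
    so the approximation is only reasonable for points that are close
    together relative to the size of the Earth.
    """
    lat0 = sum(lat for lat, _ in points) / len(points)
    lon0 = sum(lon for _, lon in points) / len(points)
    cos_lat0 = math.cos(math.radians(lat0))
    xy = []
    for lat, lon in points:
        # East-west distance shrinks with the cosine of the latitude.
        x = math.radians(lon - lon0) * cos_lat0 * EARTH_RADIUS_KM
        y = math.radians(lat - lat0) * EARTH_RADIUS_KM
        xy.append((x, y))
    return xy
```

With the points expressed in kilometres, a threshold such as "1 km from the center" can be applied directly as a Euclidean distance.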

  • 0
    What you could do is find the center of "mass" of the points (assigning them all equal mass, 1 for instance). Then look at which points are most distant from that center, and throw those points out. Everything depends, though, on what you mean by "distant". I think a sensible choice would be to assume a multivariate normal distribution of the points and throw out those that are more than $n\sigma$ away from the center ($n$ being 4, for instance). Then recompute the center of mass after you've thrown out those points. 2012-09-14
  • 0
    Could you post your reply as an answer? I know the wiki helps; still, if you could elaborate a bit on `multivariate normal distribution`, that would be great. 2012-09-14
  • 0
    I don't think my answer really addresses all your issues. It doesn't address the issue of how to distinguish different clouds in the data. 2012-09-14
  • 0
    Yep, it is a problem. 2012-09-14
  • 1
    Does the LAT/LON indication mean that the points are on the surface of a sphere? And should the resulting point also be on the same surface? 2012-09-15
  • 0
    Please refer to the edit of my question. 2012-09-15
  • 0
    How is that related to graph theory? 2012-09-15
  • 0
    It probably is not. I just did not have any idea how to tag it. You are welcome to edit the tags. 2012-09-15
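
The procedure suggested in the first comment (center of mass, discard points more than $n\sigma$ away, recompute) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the commenter's exact method: instead of fitting a full multivariate normal distribution, it treats the spread as the same in every direction and takes $\sigma$ to be the root-mean-square distance of the current points from their center. The names `trimmed_center`, `n_sigma` and `max_iter` are hypothetical, and the points are assumed to be planar (x, y) pairs.

```python
import math

def centroid(points):
    """Center of mass of (x, y) pairs with equal mass."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def trimmed_center(points, n_sigma=4.0, max_iter=20):
    """Repeatedly drop points farther than n_sigma * sigma from the center.

    sigma is the RMS distance of the currently kept points from their
    centroid (an isotropic simplification, assumed for this sketch).
    With n_sigma >= 1 at least one point always survives.
    Returns (center, kept_points).
    """
    kept = list(points)
    for _ in range(max_iter):
        cx, cy = centroid(kept)
        dists = [math.hypot(x - cx, y - cy) for x, y in kept]
        sigma = math.sqrt(sum(d * d for d in dists) / len(kept))
        survivors = [p for p, d in zip(kept, dists) if d <= n_sigma * sigma]
        if len(survivors) == len(kept):
            break  # nothing was removed, so the center is stable
        kept = survivors
    return centroid(kept), kept
```

Because a distant outlier inflates $\sigma$ as well as shifting the center, the loop repeats until a pass removes nothing. As noted in the comments, this does not distinguish between several separate clouds.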

2 Answers