6
$\begingroup$

I want to find the minimum sum of distances of a point(x, y) from other points which lies in the x-y plane. There are 8 cells which are 1 unit far from any given cell. Here distance between two points is not meant to calculate manhattan distance between those points. Here point (x+1, y+1) is 1 unit far from point (x, y)Example Image to describe the distance between two points. In the above diagram point, point p is 4 unit far from point t. Since there are 4 cells between p and t. Point s is 2 unit far from point p. Since there are two cells between p and s.
The point from where I want to find the sum of distance is one of the points which lies in the x-y plane is one, which gives the minimum sum of distances.
Example image of points lying in x-y plane For example. In the above example image, point g(3,2) is in the minimum distance from all other points. The resultant distance is 2(a-g)+ 1(b-g)+ 2(c-g)+ 1(d-g)+ 1(e-g)+ 2(f-g)=9. My objective is to find the minimum sum of distances of a point, which is 9 in this example. Thank you.

To find out the minimum sum of distances, my idea is is get that point first, which is close to all other points. Once i get the point, i can get the distance from all points to that point and finally sum up. But i don't know how can i get the point which is close to all other points.

  • 0
    This is very unclear! The sum of the distances from $g$ to the other points is $3 + 2\sqrt 2 + 2 \sqrt 3$, which is certainly not equal to $9$. Why is the distance from $(x_1,y+1)$ to $(x,y)$ equal to $1$? Try to formulate the problem more clearly.2012-04-13
  • 0
    @TonyK: OP is not using the usual metric, as he explains. His metric is the maximum of the difference in x and the difference in y. The calculations are correct2012-04-13
  • 0
    @RossMillikan: Here the distance is not Manhattan Distance, it's [Chebyshev Distance](http://en.wikipedia.org/wiki/Chebyshev_distance) But my question is to get the minimum sum of distance of a point from others. We have to choose this specific point from all the given points. Once this specific point is selected, one can calculate the distance easily. But how to choose this specific point. In the above example, the specific point is `point g`. which gives the sum of minimum distance.2012-04-14
  • 1
    @RaviJoshi: The medians give the optimal point in the plane if you are not restricted to a point within your set. Then I believe you can just look at points which are "close to" that point in your set, but do not know how to make "close to" precise.2012-04-14
  • 0
    @RossMillikan: From [Wikipedia](http://en.wikipedia.org/wiki/Geometric_median) says that _The geometric median of a discrete set of sample points in a Euclidean space is the point minimizing the sum of distances to the sample points._ Will this help me?2012-04-14
  • 0
    Not specifically, as your space is not Euclidean, and it says there is no easy way to calculate it anyway.2012-04-14
  • 0
    okay. So first i should calculate `(x1+x2+x3+x4...+xn)/n` and `(y1+y2+y3+y4...+yn)/n` . This will give me center point. Then i have to find nearest point somehow.2012-04-14

2 Answers 2

3

Since you want a point from the given points (based on your comments to Ross' answer) here is an $O(n \log n)$ time algorithm, based on the idea of rotating by $45^{\circ}$ and using the equivalence with Manhattan Distance. This only works for the 2D plane.

Assume the points are $(p_n, q_n)$. You transform them to $(q_n - p_n, q_n + p_n) = (x_n , y_n)$ say.

Now we will see how to compute the $x$ component of the Manhattan distance for each point in $O(n \log n)$ time.

We will assume that $x_1 \lt x_2 \lt ...\lt x_n$. This can be done by sorting them. Also, the algorithm can be modified to deal with the case when they are not distinct. For simplicity, we assume they are unique.

Define $L_m = \sum_{j=1}^{m} x_j$ and $R_m = \sum_{j=m}^{n} x_j$. Note that $R_m = L_n - L_{m-1}$. (Define $L_{0} = 0$)

Note that the $L_i$ and $R_i$ can be computed in $O(n)$ time, by looping through the array and maintaining the cumulative sum.

Now for a given $x_k$ we have that $\sum_{j=1}^{n} |x_k - x_j| = \sum_{j=1}^{k-1} (x_k - x_j) + \sum_{j=k+1}^{n} (x_j - x_k) = R_{k+1} - L_{k-1} - (n-2k+1)x_k$

(This is where we used the distinctness assumption, but can be modified easily to deal with it).

Similarly do with $y_k$.

Thus we can compute the sum of (manhattan) distances for each point to other points in total $O(n)$ time.

Finding the minimum among them is another $O(n)$ (or just keep track while computing the distances).

  • 0
    @RaviJoshi: Yes, that is right.2012-04-17
  • 0
    (Sorry for deleting the previous comment, i was just trying to edit that comment but getting some error. Please don't mind.)Thanks for the answer. Hey, In `Lm` formula upper limit of `Σ` should be `n`(I am not sure, please check it once). Correct me if i am wrong, _For every point i have to calculte the sum of `manhattan distances` then the minimum sum is the required answer._2012-04-17
  • 0
    @RaviJoshi: Actually in $L_m$ the upper limit must be $m$. I will edit.2012-04-17
  • 0
    Yeah. well, as `Rm` and `Lm-1` are related to each other by the equation `Rm=Ln-L(m-1)`. So i need to calculate `Ln` once.2012-04-17
  • 1
    @RaviJoshi: You need to calculate $L_1, L_2, \dots ,L_n$. And those can be calculated by making one pass over $x_i$.2012-04-17
  • 0
    @Aryabhata can u tell me the name of dat algorithm wich u explained .plz do reply i need it urgently2012-09-04
  • 0
    @deep: Can you be more specific?2012-09-05
3

If you are not restricted to the origin being one of your points, you want to transform the points $(x_i,y_i)$ to $(x_i+y_i,x_i-y_i)$, take the median of all the transformed coordinates to get $(x'_m,y'_m)$, then transform back to $(\frac{x'_m+y'_m}2,\frac{x'_m-y'_m}2)=(x_m,y_m)$. This is explained a bit under "An optimality property". If you are restricted to one of your points, you need to look around a bit, but that gives a starting point.

Added: this has been corrected for the metric in use instead of the Manhattan metric

  • 0
    This is not the manhattan distance...2012-04-13
  • 0
    @Aryabhata: You are right. This is not manhattan distance. `Point (x+1, y+1)` is `1` unit far from `Point (x, y)`.2012-04-13
  • 0
    @Ross Millikan: These points can be situated anywhere in x-y plane. I mean `x ε {-∞, +∞}` and `y ε {-∞, +∞}`. Will median works here correctly?2012-04-13
  • 1
    But it is the Manhattan distance rotated $45^\circ$.2012-04-13
  • 0
    @RahulNarain: Here if first i would get the point which is closer to other points, then i can easily get the distance from other points. This is the minimum distance. but again how can i get the point which is closer to other points.2012-04-13
  • 0
    @RahulNarain: What do you mean? Distance between $(x_1,y_1)$ and $(x_2, y_2)$ is $\max\{|x_1 - x_2|, |y_1 - y_2|\}$ I think. Curiously, that turns out to satisfy the triangle inequality!2012-04-13
  • 0
    It seems that the OP's metric is derived from the norm $|(x,y)| = \max(x,y)$. If this is the case, I don't think the median gives the right answer.2012-04-13
  • 0
    @Aryabhata: Yes, I think it's the $L_\infty$ norm. [Wikipedia has a nice picture illustrating what I mean.](https://en.wikipedia.org/wiki/File:Vector_norms.svg) So if you rotate everything by $45^\circ$, perform Ross's procedure, and rotate back, you should get the right answer.2012-04-13
  • 0
    I thought you could essentially work in each axis separately, but that is assuming the Manhattan distance2012-04-13
  • 0
    @RahulNarain: This is true when we just consider the distance from $0$, here we are taking the sum from multiple different points. Why should that still work?2012-04-13
  • 0
    @Aryabhata, because $\lVert\mathbf p-\mathbf q\lVert_\infty = \lVert\mathbf R(\mathbf p-\mathbf q)\rVert_1 = \lVert\mathbf{Rp}-\mathbf{Rq}\rVert_1$, where $\mathbf R$ is rotation by $45^\circ$.2012-04-13
  • 0
    @RahulNarain: Yeah. Duh! I have been making calculation mistakes left and right.2012-04-13
  • 1
    @RossMillikan: Rahul is right. You can transform the point $(x_n, y_n)$ to $(y_n -x_n, y_n + x_n)$ and get the medians and then transform back. I suggest you edit the answer.2012-04-13
  • 0
    /me high-fives @Ross2012-04-13
  • 0
    @RahulNarain: Would like to mention: that was a neat observation :-)2012-04-13
  • 0
    @Aryabhata: Can you please explain me in detail what is the answer and what are the steps to get the answer? I am confused.2012-04-14
  • 0
    @RaviJoshi: If the distance was Manhattan (difference in x-coordinates + difference in y-coordinates), then you can minimize $\sum |x - x_i|$ and $\sum |y - y_i|$ independently. And those can be done by finding the median of the $x_i$ and $y_i$. What Ross/Rahul show is that your problem can be changed to a problem of Manhattan distance, if you transform the points as in the answer: $(x_i,y_i) \to (y_i-x_i, x_i + y_i)$. (Basically rotate by 45 degrees and scale to avoid the $\sqrt{2}$ factor).2012-04-14
  • 0
    @Aryabhata: From Wikipedia, this is [Chebyshev Distance](http://en.wikipedia.org/wiki/Chebyshev_distance). Distance between `point g(3,2)` and `point a(1,1)`=`max(3-1,2-1)=2`. As far as the problem is concern, it is about to get the minimum sum of a point from others. In the above example, if anyhow i can choose `point g` from all points, then `point g` gives the minimum sum of distances. Because `point g` is close to all other points w.r.t. other points. Now the problem is just to get the `point g` anyhow. As said by @Ross , median can give a closest point but not the exact point. what to do?2012-04-14
  • 1
    @RaviJoshi: The median will give an exact point. First transform: $(x_n, y_n) \to (y_n - x_n, x_n + y_n) = (p_n , q_n)$. Find the median of $p_n$, say $p$. THe median of $q_n$ is $q$. Now transform back $(p,q)$ to $((q-p)/2, (q+p)/2)$ and this the point you are looking for. This is an exact point. Do you want the point to be among the original given points?2012-04-14
  • 0
    Ya. The `point (q-p)/2,(q+p)/2` must be a point from original given points.2012-04-15
  • 0
    @RaviJoshi: Then the problem is straighforward, isn't it?2012-04-15
  • 0
    how come? Will the `point (q-p)/2,(q+p)/2` always lies in the given points? are you sure?2012-04-15
  • 0
    @RaviJoshi: No, I mean just compute the sum of distances for each point and pick the one which gives the minimum.2012-04-16
  • 0
    This is right but as far as the method is concern, this seems to be a hit a try approach. More appropriately i should say this is a brute force approach, where we are trying every possible point and calculating the sum. isn't? Any better approach?2012-04-16
  • 0
    @RaviJoshi: Brute force is Theta(n^2) which is usually a reasonable time for computation geometry problems. In any case, this can be done in O(n log n) by converting to Manhattan distance. Finding the x_sum for all points can be made O(n) after sorting. Similar the y_sum. I will add an answer later perhaps. btw, you need to put the @ Aryabhata (no space in between), otherwise I won't get notified.2012-04-16
  • 0
    @Aryabhata: Thank you very much. Would be easy, if you can add your answer. Thank you. Waiting for your answer. :)2012-04-17
  • 0
    @RaviJoshi: I have added an answer.2012-04-17