7
$\begingroup$

I've been struggling with this problem for weeks, and couldn't find an appropriate algorithm to solve it. Could you guys please give me some advices or suggestions in addressing this question. Or if you know anyone who could answer this question, please send it to them. I really need the solution as I have been trying to figure it out for quite long time :| Many thanks!

As can be seen in the picture below, I have more than 200 lines in 3D. They consist of 3 different groups of lines, and the common property of the lines in each group is that they pass close to the same point. However, in this problem, we don't know which line belongs to which group. What is given are only the 3D coordinates (x,y,z) of the 2 ends of all the lines.

My job is to find an algorithm to match/fit/register a triangle (a rigid body) to those lines such that the sum of the sum of distances between 3 vertexes of the triangle and their corresponding lines is minimized. In some cases, one vertex may be the closest point in a group, but the other vertexes would be dragged further away from their corresponding lines (because they are in a rigid body), thus this sum will not be minimal. What I need is the best positions of the 3 vertexes (the best fit of the triangle) to minimize the global sum.

In conclusion, given more than 200 sets of (x1,y1,z1,x2,y2,z2) (which are coordinates of 2 ends of a line), I need to find the best positions of 3 vertexes of a triangle to fit in those lines (given the relative distance between the 3 vertexes).

Without the rigid-body condition, I have successfully determined the positions of the 3 independent points using the Expectation-Maximization (EM) clustering algorithm. However, sometimes 2 or 3 points are assigned to the same position. Then, I realized I had forgotten the "rigid-body" condition. With this added condition, this error will not occur. In the EM algorithm, I can derive the formula for each repositioning step for each point. But when it comes to a rigid-body, I don't know how to figure out the translation and rotation matrix to fit the triangle to the lines.

Thank you for reading this post. Your help would be greatly appreciated! my question

  • 0
    Out of curiosity, incidentally, where does this problem come from?2012-08-30

3 Answers 3

3

Minimizing a sum of distances is a bit more hassle than minimizing a sum of squares of distances. Unless you have a really good theoretical reason to focus on the sum of distances, I would suggest to minimize the sum of squares.

In that case, a simple iterative approach to this might work and converge quickly. You can alternatingly optimize the positions of the points and the assigments of the lines to the points. Start by randomly (e.g. uniformly) assigning each line to a point. Then obtain a first estimate for the position of each point by solving the linear equation you get from differentiating the sum of the squares of its distances to the lines assigned to it. Then reassign each line to the point closest to it, reposition, reassign, ... – this should converge rather quickly; it can't get stuck in a loop because each step that changes the configuration decreases the objective function. If you ever get into a degenerate situation where one of the points has less than two lines assigned to it, just assign a random line to it if necessary, and then place it somewhere on the one line assigned to it.

The same approach should also work if you really do want to minimize the sum of distances, but then you don't get a linear equation in the positioning step, and you'll have to do perform a non-linear optimization for the point positions.


[Update:]

Both Steven and I forgot to address the problem of optimizing the triangle's position, given an assignment of the lines to the points, taking into account the rigid body constraint. Here's how I'd go about this; this approach can be used either with Steven's or with my suggestion for assigning the lines.

I had taken your use of the term "line" at face value, but I see from Steven's answer that since you speak of two ends of a line you may have meant line segments. That would complicate things, and since the diagrams look like you might get away with extending the line segments to lines, I'll only deal with lines.

The distance of a point $\vec p$ from a line through $\vec x$ in the direction of unit vector $\vec n$ can be expressed as $(\vec p-\vec x)\times\vec n$. You can write the objective function as

$ f(\Omega,\vec a)=\sum_{i,j}\left|(\Omega\vec p_i+\vec a-\vec x_{ij})\times\vec n_{ij}\right|^2\;, $

where the variables represent the rigid body motions, $\Omega$ a rotation matrix and $\vec a$ a translation vector, $\vec p_i$ are the three vertices in some reference position of the triangle, and $\vec x_{ij}$ and $\vec n_{ij}$ describe the $j$-th line assigned to the $i$-th point.

We can first differentiate this with respect to $\vec a$ to obtain

$ \sum_{i,j}\left((\Omega\vec p_i+\vec a-\vec x_{ij})\times\vec n_{ij}\right)\times\vec n_{ij}=\vec0\;. $

This is a linear equation for $\vec a$ that you can solve by inverting a $3\times3$ matrix that doesn't depend on $\Omega$, which gives you $\vec a$ as a linear function of $\Omega$. Substituting that into the objective function reduces it to

$ f(\Omega)=\sum_{i,j}\left|(\Omega\vec p_i+\vec a(\Omega)-\vec x_{ij})\times\vec n_{ij}\right|^2\;. $

To minimize this with respect to $\Omega$, I would write $\Omega=\Omega_0\exp(\mathrm iA)$, with $\Omega_0$ the current value of $\Omega$ and $A$ antisymmetric, differentiate with respect to $A$ and set $A=0$ to get a gradient with respect to $A$. Then you can satisfy the constraint by antisymmetrizing the gradient; this gives you the direction of $A$ along which to search, which defines a family of rotations parametrized by a single angle, which you can optimize by standard one-dimensional optimization.

This is an iterative approach. To find the optimal $\Omega$ for a given assignment of the lines to the points, you'd have to iterate until you achieve the desired accuracy, but if you're using this together with my approach of optimizing the assignment of the lines to the points, it might be more efficient to alternate between single one-dimensional optimizations and reassignment of the lines until the assignment no longer changes, and only then iterate the search to full convergence.

  • 0
    @Duc: I added my ideas about optimizing the vertex positions under the rigid-body constraints. Sorry it took me a while. I hope this helps; in case you have follow-up questions, I might not be very responsive during the next few days since I have a busy schedule, but I'll try to check back and see how you're doing with this.2012-08-31
1

Here is a idea:

//For minimization. //Optimization tip with a danger of not finding the correct answer:  //choose V a small number (Maybe iterating over some moderate vicinity   //of a small number outweighs the inaccuracy.)    for all "possible assignments" of lines to each group(:=M) k=1:M       for each group i=1:3    find the average of the coordinates specifying a line     if the "variance_x" Or "variance_y" Or "variance_z" is more than V      ignore the whole k'th assignment       for all "effective possible assignments" of lines to each group(:=N)  j=1:N         for group i=1:3         project all lines on x-y plane          for each line           find all collision points            x_xy = average(x_collisions)         y_xy = average(y_collisions)          project all lines on x-z plane           for each line            find all collision points             x_xy = average(x_collisions)          z_xz = average(z_collisions)        project all lines on y-z plane          for each line           find all collision points             y_yz = average(y_collisions)          z_yz = average(z_collisions)      x_i = average(x_xy,x_xz)         y_i = average(y_xy,y_yz)       z_i = average(z_xz,z_yz)        //Calculate the quantity you want to minimize: sum_of_distances       P_j = sum_of_distances     choose the minimum of P_j 

Wikipedia: for calculating the Variance

  • 0
    An optimization that's only triggered once you visit an assignment of the lines to the points doesn't change the exponential time complexity. Just executing a single instruction for each of the $3^{200}/3!$ assignments of the $200$ lines to the points of the triangle would take many orders of magnitude longer than the estimated age of the universe.2012-08-24
1

I would take a somewhat different approach than joriki's solution: since you know that the lines in each cluster are close to each other, I would compute the matrix of distances between line segments. Once you have that matrix, you can choose a threshold so that the segments naturally cluster into three groups. One simplistic way of doing this would be to choose a value (iteratively) so that the induced graph (where the nodes represent line segments and two nodes have an edge between them only when the distance between segments is less than the threshold) has three connected components; there are easy ways of counting connected components with DFS. More realistically (and more complicatedly), you could choose some slightly looser threshold and then use social clustering algorithms to determine the three groups of lines. Yet another way (loosely related) would be to use an iterative optimization tecchnique to assign lines to groups, where the 'weighting function' being optimized is the sum over lines of the maximum distance from that line to any other line in its current group.

It's very possible that domain-specific knowledge will make this process even more consistent; for instance, if you know that the vertices of your triangle will always be in the vicinity of the centers of your lines then you can 'trim' your segments for the distance computation and eliminate any false-positive short distances based on two lines from different clusters having endpoints near each other.

Once you have your lines clustered appropriately, then finding the optimal position of your triangle should be a relatively straightforward problem in numerical optimization: first, choose an assignment of the vertices of your triangle to the clusters of lines (you'll actually want to repeat the procedure for all six possible assignments, and choose the one that yields the best overall result). Once you've set which clusters are affiliated with which vertices, then associated with any placement of your triangle you can compute a fitness function: sum the (squared) distances between each vertex and all of the lines in its affiliated cluster. The smaller this value is, the better the fit is, and this fitness function shouldn't have too many local minima to get trapped in, so you can use standard optimization techniques (e.g., ideas like gradient descent) to find the best fit.

Gradient descent works for optimization by 'always going downhill' : essentially, from an initial placement and orientation you compute the difference in fitness function among all the various 'directions' you can move, find the direction with the largest negative slope, and move in that direction until you hit a minimum — then repeat this procedure until you're 'close enoguh'. This becomes a little bit trickier since your configuration space does include an orientation (so you'll need a good means of optimizing over your 'sphere of directions'), but the right representation will go a long way towards making that easier. A good starting point will make these searches much more efficient, too — for instance, you might want to approximate vertex positions by finding the centroid of the 'approach points' of each cluster, then start with your triangle in that plane centered about the midpoint of those three centroid points.

  • 0
    @DucTrung Mea culpa - much like joriki, I also forgot the condition that you're fitting to a rigid triangle! The optimization step may be a little more complicated than I've offered, but the key is that once you have your clustering done it's still easy to find a _fitness_ function (for instance, the sum of squared distances from the vertices of your triangle to the lines in their corresponding cluster) that you can minimize using numerical optimization techniques. I'll try and update the details in my answer soon.2012-08-30