
Suppose there are $n$ points in $\mathbb R^2$. So we have ${n\choose 2}$ line segments joining pairs of these points. How do I find a straight line such that the angles between this straight line and all these line segments are not very high?

The question, as you can see, is not really very clear, and I cannot improve it since this was asked verbatim in an admissions interview. Of course, one can ask what I mean by "not very high". I apologise -- I do not know what it means.

A hint was given "Think about projections, inner product, etc."

But, as with fitting a least-squares regression line, some effort can be put into making this precise. Here is what I think:

  • One can try to minimise $\sum_i\tan^2(\theta_i)$, where $\theta_i$ is the angle between our desired line and the $i$-th line segment. I know the slopes of the line segments, so minimising this quantity amounts to solving for the slope of my line. This may give me a line, but it is probably not a standard approach.
  • I somehow feel that our desired line will be parallel to the regression line, although I can't prove it. Somehow I feel that minimising the angles will effectively mean minimising the projections orthogonal to my line. I am not at all sure about this. But this is the one closest to "projections, inner product" I guess.
  • How about the principal component analysis line? I do not know the theory, but will eigenvalues be of any assistance to me?
  • If the number of points is $n$, then the number of line segments between pairs is $O(n^2)$, and any algorithm that works on them will be at least that expensive. (2017-01-30)
  • The algorithm may be slow, but we can only think about improving it once a solution is found. First let's figure out what the solution is in this case. (2017-01-31)
  • Yes, I think the principal component analysis (PCA) approach will be fruitful, as you can see in my answer. At least it will give a quick estimate. (2017-01-31)

2 Answers

0

The absolute value of the inner product of two unit vectors is the cosine of the acute angle between the lines they span. Writing $x^i$ for the $i$-th point, compute for each pair $d^k = (x^i-x^j) / \|x^i-x^j\|$ ($k=1,\ldots,K={n\choose 2}$), which is a vector of length $1$.
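As a minimal sketch of this step, the unit vectors $d^k$ can be computed with NumPy as follows. The function name `unit_directions` and the sample points are made up for illustration, and pairs of coincident points are skipped because they define no direction (an assumption the answer does not spell out).

```python
import numpy as np

def unit_directions(points):
    """Return the unit vectors d^k for all pairs i < j of the given 2-D points."""
    pts = np.asarray(points, dtype=float)
    i, j = np.triu_indices(len(pts), k=1)   # all index pairs with i < j
    diffs = pts[i] - pts[j]                 # x^i - x^j for every pair
    norms = np.linalg.norm(diffs, axis=1)
    keep = norms > 0                        # coincident points give no direction
    return diffs[keep] / norms[keep, None]

# Hypothetical sample data: three points, hence three segments.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
d = unit_directions(points)

# |y^T d^k| for a candidate unit vector y is the cosine of the angle
# between the line spanned by y and the k-th segment.
y = np.array([1.0, 0.0])
cosines = np.abs(d @ y)
```

The array `d` has one row per segment, so the memory cost grows like $O(n^2)$, as noted in the comments on the question.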

Since a small angle corresponds to a large cosine, you can make the angles small on average by maximizing the sum of the cosines: $$\max_{y\in \mathbb{R}^2} \left\{ \sum_k |y^T d^k| : \|y\|=1 \right\}$$ or make the largest angle small by maximizing the smallest cosine: $$\max_{y\in \mathbb{R}^2} \left\{ \min_k |y^T d^k| : \|y\|=1 \right\}$$ These problems are hard to solve (due to the nonconvex constraint $\|y\|=1$), but since $$y = \cos(\theta)\begin{pmatrix}1\\0\end{pmatrix} + \sin(\theta)\begin{pmatrix}0\\1\end{pmatrix}$$ you can reduce this to a single-variable problem in $\theta \in [0,\pi)$ and perform a grid search.
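The grid search over $\theta$ might look like the sketch below. Because $|y^T d^k|$ is the cosine of the angle, small angles correspond to large values, so the code looks for the $\theta$ that makes the sum of the cosines (or the smallest cosine) as large as possible. The names `grid_search_direction`, `num_angles` and `criterion` are hypothetical, and the grid resolution is an arbitrary choice.

```python
import numpy as np

def unit_directions(points):
    """Unit vectors d^k for all pairs i < j (coincident pairs are skipped)."""
    pts = np.asarray(points, dtype=float)
    i, j = np.triu_indices(len(pts), k=1)
    diffs = pts[i] - pts[j]
    norms = np.linalg.norm(diffs, axis=1)
    keep = norms > 0
    return diffs[keep] / norms[keep, None]

def grid_search_direction(points, num_angles=1800, criterion="sum"):
    """Return (theta, y) maximising the chosen cosine score over a grid of angles."""
    d = unit_directions(points)
    # A line has no orientation, so theta in [0, pi) covers every direction.
    thetas = np.linspace(0.0, np.pi, num_angles, endpoint=False)
    y = np.column_stack((np.cos(thetas), np.sin(thetas)))  # one unit vector per theta
    cosines = np.abs(y @ d.T)                              # shape (num_angles, K)
    if criterion == "sum":
        score = cosines.sum(axis=1)    # small angles on average
    elif criterion == "min":
        score = cosines.min(axis=1)    # small worst-case angle
    else:
        raise ValueError("criterion must be 'sum' or 'min'")
    best = int(np.argmax(score))
    return thetas[best], y[best]

# Hypothetical sample data.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
theta_sum, y_sum = grid_search_direction(points, criterion="sum")
theta_min, y_min = grid_search_direction(points, criterion="min")
```

The cost is proportional to the number of grid angles times the number of segments, so this is only practical for moderate $n$; a finer grid or a local refinement around the best angle would improve accuracy.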

0

For an approximation you can try principal component analysis (PCA), also known as the Karhunen-Loève transform. First compute the mean vector

$${\bf m } = \frac{1}{n}\sum_{i=1}^n{\bf v}_i$$

of your position vectors ${\bf v}_i$, and then the scatter matrix $${\bf Q} = \sum_{i=1}^n({\bf v}_i-{\bf m})({\bf v}_i-{\bf m})^T$$

and finally find the eigenvector of $\bf Q$ that belongs to the eigenvalue of largest modulus. This can be done, for example, with the power method. All in all this takes $O(n)$ matrix additions and multiplications.
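One way to see why $O(n)$ work on the points says something about all ${n\choose 2}$ segments is the identity $$\sum_{i<j}({\bf v}_i-{\bf v}_j)({\bf v}_i-{\bf v}_j)^T = n\,{\bf Q},$$ so the leading eigenvector of ${\bf Q}$ maximises the sum of squared projections of the segments onto the line. This weights long segments more heavily than the unit-vector formulations do, which is why it is only an approximation. The sketch below computes the direction and checks the identity numerically; it uses `numpy.linalg.eigh` instead of the power method mentioned above, and the names `pca_direction` and `pairwise_scatter` as well as the sample points are made up for illustration.

```python
import numpy as np

def pca_direction(points):
    """Return (y, Q): the unit eigenvector of the largest eigenvalue of Q, and Q itself."""
    v = np.asarray(points, dtype=float)
    m = v.mean(axis=0)                     # mean vector m
    centered = v - m
    Q = centered.T @ centered              # 2x2 scatter matrix, O(n) work
    eigvals, eigvecs = np.linalg.eigh(Q)   # symmetric Q; eigenvalues in ascending order
    return eigvecs[:, -1], Q

def pairwise_scatter(points):
    """Sum of (v_i - v_j)(v_i - v_j)^T over all pairs i < j, computed directly in O(n^2)."""
    v = np.asarray(points, dtype=float)
    i, j = np.triu_indices(len(v), k=1)
    diffs = v[i] - v[j]
    return diffs.T @ diffs

# Hypothetical sample data.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.5), (3.0, 2.5), (4.0, 4.0)]
y, Q = pca_direction(points)

# Numerical check of the identity: pairwise scatter equals n * Q.
identity_holds = np.allclose(pairwise_scatter(points), len(points) * Q)
```

The sign of `y` is arbitrary, since `y` and `-y` span the same line.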