
Suppose $V$ is an $N\times n$ matrix whose columns are independently and uniformly distributed on $\mathbf S^{N-1}$, the surface of the unit sphere in $\mathbf R^N$. I conjecture that $V^TV$ approaches the identity matrix in norm (say, any norm equivalent to the Frobenius norm) as $N\to\infty$, in expectation or even almost surely. In general, what is the joint distribution of the entries of $V^TV$ for given finite $n$ and $N$?

Is the conjecture correct? Do we need random matrix theory to obtain the answers?

1 Answer


I suspect you do not need a theory of random matrices to prove this.

In any fixed basis, each component of these unit vectors has a distribution depending on $N$: for example, with $N=3$ the components are uniformly distributed on $[-1,1]$. As $N$ increases, the probability that a component is close to $0$ slowly increases towards $1$; the density is $\dfrac{\Gamma\left(\frac{N}{2}\right)}{\Gamma\left(\frac{N-1}{2}\right)\Gamma\left(\frac{1}{2}\right)}(1-x^2)^{(N-3)/2}$ for $x \in [-1,1]$.
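A quick Monte Carlo check of this density's basic properties (a sketch, assuming unit vectors are sampled by normalizing Gaussian vectors, which is a standard way to get the uniform distribution on the sphere): the coordinate should have mean $0$ and variance $1/N$, and be uniform on $[-1,1]$ when $N=3$.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unit_vectors(N, samples, rng):
    """Sample points uniformly on S^{N-1} by normalizing Gaussian vectors."""
    g = rng.standard_normal((samples, N))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

# A single coordinate x of a uniform point on S^{N-1} has density
# proportional to (1 - x^2)^{(N-3)/2}, with mean 0 and variance 1/N.
for N in (3, 10, 100):
    x = random_unit_vectors(N, 200_000, rng)[:, 0]
    print(N, x.mean(), x.var())  # variance should be close to 1/N
```

The shrinking variance $1/N$ is exactly the "components concentrate near $0$" statement above.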

But the distribution of these component values for given $N$ is also the distribution of the cosine of the angle between any two unit vectors in $V$ (imagine choosing the basis so that the first vector is $(0,0,1,0,0,\ldots,0)^T$ and then taking the dot product of the other vectors with it). So the distribution of the angles themselves is such that, as $N$ increases, the probability that they are close to $\frac{\pi}{2}$ (i.e. $90^\circ$) slowly increases towards $1$. In this sense $V$ approaches an orthonormal matrix as $N$ increases, provided $n$ remains constant. ($n=N$ would be a different question.)

Actually doing the simulation, the convergence is indeed fairly slow. It seems that to have a probability greater than $90\%$ that the angle between a pair of random unit vectors is between $85^\circ$ and $95^\circ$, i.e. their dot product lies between $\cos\left(\frac{19}{36}\pi\right)$ and $\cos\left(\frac{17}{36}\pi\right)$, you need $N$ to be about $357$.
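A simulation along these lines might look as follows (a sketch; the function name and the $200{,}000$-sample count are my own choices). For $N=357$ the estimated probability comes out close to the $90\%$ figure quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)

def prob_near_orthogonal(N, samples, rng, tol_deg=5.0):
    """Monte Carlo estimate of the probability that the angle between two
    independent uniform unit vectors in R^N is within tol_deg of 90 degrees."""
    u = rng.standard_normal((samples, N))
    v = rng.standard_normal((samples, N))
    cosines = np.einsum('ij,ij->i', u, v) / (
        np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1))
    # angle in (85, 95) degrees  <=>  |cos(angle)| < cos(85 degrees)
    return np.mean(np.abs(cosines) < np.cos(np.deg2rad(90 - tol_deg)))

print(prob_near_orthogonal(357, 200_000, rng))  # roughly 0.90
```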

It would need to be higher if $n \gt 2$ and you wanted all the angles in this range: the angles are not quite independent.
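The dependence between the angles is not actually an obstacle: a union bound needs no independence. A sketch of the argument for general fixed $n$, writing $p_N(\varepsilon)$ for the single-pair probability (the same for every pair, by symmetry):

$$\Pr\Bigl(\max_{i<j} \bigl|\langle v_i, v_j\rangle\bigr| > \varepsilon\Bigr) \;\le\; \binom{n}{2}\, p_N(\varepsilon), \qquad p_N(\varepsilon) \xrightarrow[N\to\infty]{} 0,$$

so with $n$ fixed, all off-diagonal entries of $V^TV$ tend to $0$ in probability, while the diagonal entries are identically $1$.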

  • You are considering the marginal distribution of the correlation between any two unit vectors in $V$. What you said is true in that case, and for $n=2$ I already know it. However, that is NOT my question: I am asking not for the marginal but for the joint distribution. (2017-01-27)
  • @Hans: My point is that if $n$ is constant while $N$ increases without limit, then it does not matter: each angle is likely to be close to a right angle, and since there is a fixed number $\frac{n(n-1)}{2}$ of them, for large $N$ they are all likely to be close to orthogonal. (2017-01-27)
  • If you read my question again, you will see that that is already my **conjecture**. What do you mean by "likely"? I want a proof, not a repetition of my conjecture. (2017-01-27)
  • Also, I have just edited my question to eliminate the ambiguity in what I mean by the correlation matrix. (2017-01-27)
  • You misinterpreted my "likely". My point is that your conjecture is true in the sense that, given any specified measure of "almost orthogonal", there is an $N$ for which the probability that a given pair of random unit vectors is not "almost orthogonal" is as close to $0$ as you might specify. The probability that at least one of the pairs of unit vectors is not "almost orthogonal" is then no more than that probability times $\frac{n(n-1)}{2}$, and therefore for sufficiently large $N$ it can be made as small as you might specify... (2017-01-27)
  • ... But the unit vectors will almost never be exactly orthogonal, and the probability that they are "almost orthogonal" to a pre-specified degree is never exactly $1$. That is what I meant by "likely". (2017-01-27)
  • You are right. The proof for $n=2$ is easy, and I had figured it out myself long before posting the question. I thought the proof for arbitrary $n$ would have to consider all $n$ vectors simultaneously and would be much more complicated. I realize now that, by symmetry, I can apply exactly the same argument to each of the $\binom{n}{2}$ pairs of vectors separately, then argue along lines similar to your comment. Then we are done. (2017-01-28)