
I'm working on an error estimate for a numerical method, and in the process I've stumbled across the following abstract inequality which I think is true, but am having a hard time proving.

Suppose $\{\phi_i\}_{i=1}^N$ and $\{\psi_i\}_{i=1}^N$ are orthonormal bases of $\mathbb{R}^N$ with $(\phi_i,\psi_i) \ge 0$, and call the "error" between these basis vectors $e_i=\phi_i-\psi_i$. Is there a constant $C$ independent of $N$ such that $ \sum_{i \neq j} (e_i,e_j) \le C \sum_i \|e_i\|^2? $ I've written a MATLAB script to test it on thousands of random sets of orthonormal vectors and have found no counterexamples so far for $C=1$.
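For reference, here is a pure-Python analogue of that random experiment (the MATLAB script itself is not shown, so all helper names here are my own). It draws random orthonormal bases via Gram–Schmidt, enforces the hypothesis $(\phi_i,\psi_i)\ge0$ by sign flips, and records the worst observed ratio LHS/RHS; by Cauchy–Schwarz that ratio can never exceed $N-1$, which is what the final assertion checks.

```python
# Pure-Python stand-in for the MATLAB test described above (helper names mine).
import random

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def random_orthonormal_basis(n, rng):
    """Gram-Schmidt applied to random Gaussian vectors."""
    basis = []
    while len(basis) < n:
        w = [rng.gauss(0, 1) for _ in range(n)]
        for b in basis:
            c = dot(w, b)
            w = [x - c * y for x, y in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > 1e-10:          # retry in the (unlikely) degenerate case
            basis.append([x / norm for x in w])
    return basis

def trial(n, rng):
    phi = random_orthonormal_basis(n, rng)
    # psi_i = standard basis; flip signs so that (phi_i, psi_i) >= 0.
    phi = [[-x for x in p] if p[i] < 0 else p for i, p in enumerate(phi)]
    e = [[p[k] - (1.0 if k == i else 0.0) for k in range(n)]
         for i, p in enumerate(phi)]
    s = [sum(col) for col in zip(*e)]        # s = sum_i e_i
    rhs = sum(dot(v, v) for v in e)          # sum_i ||e_i||^2
    lhs = dot(s, s) - rhs                    # sum_{i != j} (e_i, e_j)
    return lhs / rhs

rng = random.Random(0)
N = 8
worst = max(trial(N, rng) for _ in range(200))
```

As the accepted answer below makes precise, random sampling tends to stay well under the true worst case.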

4 Answers

2

Note that without loss of generality we can assume the $\psi_i$ are the standard basis of $\mathbb{R}^N$. We construct $\phi_i$ as follows. Let $V$ be the vector $V := \sum_{i=1}^N \psi_i$, so that $\langle V,\psi_i\rangle = 1$ and $\langle V, V\rangle = N$. Take

$ \phi_i = -\frac{2}{N}V + \psi_i $

so

$ \langle \phi_i,\phi_j\rangle = \frac{4}{N^2}\langle V,V\rangle - \frac{2}{N}\langle V, \psi_i + \psi_j\rangle + \langle \psi_i,\psi_j\rangle = \delta_{ij} $

and

$ \langle \phi_i, \psi_i\rangle = 1 - 2/N \geq 0$

if $N \geq 2$. ($\phi_i$ is obtained by reflecting $\psi_i$ across the hyperplane $\{x : \langle x, V\rangle = 0\}$.)

Now, since $e_i = \phi_i - \psi_i = -\frac{2}{N}V$ for every $i$,

$ \langle \phi_i - \psi_i, \phi_j-\psi_j\rangle = \frac{4}{N^2}\langle V,V\rangle = \frac{4}{N} $

So

$ 4(N-1) = \frac{4}{N}\cdot (N^2 - N)= \sum_{i\neq j}\langle e_i,e_j\rangle \leq C_N \sum \|e_i\|^2 = 4 C_N $

which forces $C_N \geq N-1$.

Of course, using Cauchy-Schwarz one has $2\langle e_i,e_j\rangle \leq \|e_i\|^2 + \|e_j\|^2$; summing this over all $N(N-1)$ ordered pairs $i\neq j$, each $\|e_i\|^2$ picks up a total coefficient of $N-1$, so

$ \sum_{i\neq j}\langle e_i,e_j\rangle \leq (N-1) \sum_i \|e_i\|^2 $

and the constant $C_N = N-1$ always suffices.

Hence the constant $C_N = N-1$ is sharp.
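A quick numerical check of the construction above (a sketch; $\psi_i$ is the standard basis, $V=\sum_i\psi_i$, and $\phi_i=\psi_i-\frac2N V$): the $\phi_i$ are orthonormal, $(\phi_i,\psi_i)=1-2/N\ge0$, and the ratio of the two sums is exactly $N-1$.

```python
# Verify orthonormality, the sign hypothesis, and the extremal ratio for
# Willie Wong's construction phi_i = psi_i - (2/N) V with N = 10.
N = 10

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

psi = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
phi = [[psi[i][k] - 2.0 / N for k in range(N)] for i in range(N)]

# Orthonormality of {phi_i} and the hypothesis (phi_i, psi_i) >= 0.
gram_ok = all(
    abs(dot(phi[i], phi[j]) - (1.0 if i == j else 0.0)) < 1e-12
    for i in range(N) for j in range(N)
)
sign_ok = all(dot(phi[i], psi[i]) >= 0 for i in range(N))

e = [[p - q for p, q in zip(phi[i], psi[i])] for i in range(N)]
lhs = sum(dot(e[i], e[j]) for i in range(N) for j in range(N) if i != j)
rhs = sum(dot(v, v) for v in e)
```

Here `lhs` comes out to $4(N-1)$ and `rhs` to $4$, matching the computation in the answer.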

  • 0
    Ahh, great construction! Thanks. (2011-07-05)
1

Another try:

First note that by considering $\{-\psi_i\}_{i=1}^N$, we can assume that $e_i=\phi_i+\psi_i$. (This is just for convenience.) For $i\not=j$ we have

$ (e_i,e_j)=(\phi_i,\psi_j)+(\phi_j,\psi_i) $

and it follows that

$ \sum_{i\not=j}(e_i,e_j)=2\sum_{i\not=j}(\phi_i,\psi_j). $

We have $\psi_i=\sum_{j=1}^N (\psi_i,\phi_j)\phi_j$, and so

$ \|e_i\|^2=\sum_{j=1}^N |(\psi_i,\phi_j)-\delta_{ij}|^2=\sum_{j=1}^N |(\psi_i,\phi_j)|^2-2(\psi_i,\phi_i)+1. $

It follows that

$ \sum_{i=1}^N\|e_i\|^2=\sum_{i=1}^N\sum_{j=1}^N |(\psi_i,\phi_j)|^2-2\sum_{i=1}^N(\psi_i,\phi_i)+N. $

The desired inequality with $C=1$ is equivalent to

$ \sum_{i=1}^N\sum_{j=1}^N |(\psi_i,\phi_j)|^2+N\ge 2\sum_{i=1}^N(\psi_i,\phi_i), $

i.e.

$ \sum_{i=1}^N\sum_{j=1}^N |(\psi_i,\phi_j)-\delta_{ij}|^2\ge 0. $
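The norm expansion used here is a standard Parseval computation, and it can be sanity-checked numerically with the original definition $e_i=\phi_i-\psi_i$ (a sketch with one randomly generated orthonormal basis; all helper names are mine).

```python
# Check:  ||phi_i - psi_i||^2 = sum_j |(psi_i, phi_j) - delta_ij|^2
#                             = sum_j |(psi_i, phi_j)|^2 - 2 (psi_i, phi_i) + 1.
import random

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

rng = random.Random(1)
N = 6
psi = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]

phi = []                      # Gram-Schmidt on random Gaussian vectors
while len(phi) < N:
    w = [rng.gauss(0, 1) for _ in range(N)]
    for b in phi:
        c = dot(w, b)
        w = [x - c * y for x, y in zip(w, b)]
    norm = dot(w, w) ** 0.5
    if norm > 1e-10:
        phi.append([x / norm for x in w])

max_err = 0.0
for i in range(N):
    e_i = [p - q for p, q in zip(phi[i], psi[i])]
    direct = dot(e_i, e_i)
    expanded = (sum(dot(psi[i], phi[j]) ** 2 for j in range(N))
                - 2 * dot(psi[i], phi[i]) + 1)
    max_err = max(max_err, abs(direct - expanded))
```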

  • 0
    Part of the hypothesis is $(\phi_i,\psi_i)\ge0$, but you lose that when you replace $\psi_i$ with $-\psi_i$, no? Nick, which statement is not true for $C=1$? Are you saying you've found a counterexample to your original conjecture? (2011-07-05)
1

I am more interested in the case where the two ordered bases have the same orientation (identical signs of oriented volume), but we may study the problem with or without this additional requirement using a unified approach.

For convenience, let the two orthonormal bases be, instead, the columns of the orthogonal matrices $U=[u_1,\ldots, u_n]$ and $V=[v_1,\ldots,v_n]$ respectively ($n\ge2$). WLOG we may assume that $V=I_n$. Let $d_i=u_i-v_i$ and $e=(1,1,\ldots,1)$. We want to find the least possible $C\ge0$ such that $ \begin{eqnarray} 0\le\delta_C(U)&:=& \frac12\left(C\sum_i \|d_i\|^2 - \sum_{i\not=j} \langle d_i,d_j\rangle\right)\\ &=& nC - C\sum_i\langle u_i,v_i\rangle + \sum_{i\not=j}\langle u_i,v_j\rangle\\ &=& nC - (C+1)\sum_i\langle u_i,v_i\rangle + \sum_{i,j}\langle u_i,v_j\rangle\\ &=& nC - (C+1){\rm tr}(U) + {\rm tr}(ee^T U)\\ &=& nC + {\rm tr}\left([ee^T - (C+1)I_n]U\right) \end{eqnarray} $ for all $U$ with $\langle u_i,v_i\rangle\ge0$. Note that $ee^T - (C+1)I_n = Q\Lambda Q^T$ where $ \Lambda = \begin{pmatrix}n-1-C\\&-(C+1)I_{n-1}\end{pmatrix} $ and $ Q = \frac{1}{r}\begin{pmatrix} 1&-1&-1&\ldots&-1\\ 1&r-t&-t&\ldots&-t\\ 1&-t&\ddots&\ddots&\vdots\\ \vdots&\vdots&\ddots&\ddots&-t\\ 1&-t&\ldots&-t&r-t \end{pmatrix} $ with $r=\sqrt{n}$ and $t=\frac{1}{r+1}$. So $ \delta_C(U) = nC + {\rm tr}\left(\begin{pmatrix}n-1-C\\&-(C+1)I_{n-1}\end{pmatrix}Q^TUQ\right). $

If $C\ge n-1$, we always have $\delta_C(U)\ge\delta_C(I_n)=0$. However, when $C < n-1$, the minimum of $\delta_C$ occurs at $U=U_{\rm opt}=Q(-1\oplus I_{n-1})Q^T = I-\frac2n ee^T$, with $\delta_C(U_{\rm opt}) = 2[C-(n-1)]<0$. Note that the columns of $U_{\rm opt}$ are exactly the basis found by Willie Wong, and we have $\langle u_i,v_i\rangle\ge0$.

What if we impose the further requirement that $\mathbf{\det U = \det V}$? Since in the OP's problem context $U$ is an estimate of $V$, in practice it may be reasonable to require this. Under our previous (WLOG) assumption that $V=I_n$, this additional requirement means $\det U=1$. In this case, when $n-1-C\le C+1$, or equivalently $C\ge\frac n2-1$, we always have $\delta_C(U)\ge\delta_C(V)=0$. When $n-1-C>C+1$, let $R(\theta)$ denote the rotation matrix by an angle $\theta$; then $\delta_C(U)=-[n-1-C-(C+1)]<0$ for $U = Q(R(\frac\pi2)\oplus I_{n-2})Q^T$. Therefore we conclude that $\frac n2-1$ is the least possible $C$ that makes $\delta_C(U)\ge0$ for all $U$ such that $\det(U)=\det(V)$ and $\langle u_i,v_i\rangle\ge0$ for all $i$. Unfortunately, we do not know which nontrivial $U$ (i.e. some $U\not=V$) makes the inequality sharp. Without the constraint $\langle u_i,v_i\rangle\ge0$, $U = Q(R(\pi)\oplus I_{n-2})Q^T$ is a minimizer of $\delta_{\frac n2-1}$; it can be shown, however, that this $U$ makes some $\langle u_i,v_i\rangle$ negative.
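A spot check of $\delta_C$ at the claimed unconstrained minimizer $U_{\rm opt}=I-\frac2n ee^T$ (with $V=I_n$): computing $\delta_C$ directly from its definition should give $2[C-(n-1)]$, which is negative for $C<n-1$. The values of $n$ and $C$ below are arbitrary illustrations.

```python
# delta_C(U) = (1/2) (C * sum_i ||d_i||^2 - sum_{i != j} <d_i, d_j>)
# evaluated at U_opt = I - (2/n) e e^T, compared against 2 [C - (n - 1)].
n = 7
C = 2.5                       # any C < n - 1

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

U = [[(1.0 if i == j else 0.0) - 2.0 / n for j in range(n)] for i in range(n)]
d = [[U[i][j] - (1.0 if i == j else 0.0) for i in range(n)]
     for j in range(n)]       # d_j = u_j - v_j, v_j = standard basis

sum_sq = sum(dot(x, x) for x in d)
cross = sum(dot(d[i], d[j]) for i in range(n) for j in range(n) if i != j)
delta = 0.5 * (C * sum_sq - cross)
predicted = 2 * (C - (n - 1))
```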

0

The statement appears to be false, at least for $C=1$.

I wrote a little optimization script to find vectors that maximize the ratio $C=\mathrm{LHS}/\mathrm{RHS}$, and it easily finds counterexamples. For example, the following set of 10 vectors has $C=8.523$:

     0.7449  -0.1554  -0.2566  -0.1677  -0.3110  -0.1738  -0.1967  -0.2552  -0.1312  -0.2812
    -0.2923   0.7844  -0.1284  -0.1065  -0.2503  -0.2134  -0.2485  -0.2069  -0.1589  -0.1828
    -0.1429  -0.2351   0.8301  -0.1749  -0.1849  -0.1480  -0.2192  -0.1936  -0.1031  -0.2287
    -0.2150  -0.2567  -0.1478   0.8380  -0.2066  -0.1578  -0.2133  -0.1680  -0.0756  -0.1294
    -0.1900  -0.2030  -0.2207  -0.1796   0.7393  -0.2527  -0.1869  -0.2995  -0.2257  -0.2361
    -0.2796  -0.2012  -0.2156  -0.2005  -0.2095   0.7855  -0.1406  -0.2357  -0.1624  -0.1790
    -0.2483  -0.1787  -0.1399  -0.1415  -0.2742  -0.2752   0.7760  -0.1274  -0.2534  -0.1826
    -0.2024  -0.2172  -0.1928  -0.2007  -0.1693  -0.1861  -0.2954   0.7745  -0.1689  -0.2356
    -0.2207  -0.1572  -0.1801  -0.2000  -0.1339  -0.1748  -0.0687  -0.1589   0.8643  -0.1691
    -0.1656  -0.2320  -0.1494  -0.2343  -0.2270  -0.2360  -0.2391  -0.1995  -0.1552   0.7831

I've no idea what it is about these vectors that makes C so large.
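The ratio for the matrix listed above can be recomputed directly (a sketch; the entries are rounded to four decimals, so the result is only approximate). I take the columns as the $\phi_i$ and the standard basis as the $\psi_i$; since the matrix is nearly symmetric, rows give a similar value. Note the matrix is close to $I-\frac2N ee^T$, the extremal reflection from the accepted answer, which is consistent with $C$ approaching $N-1=9$.

```python
# Ratio C = LHS/RHS for the counterexample matrix above (4-decimal entries).
U = [
    [ 0.7449, -0.1554, -0.2566, -0.1677, -0.3110, -0.1738, -0.1967, -0.2552, -0.1312, -0.2812],
    [-0.2923,  0.7844, -0.1284, -0.1065, -0.2503, -0.2134, -0.2485, -0.2069, -0.1589, -0.1828],
    [-0.1429, -0.2351,  0.8301, -0.1749, -0.1849, -0.1480, -0.2192, -0.1936, -0.1031, -0.2287],
    [-0.2150, -0.2567, -0.1478,  0.8380, -0.2066, -0.1578, -0.2133, -0.1680, -0.0756, -0.1294],
    [-0.1900, -0.2030, -0.2207, -0.1796,  0.7393, -0.2527, -0.1869, -0.2995, -0.2257, -0.2361],
    [-0.2796, -0.2012, -0.2156, -0.2005, -0.2095,  0.7855, -0.1406, -0.2357, -0.1624, -0.1790],
    [-0.2483, -0.1787, -0.1399, -0.1415, -0.2742, -0.2752,  0.7760, -0.1274, -0.2534, -0.1826],
    [-0.2024, -0.2172, -0.1928, -0.2007, -0.1693, -0.1861, -0.2954,  0.7745, -0.1689, -0.2356],
    [-0.2207, -0.1572, -0.1801, -0.2000, -0.1339, -0.1748, -0.0687, -0.1589,  0.8643, -0.1691],
    [-0.1656, -0.2320, -0.1494, -0.2343, -0.2270, -0.2360, -0.2391, -0.1995, -0.1552,  0.7831],
]
n = len(U)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

e = [[U[i][j] - (1.0 if i == j else 0.0) for i in range(n)]
     for j in range(n)]                     # e_j = (j-th column of U) - psi_j
s = [sum(col) for col in zip(*e)]           # s = sum_j e_j
rhs = sum(dot(v, v) for v in e)             # sum_j ||e_j||^2
lhs = dot(s, s) - rhs                       # sum_{i != j} (e_i, e_j)
C_ratio = lhs / rhs
```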