
Let's define:

$U=\left \{ u_j\right \} , 1 \leq j\leq N= 2^{L},$ the set of all different binary sequences of length $L$.

$V=\left \{ v_i\right \} , 1 \leq i\leq M=\binom{L}{k}2^{k},$ the set of all different gapped binary sequences with $k$ known bits and $L-k$ gaps.

$A_{M\times N}=[a_{i,j}]$ is a binary matrix defined as follows:

$$a_{i,j} = \left\{\begin{matrix} 1 & \text{if } v_i \text{ matches } u_j\\ 0 & \text{otherwise } \end{matrix}\right.$$

and finally,

$S_{M\times M}=AA^{T}.$

Now, the questions are:

i) What is the rank of the matrix $S$?

ii) What is the eigendecomposition of $S$?

Here is an example for $L=2, k=1$:

$$U = \left \{ 00,01,10,11\right \} $$ $$V = \left \{ 0.,1.,.0,.1\right \} ^*$$

$$ A = \begin{bmatrix} 1 & 1 & 0 &0 \\ 0 & 0 & 1 &1 \\ 1 & 0 & 1 &0 \\ 0 & 1 & 0 &1 \end{bmatrix}$$

$$ S = \begin{bmatrix} 2 & 0 & 1 &1 \\ 0 & 2 & 1 &1 \\ 1 & 1 & 2 &0 \\ 1 & 1 & 0 &2 \end{bmatrix}$$

For the special case $k=1$, this has been previously solved by joriki and the solution can be found here.

Any comments or suggestions are appreciated.

$^{*}$ Here dots denote gaps. A gap can take any value, and each gapped sequence with $k$ known bits and $(L-k)$ gaps in $V$ matches exactly $2^{L-k}$ sequences in $U$; hence the sum of the elements in each row of $A$ is $2^{L-k}$.
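For concreteness, here is a small Python/numpy sketch (not part of the problem statement; the helper name `build_matrices` and the enumeration order of $U$ and $V$ are just one possible choice, matching the example above) that builds $A$ and $S=AA^{T}$ for given $L$ and $k$ and checks the row-sum property from the footnote:

```python
# Sketch: construct U, V, A and S = A A^T from the definitions above.
import itertools
import numpy as np

def build_matrices(L, k):
    # U: all binary sequences of length L, as tuples of 0/1 (N = 2^L of them)
    U = list(itertools.product([0, 1], repeat=L))
    # V: all gapped sequences with k known bits, represented as
    # (positions of known bits, values of those bits); M = C(L,k) * 2^k of them
    V = [(pos, vals)
         for pos in itertools.combinations(range(L), k)
         for vals in itertools.product([0, 1], repeat=k)]
    # A[i, j] = 1 iff v_i matches u_j, i.e. u_j agrees with v_i on all known bits
    A = np.array([[int(all(u[p] == bit for p, bit in zip(pos, vals))) for u in U]
                  for pos, vals in V])
    return U, V, A, A @ A.T

U, V, A, S = build_matrices(2, 1)
print(A)              # reproduces the 4x4 matrix A in the example above
print(S)              # reproduces S
print(A.sum(axis=1))  # every row of A sums to 2^(L-k), as noted in the footnote
```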

  • iii) What if, instead of base $2$, it was base $4$, and hence $N=4^L$ and $M=\binom{L}{k}4^{k}$? (2011-04-07)
  • Side note: rank($A$) = rank($A^T$) = rank($A A^T$), and $A A^T$ is a positive semidefinite matrix. (2011-04-07)
  • The question should preferably be self-contained; I think putting an asterisk in the question and using a comment as a sort of footnote is not a good idea, at least two reasons being that you can't edit comments and that the comment may become invisible if more comments are added and upvoted. (2011-04-08)
  • Your use of "subsequence" is a bit confusing. You could call this "$v_i$ matches $u_j$" (considering $v_i$ as a sort of pattern). If you want to use the term "subsequence", it would seem more natural to me to call $v_i$ a subsequence of $u_j$ than the other way around. (2011-04-08)
  • And one more comment: the term "regular matrix" has several well-defined meanings: http://en.wikipedia.org/wiki/Regular_matrix . It seems that you're not referring to any of them, just to the fact that there are regularities in the entries of the matrix? Then it might be better to avoid the term. (2011-04-08)
  • @joriki, thanks for the comments. I tried to fix them. Please feel free to edit the question body/title to make it clearer. Any comments on the solvability of this? (2011-04-08)
  • I'm working on it :-) (2011-04-08)

1 Answer


As M.S. pointed out, $\mathrm{rank}(AA^\mathrm{T})=\mathrm{rank}(A)$, so we can determine $\mathrm{rank}(A)$ instead.

Let $w_{jr}=2u_{jr}-1$, that is, $w_{jr}$ is $\pm1$ according as the $r$-th bit of $u_j$ is $1$ or $0$. Then the vectors $b^S$ defined by

$$b^S_j=\prod_{r\in S}w_{jr}$$

for all $S\subseteq\{1,\ldots,L\}$ form a basis of $\mathbb{R}^N$: there are $N=2^L$ of them and they are mutually orthogonal, since for $S,S'\subseteq\{1,\ldots,L\}$ with $S\neq S'$ we may assume without loss of generality that there is some $s\in S$ with $s\notin S'$, and then

$$ \begin{eqnarray} \sum_{j=1}^N b^S_j b^{S'}_j &=& \sum_{j=1}^N\prod_{r\in S}w_{jr}\prod_{r'\in S'}w_{jr'} \\ &=& \sum_{j=1}^N w_{js}\prod_{r\in S-\{s\}}w_{jr}\prod_{r'\in S'}w_{jr'}\;, \end{eqnarray} $$

which is zero since the sum splits into two parts for $w_{js}=\pm1$ which cancel each other.
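For what it's worth, this orthogonality is easy to confirm numerically; here is a minimal sketch (a check of my own, for the small illustrative case $L=3$) verifying that the Gram matrix of the $b^S$ is $2^L$ times the identity:

```python
# Sketch: the 2^L vectors b^S are pairwise orthogonal, each with squared norm 2^L.
import itertools
import numpy as np

L = 3
U = list(itertools.product([0, 1], repeat=L))
N = len(U)                                      # N = 2^L

def b(S):
    # b^S_j = prod_{r in S} w_{jr}, where w_{jr} = 2*u_{jr} - 1 is +/-1
    return np.array([np.prod([2 * u[r] - 1 for r in S], dtype=int) for u in U])

B = np.array([b(S) for m in range(L + 1)
                   for S in itertools.combinations(range(L), m)])  # one row per subset S
assert np.array_equal(B @ B.T, N * np.eye(N, dtype=int))           # Gram matrix = 2^L * I
```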

Now consider the representation of the symmetric group $S_L$ induced on $\mathbb{R}^N$ by the permutation of the bits of the $u_j$: a permutation $\pi\in S_L$ acts on the set $U$ by permuting the bits of the $u_j$, and this induces a representation $\rho$ on $\mathbb{R}^N$ in which $\pi$ acts on a vector of $\mathbb{R}^N$ by permuting the vector's entries as it permutes the $u_j$.

The vectors $b^S$ with $|S|=m$ for fixed $m$ span an irreducible invariant subspace $B_m$, since they are transformed into each other by permutations and each of them can be transformed into any other by a suitable permutation. Thus, $\rho$ decomposes into $L+1$ irreducible subrepresentations $\rho_m$ over the $B_m$, of dimensions $\left({L\atop m}\right)$.

The rows of $A$ also span a (reducible) invariant subspace under $\rho$, since they are transformed into each other by permutations. All rows of $A$ are orthogonal to a given $b^S$ iff $|S|>k$: if $|S|>k$, then for each row $i$ at least one of the bits in $S$ is unknown in $v_i$ and thus takes on both values among the matching $u_j$, causing the inner product to split into two cancelling parts, whereas if $|S|\le k$, there are rows such that all bits indexed by $S$ are known, so that there is no cancellation in the sum.
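This orthogonality pattern can also be checked by brute force; here is a short sketch (my own, using the illustrative values $L=4$, $k=2$) confirming that all rows of $A$ are orthogonal to $b^S$ exactly when $|S|>k$:

```python
# Sketch: rows of A are all orthogonal to b^S iff |S| > k.
import itertools
import numpy as np

L, k = 4, 2
U = list(itertools.product([0, 1], repeat=L))
V = [(pos, vals) for pos in itertools.combinations(range(L), k)
                 for vals in itertools.product([0, 1], repeat=k)]
A = np.array([[int(all(u[p] == bit for p, bit in zip(pos, vals))) for u in U]
              for pos, vals in V])

def b(S):
    return np.array([np.prod([2 * u[r] - 1 for r in S], dtype=int) for u in U])

for m in range(L + 1):
    for S in itertools.combinations(range(L), m):
        orthogonal_to_all_rows = not (A @ b(S)).any()   # every row has zero inner product?
        assert orthogonal_to_all_rows == (m > k)
print("orthogonality pattern confirmed for L=4, k=2")
```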

Without using the decomposition of $\rho$, simply from orthogonality, we can infer

$$\text{rank}(A)\le\sum_{m=0}^k\left({L\atop m}\right)\;.$$

Now the row space of $A$ is not orthogonal to the subspaces $B_m$ with $m\le k$, and it is orthogonal to the subspaces $B_{L-m}$ with $L-m>k$. Since the row space is invariant under $\rho$, and the only subrepresentation equivalent to $\rho_m$ other than $\rho_m$ itself is $\rho_{L-m}$, the row space must contain the entire subspace $B_m$ whenever $m\le k$ and $L-m>k$, i.e. for $m\le\min(k,L-k-1)$. Hence

$$\text{rank}(A)\ge\sum_{m=0}^{\min(k,L-k-1)}\left({L\atop m}\right)\;,$$

and therefore

$$\text{rank}(A)=\sum_{m=0}^k\left({L\atop m}\right)\;\;\text{for $2k+1\le L$.}$$

Numerical results show that this is also true for $2k+1>L$, but I haven't been able to figure out yet how to show in this case that both $B_m$ and $B_{L-m}$ are in the row space for $L-k\le m \le k$.
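For reference, this is the kind of numerical check meant here (a brute-force sketch of my own): it computes $\mathrm{rank}(A)$ directly for all $k\le L\le 6$, which includes the cases $2k+1>L$, and compares it with the formula above.

```python
# Sketch: brute-force check that rank(A) = sum_{m=0}^{k} C(L, m) for small L, k.
import itertools
from math import comb
import numpy as np

def rank_A(L, k):
    U = list(itertools.product([0, 1], repeat=L))
    V = [(pos, vals) for pos in itertools.combinations(range(L), k)
                     for vals in itertools.product([0, 1], repeat=k)]
    A = np.array([[int(all(u[p] == bit for p, bit in zip(pos, vals))) for u in U]
                  for pos, vals in V])
    return np.linalg.matrix_rank(A)

for L in range(1, 7):
    for k in range(L + 1):
        assert rank_A(L, k) == sum(comb(L, m) for m in range(k + 1))
print("rank(A) = sum_{m<=k} C(L,m) confirmed for all k <= L <= 6")
```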

There was a question on MO whether there's a closed form for this, and the answer was negative.

[Edit:] I just realized that all this representation stuff is actually complete overkill, since we can construct the $b^S$ with $|S|\le k$ explicitly from the rows of $A$: Let $a_i$ denote the row of $A$ corresponding to $v_i$, let $K_i$ be the set of the indices of the known bits of $v_i$, and in analogy to $w_{jr}$ let $z_{ir}=2v_{ir}-1$ for $r\in K_i$; then

$$\sum_{K_i\supseteq S}a_i\prod_{r\in S} z_{ir}$$

is a multiple of $b^S$, and it's non-zero iff $|S|\le k$. Thus, the rank given above is indeed correct for all $k$ and $L$.
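Here is a quick numerical sanity check of this construction (my own sketch, again for the illustrative values $L=4$, $k=2$): it forms the combination above for every $S$ and tests that it is a nonzero multiple of $b^S$ precisely when $|S|\le k$.

```python
# Sketch: sum over rows with K_i ⊇ S of a_i * prod_{r in S} z_{ir}
# is a nonzero multiple of b^S iff |S| <= k, and zero otherwise.
import itertools
import numpy as np

L, k = 4, 2
U = list(itertools.product([0, 1], repeat=L))
V = [(pos, vals) for pos in itertools.combinations(range(L), k)
                 for vals in itertools.product([0, 1], repeat=k)]
A = np.array([[int(all(u[p] == bit for p, bit in zip(pos, vals))) for u in U]
              for pos, vals in V])

def b(S):
    return np.array([np.prod([2 * u[r] - 1 for r in S], dtype=int) for u in U])

for m in range(L + 1):
    for S in itertools.combinations(range(L), m):
        combo = np.zeros(len(U), dtype=int)
        for (pos, vals), a_i in zip(V, A):
            if set(S) <= set(pos):                         # only rows with K_i ⊇ S
                z = {r: 2 * v - 1 for r, v in zip(pos, vals)}
                combo += a_i * np.prod([z[r] for r in S], dtype=int)
        if m <= k:
            ratios = combo * b(S)                          # entrywise combo / b^S (b^S is +/-1)
            assert ratios[0] != 0 and np.all(ratios == ratios[0])   # nonzero constant multiple
        else:
            assert not combo.any()                         # no row has K_i ⊇ S, so it vanishes
```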

  • Thanks a lot. That is a very interesting result. Can we say anything about the eigenvectors and eigenvalues of $AA^{T}$ using this? (2011-04-08)
  • @mghandi: I realized that we can explicitly construct the $b^S$ with $|S|\le k$ (see my edit). I'll give the eigensystem some thought. (2011-04-09)
  • Yes, I was expecting something like that -- the eigenvectors of $A^TA$ are the $b^S$, and you can get the eigenvalues from them; I haven't done the calculation yet, but what you write looks like the right sort of expression. The eigenvalues of $AA^T$ are the same, but I don't have any idea for its eigenvectors yet. I also calculated the rank for a general base $\neq 2$; I'll be posting that shortly. (2011-04-10)
  • As this question was mainly about the rank, I posted a separate question for the eigendecomposition: http://math.stackexchange.com/questions/32080/eigen-decomposition-of-an-interesting-matrix (2011-04-10)
  • There was a typo in my last comment: the eigenvalues are $\binom{L-m}{k-m}\cdot 2^{g}$. (2011-04-11)
  • @joriki: Can you please post the results for the general base to complete your answer (which I really appreciate)? (2011-04-13)
  • @joriki: I just posted a separate question for the general case: http://math.stackexchange.com/questions/32843/eigen-decomposition-of-an-interesting-matrix-general-case (2011-04-13)
  • @mghandi: OK, I'll try to post something tomorrow. (2011-04-13)