
Suppose we are given $k$ points in $\{0,1\}^n$ (using Hamming distance as the metric). Consider the two points that have the smallest distance between them. Do there exist any results bounding this distance? In particular, I am interested in an explicit function $f(k,n)$ bounding this distance from above. Equivalently, one could ask for a lower bound on the number of entries shared by the closest pair, or an upper bound on how far apart the closest pair can be.

Any reference to such a function or related problems will help.

  • @Patrick Da Silva Indeed, thinking of the vectors as real vectors to bound the distance is a useful idea, and gives several non-trivial elementary bounds. But the best bounds for this problem seem to exploit the inherent combinatorial structure. (2011-08-01)

1 Answer


I think this is a nice question. As mentioned in the comments, this question underlies the theory of error-correcting codes. A number of nice texts are available on this subject, e.g., MacWilliams & Sloane and van Lint. Some lecture notes, from the CS perspective, are available here and here.

Equip the boolean hypercube $\{0,1\}^n$ with the Hamming metric as usual. A "code" is just a set $C$ of points in the hypercube. (More generally, one can also consider "$q$-ary codes", which are just subsets of $\{0,1,\ldots, q-1\}^n$.) Its size is $|C|$ and its minimum distance is $\min_{x,y \in C, x \neq y} d(x,y)$. The question is now to understand the trade-off between the size and the minimum distance of a code. The exact trade-off is still open, but many bounds are known.
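To make these definitions concrete, here is a small brute-force sketch in Python (the helper names `hamming` and `min_distance` are mine, not standard):

```python
from itertools import combinations

def hamming(x, y):
    """Hamming distance between two equal-length 0/1 tuples."""
    return sum(a != b for a, b in zip(x, y))

def min_distance(code):
    """Minimum pairwise Hamming distance of a set of codewords."""
    return min(hamming(x, y) for x, y in combinations(code, 2))

# The even-weight code of length 3: size 4, minimum distance 2.
C = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(len(C), min_distance(C))  # 4 2
```

(Brute force is only feasible for tiny codes, of course; the point is just to pin down the two quantities being traded off.)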

Lower bounds on the size. I think the Gilbert-Varshamov (GV) bound is the only non-trivial lower bound for the boolean case. See also here. One can prove this bound by considering either a random code or a greedy packing of codewords. (Actually, we can do slightly better; see the Jiang-Vardy paper for details.)
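For concreteness, the GV bound in its volume form says $A(n,d) \geq 2^n / V(n, d-1)$, where $A(n,d)$ is the maximum size of a binary code of length $n$ and minimum distance $d$, and $V(n,r)$ is the number of points in a Hamming ball of radius $r$. A quick Python sketch (function names are mine):

```python
from math import comb

def ball_volume(n, r):
    """Number of points of {0,1}^n within Hamming distance r of a fixed point."""
    return sum(comb(n, i) for i in range(r + 1))

def gv_bound(n, d):
    """Gilbert-Varshamov: a binary code of length n and minimum distance d
    with at least ceil(2**n / ball_volume(n, d-1)) codewords exists
    (e.g. by greedily adding any point at distance >= d from all chosen ones)."""
    v = ball_volume(n, d - 1)
    return -(-2**n // v)  # ceiling division

print(gv_bound(7, 3))  # 5
```

Note the greedy guarantee can be far from tight: for $n = 7$, $d = 3$ the GV bound promises only $5$ codewords, while the Hamming $[7,4]$ code achieves $16$.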

Upper bounds on the size. Here, there are a number of interesting bounds. The most prominent are the Singleton bound (which is of interest mainly when $q$ is "large"), the Griesmer bound, the Hamming bound, the Plotkin bound, the Johnson and Elias-Bassalygo bounds, and finally the McEliece-Rodemich-Rumsey-Welch bounds (there are two MRRW bounds). I cannot find a suitable link or reference, but see the Navon-Samorodnitsky paper for a proof of the first MRRW bound; that paper gives a delightful application of Fourier transforms to codes. Proofs of some of the elementary bounds in the list can also be found in the lecture notes linked above.
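For a rough numerical feel, here is a sketch (Python; function names are mine) of three of the elementary upper bounds in the binary case, taking the Plotkin bound in its even-$d$, $2d > n$ form:

```python
from math import comb

def ball_volume(n, r):
    """Number of points of {0,1}^n within Hamming distance r of a fixed point."""
    return sum(comb(n, i) for i in range(r + 1))

def singleton_bound(n, d):
    """Singleton: A(n, d) <= 2**(n - d + 1)."""
    return 2**(n - d + 1)

def hamming_bound(n, d):
    """Sphere-packing: A(n, d) <= 2**n / V(n, floor((d-1)/2))."""
    return 2**n // ball_volume(n, (d - 1) // 2)

def plotkin_bound(n, d):
    """Plotkin (even d, 2d > n): A(n, d) <= 2 * floor(d / (2d - n))."""
    assert d % 2 == 0 and 2 * d > n
    return 2 * (d // (2 * d - n))

print(singleton_bound(7, 3), hamming_bound(7, 3))  # 32 16
print(plotkin_bound(8, 6))                         # 2
```

For $n = 7$, $d = 3$ the Hamming bound of $16$ is attained by the Hamming $[7,4]$ code, so it is tight there; the Singleton bound is much weaker in this binary example.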

While there has been no improvement on the lower bound on the rate achieved by random codes, the upper bound has been improved several times, most recently by the MRRW bound (1977). For this reason, it is sometimes conjectured that the GV bound is indeed the best possible for binary codes. It is worth pointing out here that we do have codes beating the GV bound for $q \geq 100$. (The $100$ is just a convenient number. I will find and post the exact result and its reference later.)

I regret that my answer has turned out to be just a bibliography of the known bounds, without any explanation. The question was too broad, and I can attempt a better job if the question is made narrower in scope and more focused (for instance, a separate question on only the elementary bounds, or only the MRRW bounds).

  • Thanks a lot, Srivatsan, for the nice answer and the nice references. I know that the question is very broad and a little wavy. (2011-07-27)