
What types of basic variations of the Hamming code are there and what are their objectives? I was taught the following version:

$$ L = n + k $$ $$ n \geq \log_2M $$ $$ k \ge \log_2(n+k+1) $$

where $M$ is the number of alphabet symbols, $L$ the length of a codeword, $n$ the number of information bits, and $k$ the number of parity bits. How does such a variation change the analysis and the algorithms involved as compared to the canonical way of coding?
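Just to be explicit about how I read these formulas, here is a rough sketch (Python; the function name `min_parity_bits` and the sample values are my own, and I assume the binary case) that finds, for a given number of information bits $n$, the smallest $k$ satisfying the last inequality:

```python
def min_parity_bits(n):
    """Smallest k with k >= log2(n + k + 1), i.e. 2**k >= n + k + 1."""
    k = 1
    while 2 ** k < n + k + 1:
        k += 1
    return k

# A few (n, k, L = n + k) triples; n = 4 gives the classic (7, 4) Hamming code.
for n in (1, 4, 11, 26):
    k = min_parity_bits(n)
    print(f"n = {n:2d}  ->  k = {k},  L = {n + k}")
```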

Edit summary:

  • I switched the notation to the one I was provided with originally.
  • I switched the logarithm base from implicit $a$ to explicit 2.
  • Whomsoever is teaching you is using very idiosyncratic notation: $d$ is very commonly used to denote Hamming distance, or in some cases, minimum Hamming distance between codewords. Also, what is $a$ in $\log_a$? (2012-08-31)
  • @DilipSarwate I think $a$ indicates whether it is binary or ternary, etc.; if binary, then $a=2$, I guess. I agree that $d$ should be minimum Hamming distance. (2012-08-31)
  • @DilipSarwate I am sorry, I recklessly altered the notation. I will correct that in a minute. (2012-08-31)
  • @Seyhmus Nonbinary Hamming codes (meaning single error correcting codes) are different from binary Hamming codes. For example, a _canonical_ single error correcting Hamming code over $\mathbb F_q$, $q > 2$, does not have length $q^m-1$ while the binary Hamming code does have length $2^m-1$. So, if $a > 2$, we need to be more careful about the parameters. (2012-08-31)
  • @DilipSarwate If it is so, I will explicitly make my $a$ equal to 2 because that is the case I am interested in. (2012-08-31)
  • Oh good grief! The notation has taken a turn for the worse. Now we have a $[L,n]$ binary code with $k$ denoting the number of _parity_ bits instead of the very commonly used $[n,k]$ linear code with $2^k$ codewords or $(n,M)$ code with $M$ codewords of length $n$. Most people would say the _alphabet_ is binary, not $M$-ary, reserving $M$ for the number of _messages_, and so on. (2012-08-31)
  • @DilipSarwate That's why I initially provided the alternative notation: the original one seems like nothing I have seen in the textbooks. Please edit my question if you think switching the notation might help improve its readability. Do you have any idea what sort of Hamming code variation it is? The notation alteration might be due to the fact that my lecturer comes from Russia. (2012-08-31)

1 Answer


A canonical binary Hamming code consists of $2^{2^m - 1 - m}$ binary vectors of length $2^m - 1$. These vectors form a $(2^m - 1 - m)$-dimensional vector subspace of the $(2^m-1)$-dimensional vector space $\mathbb F_2^{2^m-1}$. The code is sometimes referred to as a perfect code. Each of the $2^{2^m-1}$ possible received vectors is either a codeword of the Hamming code or is at Hamming distance exactly $1$ from exactly one codeword. Thus, every possible received vector is decodable using a nearest-neighbor strategy: map the received vector $y$ into the nearest codeword $z$.

If there were no transmission errors or there was one transmission error (that is, at most one of the $2^m-1$ transmitted bits got flipped from a $0$ to a $1$ or from a $1$ to a $0$), then the $z$ that the decoder finds is the same as the codeword $x$ that was transmitted. On the other hand, if more than one transmission error occurred, the $z$ that the decoder finds is not the same as the $x$ that was transmitted, and a decoder error occurs. It is important to understand that the decoder is quite unaware that it has found the wrong codeword.

For future reference, I note that the decoder obtains $z$ by changing (complementing) at most one bit in $y$.
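To make this nearest-neighbor decoding concrete, here is a minimal sketch for the $m=3$ case, the $(7,4)$ binary Hamming code. The parity-check matrix below is one common choice (column $i$ is the binary expansion of $i$) and the function names are my own, so treat it as an illustration rather than the decoder of any particular text.

```python
import numpy as np

# Parity-check matrix H of the (7,4) Hamming code: column i (1-indexed)
# is the binary representation of i, so a nonzero syndrome directly names
# the position of a single bit error.
H = np.array([[(i >> b) & 1 for i in range(1, 8)] for b in (2, 1, 0)])

def decode(y):
    """Map a received length-7 vector y to the nearest codeword z,
    complementing at most one bit of y."""
    y = np.asarray(y) % 2
    syndrome = (H @ y) % 2
    pos = int("".join(str(int(bit)) for bit in syndrome), 2)  # 0 = no error seen
    z = y.copy()
    if pos:
        z[pos - 1] ^= 1          # flip the single suspect bit
    return z

# x = the all-zeros codeword; flip one bit in "transmission".
x = np.zeros(7, dtype=int)
y = x.copy(); y[4] ^= 1
print(decode(y))                 # [0 0 0 0 0 0 0] -- x is recovered

# Two errors: the decoder still outputs a valid codeword, but the wrong
# one, and it has no way of knowing that.
y2 = x.copy(); y2[0] ^= 1; y2[4] ^= 1
print(decode(y2))                # a valid codeword different from x
```

With this choice of $H$, the syndrome of a single-error pattern literally spells out the error position in binary, which is what makes the one-bit correction step so simple.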

Decoders for canonical Hamming codes are always able to determine a valid codeword as the nearest neighbor (in the Hamming metric) to the received word $y$. In contrast, many decoding algorithms (for other codes) fail to decode some possible received words $y$ because the algorithm is unable to determine the nearest codeword. In such cases, an undecodable error pattern is said to have been detected. The decoder usually passes on the received word $y$ to the end user together with an indication that $y$ is not a valid codeword. In many practical systems, a re-transmission is requested whenever such a decoding failure occurs.


A shortened Hamming code of length $2^m-1-s$ is obtained from a $(2^m-1-m-s)$-dimensional subspace of the $(2^m-1-m)$-dimensional space constituting the canonical Hamming code. This subspace is usually obtained by setting a fixed number $s$ of data bit positions to $0$. Commonly, for ease of implementation, the first $s$ data bit positions are set to $0$. The transmitted codeword is also shorter by $s$ positions because the first $s$ bits of each (canonical Hamming) codeword are not transmitted. Since the encoder and decoder are designed by a common intelligence, there is no need to transmit the $s$ leading zeroes in each codeword; the decoder can always insert them at the beginning of each received word if it needs them in order to carry out the decoding algorithm. In fact, let us assume for convenience that the decoder inserts the missing zeroes at the beginning of each received word and then executes the canonical Hamming code decoding algorithm.

If zero or one of the transmitted bits in a codeword from a shortened Hamming code is received in error, the decoder will decode correctly, complementing the bit, if any, that is in error.

If more than one transmission error has occurred, then there are two possibilities to be considered.

  • Changing one bit in the received $2^m-1-s$ bits maps $y$ into a codeword $z$ in the shortened code. This is a decoding error, just as with the canonical Hamming code.

  • Changing one bit in the prepended $s$ zero bits would change $00\cdots 0y$ into a codeword $00\cdots 010\cdots 0y$ in the canonical Hamming code. Since these $s$ bits must be $0$ in every shortened codeword, the decoder concludes that an undecodable error pattern occurred in the $2^m-1-s$ bits that were actually transmitted, and this information can be passed on to the end user.

In summary, a decoder for a canonical Hamming code never fails to decode. It always produces a valid codeword as its output. On the other hand, the decoder for a shortened Hamming code can sometimes fail to decode and can signal the end user that the $y$ that it is sending on is not a valid codeword in the shortened code, and that there are at least two bit errors in $y$.
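Continuing the sketch above (it reuses `decode` and `np` from the $(7,4)$ example; the value $s=2$ is an arbitrary choice for illustration), a decoder for the shortened code can be layered on the canonical one exactly as described: prepend $s$ zeroes, decode, and declare a failure whenever the canonical decoder wants to complement a bit inside the padding.

```python
def decode_shortened(y_short, s):
    """Decode a received word of length 7 - s from the shortened code by
    prepending s zeroes and running the canonical (7,4) decoder above."""
    padded = np.concatenate([np.zeros(s, dtype=int), np.asarray(y_short) % 2])
    z = decode(padded)           # decode() and H come from the (7,4) sketch
    if z[:s].any():
        # The complemented bit lies in the padding, which must be all-zero
        # in any shortened codeword: at least two errors occurred, so
        # report a decoding failure instead of a codeword.
        return None
    return z[s:]

s = 2
x_short = np.zeros(7 - s, dtype=int)        # the all-zeros shortened codeword

y1 = x_short.copy(); y1[3] ^= 1
print(decode_shortened(y1, s))               # [0 0 0 0 0] -- single error corrected

y2 = x_short.copy(); y2[1] ^= 1; y2[2] ^= 1
print(decode_shortened(y2, s))               # None -- this two-error pattern is detected
```

Only some multiple-error patterns are caught this way; the rest still decode to a wrong codeword of the shortened code, as in the first bullet above.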

  • To be honest I don't get that. You say "This is a decoding error, just as with the canonical Hamming code." and then "a decoder for a canonical Hamming code never fails to decode". It seems both of them can fail if more than 1 error pops up, so I don't see the difference apart from the potential alteration of "missing" zeros. (2012-09-01)
  • Three things can occur. 1. Successful decoding with correct output. 2. Successful decoding with incorrect output. 3. Failure to decode. Only #1 and #2 can occur with canonical Hamming codes: their decoders never fail to decode. With shortened Hamming codes and with most other codes, all three possibilities can occur. Note that Successful Decoding means that the decoder output is a valid codeword, **not** that the decoder output is the **correct** codeword, that is, the codeword that was actually transmitted. (continued) (2012-09-01)
  • Suppose $x$ was transmitted and enough errors occurred that $x$ was changed to another codeword $y$ in the transmission process. The decoder will accept $y$ as a valid codeword, making no changes to it, and send it to the end user, who will also have no reason to question the result. Decoders map received sequences into the _nearest_ codeword (if they can find it); they do not claim to map the received word into the _transmitted codeword_, only into the nearest codeword. They try to do _maximum-likelihood decoding_; the nearest codeword is the most likely to have produced the received $y$. (2012-09-01)
  • If both can correct up to one error whenever no more errors actually occurred, then the fact that the shortened code does not always decode is considered to be its advantage, isn't it? How big can $s$ be? (2012-09-01)
  • A shortened code sends less information. $s$ can be as large as $2^m-2-m$, leaving only one data bit and $m$ parity check bits (essentially a repetition code), which has very small probability of error but very low rate too. In many cases, decoding failure is preferred over decoding error, but it is certainly application-dependent, and typically, information theory does not consider decoder failure to exist. (2012-09-01)
  • But from what I see, there is no $s$ such that $s$ would turn my form of Hamming code into the canonical one. At times it's equivalent, at times it differs by one. (2012-09-05)
  • I think it's worth mentioning/emphasizing that simply cutting off the first $s$ bits of the codeword only works for canonical Hamming codes (i.e., where the generator matrix begins with the identity matrix). In that case, any data word beginning with $s$ zeroes is guaranteed to produce a codeword which also begins with $s$ zeroes. But for non-canonical Hamming codes, this is not necessarily the case. Non-canonical codes can still be shortened, but it is somewhat more complicated. (2013-07-31)