0
$\begingroup$

Given the 16bit Pearson hash of a 112 bit message, how many other messages have the same hash ?

What's the probability that a similar 112b message of a given 112b message (you can define similar however you want, for example so that no more than 1/4 of its bits have been altered) has the same 16bit Pearson hash value ?

For the second question, what would be the result if it was a CRC-16-CCITT (the polynomial is x^16+x^12+x^5+1) and a 8KB message instead (is CRC-16-CCITT a good checksum for messages of length up to 8KB) ?

1 Answers 1

1

If you accept that the hash is a good one, the messages will be evenly distributed among the hashes. So there are $2^{112}$ messages and $2^{16}$ hashes. Each hash then corresponds to $2^{112-16}=2^{96}$ messages.

Again, if the hash is a good one, altering a few bits (greater than $\log_2$ of the length of the hash) gives you about the same probability of collision as a random message. You should be guaranteed that altering one or two bits will produce a different hash.

Your third question is not well defined. One can apply that polynomial to any length of data. As the CRC result is 16 bits long, there are a limited number of hashes. If the purpose of the checksum is to detect errors, how many errors do you need to detect? That is, what is the criterion for a good checksum.

These presume the hash is well chosen. If you use the RM hash (which I just made up), it consists of the last 16 bits of the data. It will make a mockery of these calculations, which is one indication of why it is not a good hash.