1
$\begingroup$

I am looking for a mapping to encode a n-bits information into n-x+m bits.

  • The (n-x) bits need to be as uniform as posible, and I can accept a small mount of data to achieve this (such as a nxn matrix).
  • The m bits are free to be in any distribution.

n,x,m are fixed. 0=0.

In fact, I want a hash function that encode some of the string into the hash value so that I can save some space from keeping the whole string.

I read that any invertible matrix can form a bijection linear transformation, but it is not a uniform one.

And, since it is to be used as "hash", space and time are importent too.

  • 0
    It's not very clear... Aparently you want some invertible function that maps a string $s$ (length $n$) to a pair $[s_1,s_2]$ (lengths $n_1,n_2$, with $n_1 < n$) so that $s_1$ is statistically "uniform" - and I presume you want $n_1$ is fixed, independent of $n$. Is that so?2011-07-23
  • 0
    Yes. n1 need to have a fixed relationship to n. Such as n1=n-2;2011-07-23
  • 0
    Two examples - probably dumb, they surely they wont fit your need, but you'd tell us why, your problem is not well posed. 1) take $s_1$ as a (say) MD5 hash of $s$, trim it (or pad with noise) to $n_1$ bits; append the full original string ($s_2 = s$). 2) compress the string (huffman, zip, etc) and split it into two parts (first of length $n_1$).2011-07-23
  • 0
    I just add `02011-07-23
  • 0
    reversible? If $x > m$ then it can't be reversible by Pigeonhole principle.2011-07-23

1 Answers 1

2

Hmm... If I understand you correctly, here's something that might work:

Let $H$ be a hash function taking an arbitrary string as input and producing an $n$-bit string as output. Given an input string $s$, let $s_0$ denote the first $n$ bits of $s$ and $s_1$ the rest (i.e. $s = s_0\;||\;s_1$, where $s_0$ is $n$ bits long and $||$ denotes concatenation). Then define $$f(s) = (s_0 \oplus H(s_1))\;||\;s_1,$$ where $\oplus$ means bitwise XOR. It's easy to see that $f$ is its own inverse, i.e.$f(f(s)) = s$, so that the input string $s$ can be recovered from $f(s)$ just by running it through $f$ again.

  • 0
    Seems good. Random is privided by H(s1) and, might be kept after XOR. I am not so sure about this point. However, this is a good starting point.2011-07-23
  • 0
    If XOR is not good enough, you could replace it with some other parametrized, invertible bijection $E_K$ to get $f(s) = E_{H(s_1)}(s_0)\;||\;s_1$. To invert $f$, replace $E_K$ with its inverse $E_K^{-1}$. In particular, a block cipher with the key $K$ will do for $E_K$, although that might be considered overkill.2011-07-23