
I have a theory that in order to best compress 1D data between 0 and 1 in a lossy way, you should map each data point $i$ to:

$\int_0^i \! P(x) \, \mathrm{d}x$

where $P(x)$ is the probability density function of the data. The probability density function over the mapped data is then level (uniform), and because it is level, storing the values to a finite precision wastes as little space as possible on redundant information.
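Here is a minimal numerical sketch of that 1D mapping (my own illustration, not from the question, using an empirical CDF in place of the true $P(x)$):

```python
import numpy as np

# Sketch: map 1D samples through an empirical estimate of the CDF,
# i.e. int_0^x P(t) dt, so the mapped values are approximately uniform
# on [0, 1] and can be quantized with little redundancy.
rng = np.random.default_rng(0)
data = rng.beta(2.0, 5.0, size=10_000)      # skewed samples in [0, 1]

# Empirical CDF evaluated at each sample: rank / n approximates the integral.
ranks = np.argsort(np.argsort(data))
mapped = (ranks + 0.5) / len(data)          # roughly uniform on (0, 1)

# Lossy step: store the mapped values at a fixed precision (e.g. 8 bits).
quantized = np.round(mapped * 255).astype(np.uint8)
```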

My question is: how can I generalize this to more dimensions? Say I have a set of points which fit a given distribution and I want to compress points that fit the same distribution.

My initial assessment is that I should incrementally move the points so that they follow a level probability density function; then (with lots of hand-waving) it should be possible to traverse the mapping using proximity to the reference points.

A further constraint is that, given data that looks like a spiral, the mapping should unwind it rather than squash it together.

Can anyone shed any light on this?

  • I assume that I'm not the first person to talk about such a concept. What is its name?
  • Is this a good approach?
  • How do you incrementally move the points so that they fit a level probability density function?
  • How do you traverse the mapping?
  • How do I modify the above approach to fit the added constraint?

edit: To clarify, essentially what I want to do is take a set of points and warp the space they sit in so that the points become equally spaced.
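As a hypothetical illustration of why the multi-dimensional case is harder than the 1D case: applying the 1D map to each coordinate separately makes every marginal uniform, but for correlated data it does not level the joint density, so the points are still not equally spaced.

```python
import numpy as np

# Illustration (my own assumption, not part of the question): per-coordinate
# empirical probability integral transforms level the marginals only.
rng = np.random.default_rng(1)
# Correlated Gaussian points, squashed into (0, 1)^2 via the logistic function.
raw = rng.multivariate_normal([0, 0], [[1.0, 0.9], [0.9, 1.0]], size=5_000)
points = 1.0 / (1.0 + np.exp(-raw))

def marginal_pit(column):
    """Empirical probability integral transform of one coordinate."""
    ranks = np.argsort(np.argsort(column))
    return (ranks + 0.5) / len(column)

warped = np.column_stack([marginal_pit(points[:, d]) for d in range(2)])
# `warped` has uniform marginals, but the points still cluster along a curve,
# so the spacing is not yet "equal" in the sense asked for above.
```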

  • The map in your first paragraph is called the [probability integral transform](http://en.wikipedia.org/wiki/Probability_integral_transform). (2011-01-28)

1 Answer


If I understand it right, what you are trying to do is conceptually similar to vector quantization: this partitions the N-dimensional space into zones that each carry the same probability mass, so that you can assign to each zone a representative value (which would correspond to the quantized/coded value).
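As a rough sketch of what that looks like in practice, here is a vector quantizer whose codebook is built with k-means (one standard way to build the partition; the answer does not prescribe a specific algorithm):

```python
import numpy as np
from scipy.cluster.vq import kmeans2

# Sketch: build a codebook of 256 representative vectors and encode each
# point as the index of its nearest representative (8 bits per point).
rng = np.random.default_rng(2)
points = rng.normal(size=(10_000, 2))        # N-dimensional data (here 2D)

k = 256                                      # codebook size
codebook, codes = kmeans2(points, k, minit='++')

# Lossy compression: store only `codes` (one small integer per point) plus
# the shared codebook; decoding replaces each code with its representative.
decoded = codebook[codes]
```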