3

Why is it that points in a Cantor set do not require less storage space (encoding bits) than points in a Cantor set with larger Hausdorff dimension?

If two Cantor sets have different Hausdorff dimensions, then encoding points of the lower-dimensional set (in a computer file) should require less storage space, just as encoding points in the plane requires twice the space of encoding the same number of points on a line.

So, storage space should depend on the dimension of the set containing the points.

But each point on each Cantor set is equivalent to a point in the other set, so they should require the same storage space to be encoded.

The Cantor set has a tree structure, so encoding a specific point in the tree should not depend on the Hausdorff dimension. A string of bits is all that's needed (0 for a left branch and 1 for a right branch).

The only difference when encoding the same point in different Cantor sets should be writing down the dimension, but that value is written only once per set and has no effect on the storage space of each individual point.

Cantor set has a tree structure

^In this image, storing all 32 points at the base of the tree requires 32 strings of 5 bits each, whatever the Hausdorff dimension is.
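
As a rough sketch of the tree addressing described above: the same bit string can be turned into a coordinate in any two-branch Cantor set once the contraction ratio is chosen. The function name `cantor_point` and the parameter `r` (the factor by which each piece shrinks per level, e.g. 1/3 for the middle-thirds set) are illustrative assumptions, not part of the question.

```python
from fractions import Fraction


def cantor_point(bits, r):
    """Left endpoint of the interval reached by following `bits` down the tree.

    bits: string of '0' (left branch) and '1' (right branch).
    r:    hypothetical contraction ratio; each child is r times its parent's length.
    """
    x = Fraction(0)        # left endpoint of the current interval
    length = Fraction(1)   # length of the current interval
    for b in bits:
        if b == "1":
            # The right child sits flush against the right end of its parent.
            x += length * (1 - r)
        length *= r
    return x


# The same 3-bit address decodes to different coordinates in different sets.
in_thirds = cantor_point("011", Fraction(1, 3))    # 8/27
in_quarters = cantor_point("011", Fraction(1, 4))  # 15/64
```

The address `011` is three bits in both cases; only the coordinate it decodes to changes with `r`.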

If it were an image, I can see that the lower-dimensional set would occupy a smaller percentage of pixels, but that's because information is lost below the pixel size. A similar case arises when the coordinates are stored in integer format: in lower dimensions there are more blank spaces, but locating the points requires higher precision to avoid losing information, so the storage saved by not encoding blank spaces is offset by the extra storage needed for precision.

(I'm a programmer, not a mathematician, so please avoid using too much symbolism if possible.)

  • 0
    Very nice image! Did you make it? 2017-01-06
  • 0
    @The Count: No. It is a direct link to an image in this Wikipedia article: https://en.wikipedia.org/wiki/Cantor_set 2017-01-06
  • 0
    What, exactly, is your question? 2017-01-06
  • 0
    @Mark McClure I'm confused about why points in a Cantor set do not require less storage space (encoding bits) than points on the real line. 2017-01-09

1 Answer

1

In order to relate the fractal dimension of a set to the storage space required for the set, you'll also need to specify the accuracy to which you are working. For simplicity, let's suppose we are working to $n$-bit accuracy on the unit interval and not worry about the finer points of floating point arithmetic. Then there are $2^n$ such numbers and we require $n$ bits to store each one, for a total of $n\times 2^n$ bits of storage.

Now, suppose we want to store Cantor's ternary set to the same accuracy. Of course, it's more natural to think of that set in base 3; in fact, most of those points have no finite binary expansion but, using a standard rounding procedure, we're looking for the $n$-bit binary numbers that are closest to the Cantor set. Well, two points in the $m^{\text{th}}$ level approximation to the Cantor set can be a minimum distance $3^{-m}$ apart. Thus, to determine how many points from our set of binary numbers we need, we must consider the inequality $$3^{-m} < 2^{-n}.$$ Once $m$ is large enough to make this true, going deeper in the tree won't produce any new points of the Cantor set that can be told apart at $n$-bit resolution. Put another way, we only need $n\times 2^m$ bits to store the Cantor set, where $m$ is the smallest integer such that $$m>n\frac{\log(2)}{\log(3)}.$$ The factor $\log(2)/\log(3)$ in this formula is exactly the fractal dimension of the Cantor set.
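
A minimal sketch of the count above, assuming points are $n$-bit numbers in the unit interval. The helper names `ternary_level` and `storage_comparison` are hypothetical; the level is found with integer arithmetic ($3^m > 2^n$, which is the same inequality) to avoid floating-point rounding.

```python
def ternary_level(n):
    """Smallest integer m with 3**-m < 2**-n, i.e. m > n*log(2)/log(3)."""
    m = 0
    while 3 ** m <= 2 ** n:
        m += 1
    return m


def storage_comparison(n):
    """Return (bits for the ternary Cantor set, bits for the full interval),
    using the estimates n * 2**m and n * 2**n from the text."""
    m = ternary_level(n)
    return n * 2 ** m, n * 2 ** n


level = ternary_level(8)                   # 6, since 3**5 = 243 <= 256 < 729 = 3**6
cantor_bits, line_bits = storage_comparison(8)   # 512 and 2048
```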


Note that the same considerations apply if we compare the storage space required for two self-similar Cantor sets. Suppose that such a set consists of two copies of itself scaled by the factor $r$ and we wish to store it to an accuracy of $n$ bits. We now need to consider the inequality $$r^m < 2^{-n}.$$ As is well known, though, the dimension $d$ is related to this scaling factor by the formula $$1/r = 2^{1/d}.$$
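
Combining the two displayed formulas gives the required depth directly in terms of the dimension. As a short worked step, using only the symbols $r$, $m$, $n$ and $d$ from above and substituting $r = 2^{-1/d}$: $$r^m = 2^{-m/d} < 2^{-n} \iff \frac{m}{d} > n \iff m > n\,d.$$ So the tree has to be computed to a depth of roughly $nd$ levels, and the storage estimate $n\times 2^m$ becomes roughly $n\times 2^{nd}$ bits. For the ternary set, $d=\log(2)/\log(3)$ recovers the earlier bound.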

For concreteness' sake, suppose we compare the storage required for a Cantor set of dimension $d_1 = 1/2$ with that for a set of dimension $d_2 = 1/4$. The second set is smaller and we expect it to require less storage space.

Note that the scaling factors for the two sets are different. For the first set it is $r_1 = 1/4$, since $$1/(1/4) = 4 = 2^{1/(1/2)}.$$ Thus, we're led to the inequality $$1/4^m<2^{-n}$$ or $m>n/2$. Thus, we need to compute the tree to level $n/2$. For the second set, $r_2 = 1/16$ and the inequality $$1/16^m<2^{-n}$$ leads to the solution $m>n/4$. Thus, we only need to compute the tree to level $n/4$.
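
The comparison can be checked numerically. The sketch below assumes the storage estimate of $n\times 2^m$ bits used earlier in this answer; `level_needed` and `storage_bits` are hypothetical helper names, and exact fractions are used so the strict inequality is tested without rounding.

```python
from fractions import Fraction


def level_needed(r, n):
    """Smallest integer m with r**m < 2**-n (strict inequality)."""
    target = Fraction(1, 2 ** n)
    m = 0
    while r ** m >= target:
        m += 1
    return m


def storage_bits(r, n):
    """Estimated storage: 2**m tree nodes at n bits each."""
    return n * 2 ** level_needed(r, n)


n = 16
half_dim = storage_bits(Fraction(1, 4), n)      # dimension 1/2, r = 1/4
quarter_dim = storage_bits(Fraction(1, 16), n)  # dimension 1/4, r = 1/16
```

With $n=16$ the strict inequality gives levels 9 and 5 (one more than $n/2$ and $n/4$), so the estimate is 8192 bits for the set of dimension 1/2 against 512 bits for the set of dimension 1/4.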

  • 0
    I split this comment into 2 parts due to the limit on the number of characters I'm allowed to post. The problem is that a Cantor set can have other dimensions. For example, if the central third is removed at each iteration, then the dimension is $\frac{\ln(2)}{\ln(3)}\approx 0.63$, but if the central half of each segment is removed, then the dimension is $\frac{\ln(2)}{\ln(4)}=0.5$. 2017-01-10
  • 0
    But despite having different dimensions, these Cantor sets are equivalent to each other (I'm not sure whether it is correct to say that they are "topologically" equivalent), in the sense that there is a bijection between each point in one set and the "same" point in the other. For every point in one set, there is a unique point in the other set occupying the "same" place in the tree. 2017-01-10
  • 0
    As a consequence, any encoding of a point in one Cantor set also encodes the "same" point in the other set, no matter what the dimension is. ![enter image description here](http://i.imgur.com/UYjyiDS.jpg) You proposed an example for the Cantor set with dimension $\frac{\ln(2)}{\ln(3)}\approx 0.63$, which encodes one point in that Cantor set (or a group of points within the error tolerance). Because of the bijection between points, that exact encoding also encodes the same point (or the same group of points) in the Cantor set with dimension 0.5 (or any dimension between 0 and 1). 2017-01-10
  • 0
    In fact, the encoding can be made just by storing the position in the tree. In the image above, the point is the one at "left, right, right", at level $m=3$, or the binary string $011$, and that works for any dimension. It also encodes any point at level $m>3$ between $0110$ and $0111$, within a precision of $0001$. 2017-01-10
  • 0
    That means that the length of the encoding does not depend on the dimension of the Cantor set. That is the problem. 2017-01-10
  • 0
    @cezudidu Again, the issue that you are not considering is the depth to which the tree representation of the Cantor set needs to be computed to store the set to a specified accuracy. This is related to the dimension of the set. I've edited my answer to illustrate the issue for a pair of Cantor sets. 2017-01-10
  • 0
    I accepted your answer. You clarified for me that, in integer dimensions, encoding a number to $n$ digits in base $\epsilon$ is the same as encoding it down to $n$ tree levels, where at each level the number of branches is multiplied by $\epsilon$. For example, the real line is commonly coded in base 10, which means that at each tree level exactly one decimal digit is added, encoding 1 out of 10 branches. 2017-01-12
  • 0
    To say "the same accuracy" in another dimension is to say "for the same $\epsilon$, no matter what the structure of the set is at that dimension", and by the definition of Hausdorff dimension, at that scale there are $\frac{\log(\epsilon^D)}{\log(\epsilon)}=D$ digits in base $\epsilon$ to be added to encode the point. I remain puzzled about encoding to perfect accuracy, which does not depend on the dimension (but requires a termination symbol for each point). 2017-01-12
  • 0
    @cezudidu I don't know what you mean by "perfect accuracy". Any computer has a finite memory and can only represent finitely many numbers. The sets you are trying to store are infinite. By "storage", I guess I'm assuming that you mean: "how much space is required to store the encodings of all the representable numbers in your set?" You simply can't encode all of them, so I don't see how you obtain "perfect accuracy". 2017-01-12