
I have a set of items. These items are (pseudo)randomly placed into buckets. The buckets are ordered, and the items within each bucket are ordered by insertion. After all of the items have been placed, the items are read back out: first the first item in the first bucket, then the second item in the first bucket, and so on until the first bucket is exhausted. Then the items are read from the second bucket, and so on.

For example, I have three buckets and the items a through f. a is inserted into bucket 2:

    1   2   3
        a

b into bucket 1

    1   2   3
    b   a

c into bucket 2

    1   2   3
    b   a
        c

and so on, until we end up with the following buckets:

    1   2   3
    b   a   d
    e   c
        f

The items are then read out as the list b, e, a, c, f, d.
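
To make the procedure concrete, here is a minimal sketch of it in Python (assuming each item independently goes into a uniformly random bucket; the names are just illustrative):

```python
import random

def bucket_shuffle(items, num_buckets, rng=random):
    """Drop each item into a uniformly random bucket (items keep their
    insertion order within a bucket), then read bucket 1 front to back,
    then bucket 2, and so on."""
    buckets = [[] for _ in range(num_buckets)]
    for item in items:
        buckets[rng.randrange(num_buckets)].append(item)
    return [x for bucket in buckets for x in bucket]

# The worked example above corresponds to the assignments
# a->2, b->1, c->2, d->3, e->1, f->2, which reads out as
# ['b', 'e', 'a', 'c', 'f', 'd'].  A random run:
print(bucket_shuffle(list("abcdef"), 3))
```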

A naïve examination suggests that items inserted earlier have a higher chance of being read out early, and items inserted later have a higher chance of being read out late. This makes me fairly certain that the resulting list has less entropy than shuffling the items with an algorithm like Fisher-Yates, but how do I calculate the difference in entropy?
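
For small $n$ and $m$, the exact entropy can be brute-forced by enumerating all $m^n$ bucket assignments, tallying how often each read-out permutation occurs, and computing the Shannon entropy of that distribution; a rough sketch in Python (names are illustrative, and the enumeration is only feasible for small inputs):

```python
import math
from collections import Counter
from itertools import product

def output_entropy(n, m):
    """Exact Shannon entropy (in bits) of the read-out order when n items
    are dropped independently and uniformly into m ordered buckets.
    Enumerates all m**n assignments, so only use small n and m."""
    counts = Counter()
    for assignment in product(range(m), repeat=n):
        buckets = [[] for _ in range(m)]
        for item, b in enumerate(assignment):
            buckets[b].append(item)
        counts[tuple(x for bucket in buckets for x in bucket)] += 1
    total = m ** n
    return -sum(c / total * math.log2(c / total) for c in counts.values())

n, m = 6, 3
print(output_entropy(n, m))           # entropy of the bucket shuffle
print(math.log2(math.factorial(n)))   # entropy of a uniform (Fisher-Yates) shuffle
```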

  • You're right, it's not uniformly random: for instance, assuming that the number of buckets is less than the number of items, the descending permutation (like `f e d c b a`) has zero probability of being the final list. (2011-07-14)
  • If your final list marks the buckets separately, e.g. you write it as `be, acf, d`, then each list corresponds to a unique map from the $n$ items to the $m$ buckets, so the entropy is easy to calculate: it's $\log n! \approx n\log n$ for the original case, versus $\log m^n = n\log m$ for your setup. But if you don't, then each list can correspond to multiple maps (e.g. `a b c e f d` may be either `a, bcef, d` or `ab, cef, d` or `abc, ef, d`) and the entropy is harder to calculate. It's definitely less than $n\log m$, though, as there are at most $m^n$ achievable permutations. (2011-07-14)
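
For the running example ($n = 6$, $m = 3$), the two benchmark figures from the comment above work out to roughly

$$\log_2 6! = \log_2 720 \approx 9.49 \text{ bits}, \qquad \log_2 3^6 = 6\log_2 3 \approx 9.51 \text{ bits},$$

and the entropy of the unmarked read-out list is strictly smaller than both, since fewer than $6!$ lists are achievable (the descending list, for one, never occurs).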

1 Answer