3
$\begingroup$

I ask this in a partly recreational, and partly research-related spirit, and I realize my problem might be ill-posed, so any suggestions for clarification might go a long way.

Succinctly, my problem can be stated as

Find the period of a "black box" pseudorandom number generator

but I suppose the following formulation might be more attractive:

Most programs for playing MP3 files (and MP3-playing devices as well) possess a "shuffle" feature that allows the player to play a list of songs in essentially "random" order.

If you run a music player set to shuffle long enough, you will note that sometimes the time interval between the instances your favorite song in the playlist can be short, or can be long.

Now, for a number of reasons, you can't take a look at what pseudorandom number generator your music player is using; you are however interested in how frequently your favorite song in the playlist would be played.

So, one now asks,

  1. Can you determine how often your favorite song in the playlist would be played based on the time intervals between the different instances your song is played?

  2. How long should your playlist be? Obviously, you can't conclude much if you only have two songs in your playlist...

  3. How long should you leave your music player playing to answer question number 1?

I should probably make the drastic assumption that all the songs in the playlist have the same length, e.g. 3 minutes or so.

  • 1
    Not necesssarily looking at the source code, but making some assumptions about the PRNG and how the music shuffle is chosen. For example you have not observed the same song playing twice in a row. There may well be code that asks for "a do over" if the random choice comes up the same as the last choice, or this "feature" may be implemented in another way. If, for example, it was observed that the same song is never played within three times (so long as the playlist has four or more songs), this would suggest the music player tracks the last three songs.2010-11-23

1 Answers 1

2

Here's a practical procedure, which assumes that the Shuffler is cyclic. Construct some hash function $H$ from $k$-tuples of songs to $n$ bits (the parameters $k,n$ will be decided upon later). Store the value of $H$ on the first $k$ songs, and look for a matching value of $H$ on a future $k$-tuple. If $k$ is large enough, then the expected number of occurrences of this $k$-tuple in the entire cycle is $<1$, and so its likely that a repeat means that you've reached the end of the cycle. If $n$ is big enough (larger than $\sqrt{N}$, where $N$ is the period), then it's probable that matching hashes means matching $k$-tuples.

In order to increase your certainty, you can store a much larger prefix, and then use it whenever the hash value matches. In fact, you can probably use several hash functions this way to optimize the procedure.

If you can't assume that the Shuffler is cyclic, but rather that from some point on it will cycle, you can store all the hashes in a hash table (along with their time) and wait for a hash that appears twice. If you're allowed to read the input twice, you can use any of the Pollard rho algorithms, as hardmath commented.