0
$\begingroup$

Suppose a set of $m$ integers from $0$ to $n-1$. The integers are uniformly distributed and unique in the set ($n \gg m$). Then, put all the integers into a list an sort that list: $x_0 < x_1 < \dots{} < x_{m-1}$

What is the probability distribution of $x_i$, and of $d_i := x_{i+1} - x_i$ ?

I stumbled on a very similar question, but it only discusses the minimal and average value of $d_i$. It does however present an interesting way to think of it:

Give a rope of size $n$, I cut it on $m$ places chosen uniformly at random (such that, in my case, no two cuts are done at the same spot). What is the probability distribution of the position of the $i^{th}$ cut ? What is the probability distribution of the size of the rope bits ?

I ran some simulations, and $d_i$ seems to have an exponential, maybe Gamma distribution (which is intuitive), and $x_i$ is bell-shaped (probably hypergeometric, which again is rather intuitive) but I cannot figure it out on paper. Any thoughts ?

  • 0
    The elements are also unique in the other question; that's what "without replacement" means.2012-11-30

1 Answers 1

2

If $x_i=x$, then $x_0,\dotsc,x_{i-1}\in\{0,\dotsc,x-1\}$ and $x_{i+1},\dotsc,x_{m-1}\in\{x+1,\dotsc,n-1\}$. There are $\binom xi\binom{n-x-1}{m-i-1}$ different choices, so the probability is

$ P(x_i=x)=\frac{\binom xi\binom{n-x-1}{m-i-1}}{\binom nm}\;. $

For the distribution of $d_i$, add another option $n$ and another unique integer $x_m$, consider the options $0$ to $n$ arranged in a circle, cut the circle by removing $x_m$ and renumber the remaining options beginning with $0$ next to the cut and ending with $n-1$ on the other side of the cut.

What remains is a linear arrangement distributed exactly as in your scenario, but the symmetry we had before we broke it by removing $x_m$ allows us to conclude that all $m+1$ runs of unused options are on the same footing and thus have the same distribution.

But the distribution of the first run of unused options, the ones less than $x_0$, is easy to determine: The probability that $x_0=d-1$ and the remaining $m-1$ numbers are greater is

$ P(d_i=d)=\frac{\binom{n-d}{m-1}}{\binom nm} $

for $1\le d\le n-m+1$ and $0$ otherwise.

  • 0
    Indeed, thanks for this great answer :-)2012-11-30