4
$\begingroup$

Let's say you generate 3 uniformly distributed, independent random numbers on the interval $[0,1]$. Now consider the lengths of the 4 segments made.

What is the probability that the sum of the two medium-length segments is greater than $ 0.5 $?

Example

Let the random numbers be $ 0.5 $, $ 0.3 $, $ 0.1 $. This cuts the interval like so:

|*|**|**|*****|

The sum of the medium-length segments is then $ 0.2 + 0.2 = 0.4 $.

Answer (numerical, no proof)

I ran a computer simulation of this and got the answer: spoiler.

However I can't seem to come up with a derivation of this.

  • 0
    That's an interesting way to hide a spoiler...2011-07-22
  • 0
    Thought of it on the spot ;)2011-07-22
  • 0
    @leonbloy: Not as I read the question. As I understand it, if the random numbers were $0.2,0.4,0.5$, the medium-length segments would be the first two, and the sum of their lengths would (still) be $0.4$. In other words, I take the ordering to be by actual length, not by position in the unit interval.2011-07-22
  • 0
    Yes, it's not clear if we are ordering the points (according to the values) or the segments (acording to their lenghts). OP?2011-07-22
  • 0
    @leonbloy: The formulation is clear, one is ordering the segments. (And ordering the points leads to a numerical value different from the one given by the OP).2011-07-22
  • 0
    @Didier: you're right on both accounts - i'll delete my answer2011-07-22
  • 0
    Just a hint: the generation of the 4 intervals, by throwing 3 points, is equivalent to generating 4 iid exponential and dividing by the sum. Then, the prob asked is equivalent to the probability that, given 4 iid exponentials, their ranked values satisfy $x_1+x_42011-07-22
  • 0
    @leonbloy, this remark indeed leads to a remarkably non computational solution, see my answer below.2011-07-24
  • 1
    @Didier: Yep, I'd already upvoted it :-) Another fact that might be interesting: the event is equivalent to throwing 4 iid uniforms in [0,1] and comparing the geometric mean of the extremes against that of the middle pair.2011-07-24

2 Answers 2

5

Let us try to compute this probability without actually evaluating any integral.

We begin with the remark that one can realize the three random numbers and the number $1$ itself as the four first points of a homogenous Poisson process. In other words, the lengthes of the four intervals are proportional to $(X_i)_{1\le i\le 4}$ where the random variables $X_i$ are i.i.d. and exponential of parameter $1$.

Introducing the order statistics $X_{(i)}$ of the sample $(X_i)$, one asks for $p=P(A)$ with $$ A=[X_{(2)}+X_{(3)}\ge X_{(1)}+X_{(4)}]. $$ Using the notation $X_{(0)}=0$, the waiting time paradox shows that the increments $$ Y_i=X_{(i)}-X_{(i-1)} $$ are independent for $1\le i\le 4$ and that each $Y_i$ is exponential of parameter $5-i$. Since the event of interest is also $$ A=[Y_2\ge Y_4], $$ the probability $p$ is also $$ p=P(Z\ge 3Z'), $$ where $Z$ and $Z'$ are i.i.d. and exponential of parameter $1$. In other words, $p=P(U\le1/4)$ with $$ U=Z'/(Z+Z'). $$ Since $U$ is uniformly distributed in the interval $(0,1)$, $p=1/4.$

Added later on More generally, throwing $n$ points in $(0,1)$ yields $n+1$ intervals. The ordered lengthes of these intervals are proportional to the order statistics $(X_{(i)})_{1\le i\le n+1}$ of an i.i.d. sample $(X_i)_{1\le i\le n+1}$ of exponential random variables. For each $i$, $X_{(i)}=Y_1+\cdots+Y_i$, where the random variables $(Y_i)_{1\le i\le n+1}$ are independent and the distribution of $Y_i$ is exponential with parameter $n+2-i$.

  • 0
    Thanks for the wonderfully simple and concise solution! I'll take a closer look at it later today.2011-07-24
  • 0
    @tskuzzy, thanks for the appreciation. Do not hesitate asking for precisions on some steps if needed, since conciseness is not always a virtue...2011-07-24
  • 0
    Why is it that the $ X_i $ are independent? If we have 2 segments, doesn't the length of the second depend completely on the length of the first? Also, could you elaborate on the how you got the distribution of $ Y_i $? I looked at the time waiting paradox but I don't see how your result follows from it. And finally, is the fact that $ U $ is uniform something that I should just know? Or did you have to derive that for this particular problem? Thanks again!2011-07-25
  • 0
    The keyword here is *proportional*: the lengthes of the consecutive subintervals of $(0,1)$ are distributed like $X_i/(X_1+\cdots+X_4)$, or equivalently, the position of the $i$th point is $(X_1+\cdots+X_i)/(X_1+\cdots+X_4)$. One can rediscover this by elementary computations, or look into any textbook on point processes, for example the one by Brémaud.2011-07-25
  • 0
    Distribution of $Y_i$: for a sample of size $2$, this **is** the waiting time paradox but you can also reprove it since $[Y_1\in dx,Y_2\in dy]$ is $[X_1\in dx,X_2\in x+dy]$ union $[X_2\in dx,X_1\in x+dy]$ and the exponential(2) product exponential(1) distribution follows. For a sample of size $n$, one decomposes $[Y_1\in dx_1,\ldots,Y_n\in dx_n]$ likewise. Try to do it, if there is a problem I will give more details.2011-07-25
  • 0
    Uniform as ratios of exponentials: here, one is entering the realm of beta distributions and yes, I think one ought to know this... :-) Anyway, every beta distribution with integer parameters is the distribution of a ratio $(X_1+\cdots+X_k)/(X_1+\cdots+X_{k+n})$ for i.i.d. exponentials $X_i$s. But here again one can rediscover this fact by an elementary computation (and one ought to perform this computation, and probably several times, if you ask what I think).2011-07-25
4

The lengths of the segments are uniformly distributed over the unit $3$-simplex (see Simulating uniformly on $S^1=\{x \in \mathbb{R}^n \mid \|x\|_1=1\}$). Thus, the desired probability is the fraction of the part of the unit $3$-simplex with $x_1\le x_2\le x_3\le x_4$ for which $x_2+x_3\ge\frac12$.

The unit simplex has volume $1/6$, so the restriction to one particular permutation of the coordinates leads to a volume of $1/6\cdot1/4!=1/144$. We can check this to make sure the integration bounds are set up correctly:

$$\int_0^{1/4}\int_{x_1}^{(1-x_1)/3}\int_{x_2}^{(1-x_1-x_2)/2}\mathrm dx_3\mathrm dx_2\mathrm dx_1=\frac1{144},$$

as computed here.

The integration bounds under the condition $x_2+x_3\ge\frac12$ are a bit trickier, since there are two possibilities, depending on $x_2$. For $x_2\ge\frac14$, all values of $x_3$ with $x_3>x_2$ fulfill the condition, whereas for $x_2<\frac14$ there's a new lower bound $\frac12-x_2$ for $x_3$. Thus we split the integral into two parts at $x_2=\frac14$, with different lower bounds for $x_3$:

$$\int_0^{1/4}\int_{x_1}^{1/4}\int_{1/2-x_2}^{(1-x_1-x_2)/2}\mathrm dx_3\mathrm dx_2\mathrm dx_1=\frac1{768},$$

$$\int_0^{1/4}\int_{1/4}^{(1-x_1)/3}\int_{x_2}^{(1-x_1-x_2)/2}\mathrm dx_3\mathrm dx_2\mathrm dx_1=\frac1{2304},$$

as computed here and here, respectively. So the desired probability is indeed

$$\frac{\frac1{768}+\frac1{2304}}{\frac1{144}}=144\cdot\frac{4}{2304}=\frac{144}{576}=\frac14\;.$$

  • 0
    Very nice! I thought of representing the lengths of the segments as a 3-simplex, but I didn't think that the distribution would be uniform. I was thinking of generalizing the problem to more than 4 segments and arbitrary sums. Your observation makes this general situation significantly more feasible to solve.2011-07-22