3
$\begingroup$

Suppose we have a row of $c\,W$ elements. We select half of them, $\displaystyle \frac{c\,W}{2}$. What is the probability that we have at least one run of at least $W$ selected elements, in terms of $c$ and $W$?

EDIT: $c$ and $W$ are both positive integers.

  • 0
    What is the distribution of the elements? Are they uniform? Normal? Do you shuffle the elements when you select them, or keep them in the original order? I'm asking because e.g. if they are such that x_i when i and you keep the original order, then the probability of a run of $W$ elements is 1 whenever $c\geq 2$.2011-06-10

1 Answers 1

1

The following assumes that $W$ and $a$ are positive integers, that one chooses uniformly at random $aW$ integers between $1$ and $2aW$, and that one asks for the probability of the event $A$ that the resulting set $S$ contains at least $W$ consecutive integers.

Let us compute the probability of the event $A_i$ that $S$ contains the $W$ consecutive integers starting from $i$. Then $A_i=\emptyset$ if $i\ge(2a-1)W+2$, otherwise $S$ is determined by its $(a-1)W$ elements not between $i$ and $i+W-1$. Hence $ P(A_i)=\frac{{(2a-1)W\choose (a-1)W}}{{2aW\choose aW}}=p(W). $ Similar formulas can be written for $P(A_i\cap A_j)$ for example, depending of $|i-j|$.

Since $A$ is the union of the events $A_i$, an exact formula for $P(A)$ is given by the exclusion-inclusion formula, and an easy upper bound is $ P(A)\le(2a-1)Wp(W). $ When $W\to+\infty$, Stirling formula yields $ p(W)=\sqrt{\frac{2a-1}{2a-2}}(C_a)^W(1+o(1)),\quad C_a=(2a-1)^{2a-1}2^{-2a}a^{-a}(a-1)^{1-a}. $ Roughly speaking, the error made in the uper bound is to count several times the segments of length greater than $W$ included in $S$. This remark and some heuristic arguments suggest that a discount of $\displaystyle\frac{a}{2a-1}$ should be applied to the upper bound, hence $ P(A)=a\sqrt{\frac{2a-1}{2a-2}}W(C_a)^W(1+o(1)). $