2
$\begingroup$

Suppose we have a $n$ balls in an urn labeled $1$ through $n$, and we draw balls without replacement. Suppose we draw a first ball , and then draw an additional $k$ balls uniformly at random without replacement. What is the expected number of balls which have a label larger than the label of the first ball?

3 Answers 3

3

If the first ball is number $i$, the urn is left with $i-1$ balls with smaller numbers and $n-i$ balls with larger numbers. The expected number in this case is clearly $k\frac{n-i}{n-1}$, provided that $k\le n-1$. The $n$ possible values of $i$ are equally likely, so the overall expected value is $$\frac1n\sum_{i=1}^n\frac{k(n-i)}{n-1}=\frac{k}{n(n-1)}\sum_{i=1}^n(n-i)=\frac{k}{n(n-1)}\sum_{i=0}^{n-1}i=\frac{k}{n(n-1)}\cdot\frac{n(n-1)}2=\frac{k}2\;.$$

You can see this even more easily by imagining that you draw the last $k$ balls first and then draw the ‘first’ ball: by symmetry it must on average be in the middle of the $k$ balls.

  • 0
    The expected number is not (at least to me) clearly $k \frac{n-i}{n-1}$, although I do see that the probability of drawing the first good ball is $\frac{n-i}{n-1}$. It's not immediately obvious to me that drawing without replacement doesn't change the expectation.2012-04-26
  • 0
    @Joe: You’re picking a randomly chosen $k$ of the remaining $n-1$ balls, and you might as well imagine picking all of them at once, since the draw is without replacement. On average you expect them to reflect the $i-1:n-i$ split in the urn, so you expect $\frac{i-1}{n-1}$ of them to be below $i$ and $\frac{n-i}{n-1}$ of them to be above $i$. Since you’re drawing $k$ of them, you just muliply those fractions by $k$.2012-04-26
  • 0
    When randomly selecting items from a population (either with or without replacement), the distribution of the $m$'th item selected is the same as that of the first. The conditional distribution of the $m$'th item, given the first, is the same as the conditional distribution of the second item, given the first.2012-04-26
6

Hint: what is the probability that ball #$j$ has a label larger than that of ball #$1$?

  • 0
    with replacement, the answer is easy; without replacement, it seems to depend on the labels of balls $2$ through $j-1$2012-04-26
  • 1
    No, it doesn't. The other balls are irrelevant. All that matters is that you're drawing two different balls (#$1$ and #$j$) and that the two possible orderings of any particular pair are equally likely.2012-04-26
  • 1
    @Joe: I suggest to make sure you understand this answer. It happens quite often that questions such as yours about expectation values can be answered using [linearity of expectation](http://en.wikipedia.org/wiki/Expected_value#Linearity) without having to worry about any of the complications caused by correlations between different draws. Note how much simpler Robert's solution is than the others.2012-04-26
  • 0
    @joriki ...with an indicator variable for each ball $j$, and the expected number of balls with labels greater than that of ball $1$ is the sum of the expectations of the indicator variables...2012-04-26
  • 0
    @Joe: Is that a question directed at me? If so, I'm afraid I don't understand it.2012-04-27
4

Let $X$ denote the number on the first ball. Then $$ \mathbb{P}(X=x) = \frac{1}{n} $$ Conditioned on the number on the first ball, the remaining $n-1$ balls are split into two sets, those below $x$, there are $x-1$ of these, and those above, and there are $n-x$ of those. The number $Y|X$ of those above $x$ in the sample of size $k$ drawn without replaced follows hypergeometric distribution $\operatorname{Hyp}(k, n-x, n-1)$ with mean $$ \mathbb{E}(Y|X) = \frac{k(n-x)}{n-1} $$ Thus $$ \mathbb{E}(Y) = \mathbb{E}(\mathbb{E}(Y|X)) = \sum_{x=1}^n \frac{1}{n} \cdot \frac{k(n-x)}{n-1} = \frac{k}{n(n-1)} \sum_{x=0}^{n-1} x = \frac{k}{2} $$

  • 0
    I knew there had to be a distribution for this. Bonus, the combinatorial definition of the distribution is even intuitive. http://en.wikipedia.org/wiki/Hypergeometric_distribution2012-04-26