1
$\begingroup$

we randomly pick $M$ elements from a set of $N$ real numbers, $A=\{ a_1,a_2,\ldots,a_N \}$. Then we sort these $M$ numbers, what is the expected value for the largest element? (let say $N=1000000$, $M=1000$), I am interested in the solution for general $A$, and/or the case that elements of $A$ are taken from a normal distribution.

  • 0
    Are the $M$ elements a subset of $A$ or are you sampling with replacement? For sampling with replacement, assuming that a_1 < a_2 < \cdots < a_N, $P\{\max = a_i\} = \left(\frac{i}{N}\right)^M - \left(\frac{i-1}{N}\right)^M$ from which you can find $E[\max] = \sum_{i=1}^N a_i\left[\left(\frac{i}{N}\right)^M - \left(\frac{i-1}{N}\right)^M\right].$2012-03-08

3 Answers 3

2

Note: Since you're concerned with subsets, I'm going to assume that you're choosing your elements without repetition. As per your question, I'll also assume that you're picking your elements with uniform probability (ie completely at random).

Adding a bit of notation, suppose that our chosen subset is $B=\{b_1,b_2,\cdots,b_M\}$ with $b_1. If we assume also that $a_1, then clearly $P(b_M, since the smallest $M$ elements that we can choose from $A$ are $\{a_1,a_2,\cdots,a_M\}$.

Now, $P(b_M=a_M)=\frac{1}{{N \choose M}}$, since the only possible such subset is $\{a_1,a_2,\cdots, a_M\}$.

To compute $P(b_M=a_{M+1})$, the only way this can happen is if $\{b_1,\cdots, b_{M-1}\}\subset \{a_1,a_2,\cdots, a_M\}$, hence $P(b_M=a_{M+1})=\frac{{M \choose M-1}}{{N \choose M}}$.

And by a similar argument, for $1\leq i \leq N-M$, $P(b_M=a_{M+i})=\frac{{M +(i-1) \choose {M-1}}}{{N \choose M}}$.

So, from there, the expected value is $\frac{1}{{N \choose M}}\sum\limits_{i=0}^{N-M}a_i {{M+(i-1)} \choose {M-1}}$

1

If you want to pick $M$ random numbers, you can pick the highest one $m$ in $N$ ways and then the rest in $\binom{m-1}{M-1}$: $\binom{N}{M} = \sum_m\binom{m-1}{M-1}\,,$ so the probability that $m$ is the maximum is $\binom{m-1}{M-1}\Big/\binom{N}{M}\,.$

  • 0
    Ah, of course. I didn't know the notation $(n \mbox{ over } k)$ was defined when n.2012-03-12
0

Hint 1: For every nonnegative random variable $X$, $\mathrm E(X)=\int\limits_0^{+\infty}\mathrm P(X\geqslant x)\mathrm dx$.

Hint 2: If $M=\max\{\xi_1,\xi_2,\ldots,\xi_n\}$, then, for every $x$, $[M\leqslant x]=\bigcap\limits_{k=1}^n[\xi_k\leqslant x]$.

  • 0
    well, I posted a new question, but it looks like others thought it was exactly same question, so it was deleted.. Anyways I appreciate your time and help with this question.2012-03-09