3

A premier B-school has 2009 students. The dean, a math enthusiast, asks each student to submit a randomly chosen number between 0 and 1. She then ranks these numbers in a list in decreasing order and decides to use the 456th largest number as the fraction of students that are going to get an overall pass grade this year. What is the expected fraction of students that get a passing grade?

I am not able to think in any direction, as I find the problem really difficult to comprehend.

  • 0
    I don't like the use of the word "fraction" here, unless the pupils are assumed to submit rational numbers... I suppose that's plausible too!2012-09-14
  • 0
    Isn't this a stochastic process?2012-09-14
  • 5
    Everyone should submit 1.2012-09-14
  • 3
    A more descriptive title would be useful.2012-09-14

2 Answers

0

A naive approximation would be: assume the $N=2009$ given numbers are equidistant within $[0,1]$. Then the $k$th smallest should be about $\frac{k-\frac12}N$, and hence the 456th largest = 1554th smallest $\approx \frac{1544-\frac12}{2009}=\frac{63}{82}$. But this reasoning is a bit hand-waving.

To be precise, you should evaluate, for $x\in[0,1]$: what is the probability $p(x)$ that exactly 1553 guesses are $< x$ and exactly 456 guesses are $\ge x$? Then the expected value we want is $\int_0^1 x\, p(x)\, dx$. Try this latter method for some smaller values of $N$ and $k$ and see how much it differs from the straightforward guess $\frac{k-\frac12}N$.
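Following that suggestion, here is a small sketch (assuming Python with NumPy is available; the helper name is just for illustration) that compares the naive guess $\frac{k-\frac12}N$, the exact order-statistic mean $\frac{k}{N+1}$ derived in the other answer, and a Monte Carlo estimate, for a few small $N$ and $k$:

```python
# Compare the naive equidistant guess (k - 1/2)/N with the exact mean k/(N+1)
# of the k-th smallest of N iid Uniform(0, 1) draws, plus a Monte Carlo check.
import numpy as np

rng = np.random.default_rng(0)

def mc_mean_kth_smallest(N, k, trials=200_000):
    """Monte Carlo estimate of E[k-th smallest of N iid U(0,1) draws], 1-indexed."""
    samples = np.sort(rng.random((trials, N)), axis=1)
    return samples[:, k - 1].mean()

for N, k in [(5, 2), (9, 7), (20, 15)]:
    naive = (k - 0.5) / N   # the equidistant-points guess
    exact = k / (N + 1)     # exact mean of the k-th order statistic
    mc = mc_mean_kth_smallest(N, k)
    print(f"N={N:2d} k={k:2d}  naive={naive:.4f}  exact={exact:.4f}  MC~{mc:.4f}")
```

For these small cases the two formulas differ by a few hundredths, and the difference shrinks as $N$ grows.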

  • 0
    +1 The approximated result of $\frac{63}{82} \approx 0.7683$ is only 0.6% off of the exact one $\frac{259}{335} \approx 0.77313$.2012-09-14
  • 0
    The probability $p(x)$ is not the correct density function. In fact it is too small by about a factor of $N$. Consider the case where $k=N$. Then your $p(x)$ is $x^{N-1}$ and this integrates to $1/N$ when it should be close to $1$.2012-09-14
  • 1
    $(k-\frac12)/N$ is not correct. If they're uniformly distributed and independent, then, on average, the $k$th-smallest one is $k/(N+1)$.2012-09-14
  • 0
    How is it $k - \frac12$?2012-09-14
  • 1
    How is the guess "straightforward"? If I pick five numbers between $0$ and $1$ and make them "equidistant", I could get $0.4, 0.401, 0.402, 0.403, 0.404$, but it's not plausible that that's what was meant. If I pick two numbers at random, why would it be more intuitively plausible that the three parts into which they divide the interval would be of unequal lengths than that they would be of equal lengths, so that the averages are $1/3$ and $2/3$?2012-09-14
  • 0
    +1 I agree with Sasha. His analysis considers the exact distribution of the 456th ordered value out of 2009 randomly chosen samples (iid) from U[0,1]. So he can compute the exact expected value for that order statistic after deriving its exact probability distribution. That taking a non-random uniform spacing of points (perhaps a somewhat intuitive approximation) comes that close to the exact answer is somewhat remarkable.2012-09-14
  • 0
    @Michael Hardy: Well, maybe "straightforward" was the wrong word; "simplistic" would have been better. My idea was rather to place $N$ equidistant points on a *circle* and then cut the circle apart in the middle of one of the created (equal) arcs.2012-09-14
7

This is a question on order statistics. Let $U_i$ denote independent random variables, uniformly distributed on the unit interval. The teacher picks the $m$-th largest, i.e. the $(n+1-m)$-th smallest, number in the sample $\{U_1, U_2, \ldots, U_n\}$, which is denoted $U_{n-m+1:n}$. It is easy to evaluate the cumulative distribution function of $U_{n-m+1:n}$ for $0 < u < 1$:
$$\begin{eqnarray} \mathbb{P}\left(U_{n-m+1:n} \leqslant u \right) &=& \sum_{k=n-m+1}^n \mathbb{P}\left( U_{1:n} \leqslant u, \ldots, U_{k:n} \leqslant u, U_{k+1:n} > u, \ldots, U_{n:n} > u \right) \\ &=& \mathbb{P}\left( \sum_{k=1}^n [ U_k \leqslant u] \geqslant n-m+1\right) \end{eqnarray}$$
where $[U_k \leqslant u]$ denotes the Iverson bracket: it equals 1 if the condition holds, and zero otherwise. Since the $U_k$ are independent, the $[U_k \leqslant u]$ are independent, identically distributed $0$-$1$ random variables with
$$ \mathbb{E}\left( [ U_k \leqslant u] \right) = \mathbb{P}\left(U_k \leqslant u\right) = u. $$
The sum of $n$ iid Bernoulli random variables is equal in distribution to a binomial random variable with parameters $n$ and $u$. Thus:
$$ F(u) = \mathbb{P}\left(U_{n-m+1:n} \leqslant u \right) = \sum_{k=n-m+1}^n \binom{n}{k} u^{k} (1-u)^{n-k}. $$
The mean can be computed by integrating the above, using integration by parts and the Beta integral $\int_0^1 u^{k} (1-u)^{n-k} \,\mathrm{d}u = B(k+1, n-k+1)$:
$$\begin{eqnarray} \mathbb{E}\left(U_{n-m+1:n}\right) &=& \int_0^1 u F^\prime(u) \,\mathrm{d}u = \left. u F(u) \right|_{u=0}^{u=1} - \int_0^1 F(u) \,\mathrm{d} u \\ &=& 1 - \sum_{k=n-m+1}^n \binom{n}{k} \int_0^1 u^{k} (1-u)^{n-k} \,\mathrm{d} u \\ &=& 1 - \sum_{k=n-m+1}^n \binom{n}{k} B(k+1, n-k+1) \\ &=& 1 - \sum_{k=n-m+1}^n \frac{n!}{k! \,(n-k)!} \cdot \frac{k! \,(n-k)!}{(n+1)!} \\ &=& 1 - \sum_{k=n-m+1}^n \frac{1}{n+1} = 1 - \frac{m}{n+1} \end{eqnarray}$$
Using $n=2009$ and $m=456$, the exact fraction equals:
$$ \left.\mathbb{E}\left(U_{n-m+1:n}\right)\right|_{n=2009,\,m=456} = 1 - \frac{456}{2010} = \frac{259}{335} \approx 0.77313 $$
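As a numerical sanity check of the closed form $1-\frac{m}{n+1}$, here is a short simulation sketch (assuming Python with NumPy; the trial count is arbitrary):

```python
# Check 1 - m/(n+1) for n = 2009, m = 456 against a direct simulation of the
# m-th largest of n iid Uniform(0, 1) draws.
import numpy as np
from fractions import Fraction

n, m = 2009, 456
exact = Fraction(n + 1 - m, n + 1)   # 1 - m/(n+1) = 1554/2010 = 259/335

rng = np.random.default_rng(1)
trials = 5_000
total = 0.0
for _ in range(trials):
    x = rng.random(n)
    # The m-th largest is the (n - m + 1)-th smallest; np.partition avoids a full sort.
    total += np.partition(x, n - m)[n - m]

print("exact      :", exact, "=", float(exact))   # 259/335 ~ 0.77313
print("Monte Carlo:", total / trials)
```

With a few thousand trials the Monte Carlo mean typically agrees with $\frac{259}{335} \approx 0.77313$ to about three decimal places.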

  • 2
    I like Hagen's answer but I think Sasha should get the checkmark for getting the exact answer. I realize that Hagen came first, but perhaps the OP should have waited to see what other answers might come up; Sasha's came only 22 minutes later. @ArpitBajpai you can change your decision if you are so inclined.2012-09-14
  • 0
    @Michael: I hadn't expected that the exact calculations could be done so nicely. I second the request to move the check mark here.2012-09-14
  • 0
    Hagen you are a really good sport about this! Nevertheless I think your answer was really clever.2012-09-14