9
$\begingroup$

Suppose $S$ is a set of $n$ points in a plane. A point is called maximal (or Pareto-optimal) if no other point in $S$ is both above and to the right of that point.

If each point in $S$ is chosen independently and uniformly at random from the unit square $[0,1]\times [1,0]$. What is the exact expected number of Pareto-optimal points in $S$?

  • 0
    Nice question, which received a nice answer.2012-11-29

2 Answers 2

5

Label the n points $p_1,…,p_n$ in descending order of their y coordinates. $p_1$ is maximal since it is above all other points, and its expected contribution to the total number of maximal points is therefore 1.

$p_2$ is above points $p_3,…,p_n$ but below $p_1$, and is therefore maximal iff it is to the right of $p_1$. Since the distribution of $p_1$ and $p_2$ in the x dimension is random and independent of that in the y dimension, it is equally likely that either one of them will be to the right of the other. The probability that $p_2$ is maximal is therefore $\frac 1 2$, and its expected contribution to the total number of maximal points is $\frac 1 2$ x 1 = $\frac 1 2$.

Now consider any point $p_m$. Generalising from the case of $p_2$, $p_m$ is above points $p_{m+1},…,p_n$ but below points $p_1,…,p_{m-1}$, and is therefore maximal iff it is to the right of all of $p_1,…,p_{m-1}$. Since the distribution of $p_1,…,p_m$ in the x dimension is random and independent of that in the y dimension, it is equally likely that any one of them will be to the right of all the others. The probability that $p_m$ is maximal is therefore $\frac 1 m$, and its expected contribution to the total number of maximal points is $\frac 1 m$ x 1 = $\frac 1 m$.

The expected number of maximal points is therefore 1 + $\frac 1 2$ … + $\frac 1 n = H_n$, the $n$th Harmonic number.

  • 0
    This seems so intuitive, I got to the 1/2 by myself but couldn't generalize, thank you for this @AdamBailey2017-10-29
2

Let $I_i$ be the indicator random variable denoting whether the ith point in $S$ is pareto optimal. Then the number of pareto-optimal points in $S$ is $M = \sum_{i=1}^{n} I_i$ and $E[M] = \sum_{i=1}^n E[I_i]$ by the linearity of expectation. Since $I_i$ is a binary valued random variable, we also have $E[I_i] = P(I_i = 1)$. Putting this together, we get $E[M] = \sum_{i=1}^n P(I_i = 1)$.

To evaluate $P(I_i=1)$, we need to find out the probability that there is no point above and to the right of the ith point. This can be done by first conditioning on the ith point's location within the unit square and then demanding that all the remaining $(n-1)$ points not lie in the smaller rectangle above and to its right. If the ith point is $(x_i,y_i)$, we want none of the remaining $(n-1)$ points in the rectangle of sides $1-x_i$ and $1-y_i$. The probability that a random point lies in this rectangle is simply its area and the probability that none of the $(n-1)$ points lie in this rectangle is therefore $(1-(1-x_i)(1-y_i))^{n-1}$.

\begin{eqnarray} P(I_i=1 \mid \text{ith pt} = (x_i,y_i)) &=& (1-(1-x_i)(1-y_i))^{n-1} \\ P(I_i=1) &=& \int_0^1 \int_0^1 (1-(1-x_i)(1-y_i))^{n-1} dx_i dy_i \\ &=& \int_0^1 \int_0^1 (1-uv)^{n-1} du dv \end{eqnarray}

This integral is pretty easy to compute. By holding $v$ constant, we can first evaluate the inner integral as

$\int_0^1 (1-uv)^{n-1} du = \frac{1}{nv} (1-(1-v)^{n})$

Making the substitution $v \rightarrow 1-v$, we get

\begin{eqnarray} P(I_i = 1) &=& \frac{1}{n} \int_0^1 \frac{1-v^n}{1-v} dv \\ &=& \frac{1}{n} \int_0^1 (1+v+\dots+v^{n-1}) dv \\ &=& \frac{1}{n} \left(1+\frac{1}{2}+\dots+\frac{1}{n}\right) = \frac{H_n}{n} \end{eqnarray}

Finally, we compute $E[M] = \sum_{i=1}^n P(I_i=1) = H_n$, where $H_n$ is the nth Harmonic number.