1
$\begingroup$

If you have N boxes each containing distinct number of balls and you are allowed to choose at most x of those boxes. You can open each box at most once and then decide whether you want to select the box or not before going to the next box. The final score would be the maximum (no. of balls) from all the boxes selected. What strategy can be followed to maximize the score?

A box can hold at most A balls and at least B balls. But it is not necessary that some box holds A or B balls.

1 Answers 1

3

There are iterative solutions to such stopping problems; one generally solves them by working backwards. For example, if you are at the last box and you have 1 more box to choose, you obviously choose the last box. If you are at the second-to-last box and you have one more box to choose (and the number of balls in each box is random and uniformly distributed in $[B,A]$), you take the second-to-last box if and only if it has more than $(A+B)/2$ balls. Otherwise you take the last box sight unseen. Etc.

Here I have ignored the fact that the boxes all have distinct numbers of balls, but that case is also relatively easy to solve by working backwards.

Update:

Let's first work this out fully where the boxes are not distinct. In that case there is a threshold function $f(k,n)$ defined on all $n \leq k \leq N$ where $k$ is the number of boxes left unopened and $n$ is the number of boxes left to pick, and there's an expected score function $e(k,n)$ which tells you how much you expect to get out of the last $k$ boxes with $n$ picks. You take the $k$th-from-last box iff it has at least $f(k,n)$ balls. $f$ and $e$ defined inductively by $f(n,n)=A$ (if there are $n$ boxes left of which you have to pick $n$, you take this box no matter what's in it), $e(n,n) = n*(A+B)/2$, and by the principle that you only want to take the $k$th-from-last box if you expect it to improve your final score. That is, $m + e(k-1, n-1) \geq e(k-1,n) \Leftrightarrow m \geq f(k,n)$ or $f(k,n) = e(k-1,n) - e(k-1, n-1)\;,$ where $m$ is the occupation of the $k$th-from-last box. You also have that

$\begin{align*}&e(k,n) =\\ &P(m\geq f(n,k)) *(e(k-1,n-1) + E[m| m\geq f(n,k)]) + (1-P(m\geq f(n,k))) * e(k-1,n)\end{align*}$ or $e(k,n) = \frac{A-f(n,k) +1}{A-B+1} * (e(k-1,n-1) + (A-f(n,k))/2) + \frac{f(n,k)-B}{A-B+1} e(k-1,n)\;.$

Now we move on to the distinct-boxes case.

Here, there is a threshold function $f(k,n,S)$ and an expected score function $e(k,n,S)$, defined on all $n \leq k \leq N$ and $S \subset [B,A] ,\space |S| = (A-B+1) - (N-k)$ where $k$ is the number of boxes left unseen, $n$ is the number of boxes left to pick and $S$ is the set of box occupation numbers you have not yet seen. If the $k$th-from-last box has at least $f(k,n,S)$ balls, you take it, otherwise you move to the next box.

This function is defined inductively by $f(n,n,S)=A$ (if there are $n$ boxes left of which you can pick $n$, you take all boxes no matter what's in them), $e(n,n,S) = n * E[S]$ where $E[S]$ is the average value of the elements of $S$, and by the principle that you only want to take the $k$th-from-last box if you expect it to improve your final score. That is, $m + e(k-1, n-1, S/\{m\}) \geq e(k-1,n,S/\{ m\}) \Leftrightarrow m \geq f(k,n,S)\;.$ Also once we've defined $f(k,n,S)$ we can define

$\begin{align*}&e(k,n,S) =\\&\frac{1}{|S|} \left( \sum_{m \in S, m\geq f(k,n,S)} (m + e(k-1,n-1, S/\{m\})) + \sum_{m \in S, m< f(k,n,S)} e(k-1,n,S/\{m \}) \right).\end{align*}$

These are all tricky to compute, but you can do it by working backwards and thinking of this like a tree. You use $e(k-1,n-1)$ and $e(k-1,n)$ to compute $f(k,n)$, then use all 3 quantities to compute $e(k,n)$. This is related, for example, to option pricing in financial markets.

  • 0
    Aside: The term for the technique Craig is using here is "dynamic programming."2011-10-21