3
$\begingroup$

Is there any way to simplify the following expression?

$$ \sum_{d = 1}^k \left(\sum_{i=1}^d \frac{1}{i}\right) \frac{{n-t \choose d}{t \choose k-d}}{{n \choose k}} $$

This formula comes from the expected number of record lows over the first $k$ elements in a permutation of $[1,n]$, given some minimum threshold $t$ below which the elements don't count as a record low.

Let $L$ be the number of record lows, and let $d$ be the number of elements above the threshold. $$ E[L] = E[E[L|d]] = \sum_{d=1}^k E[L|d]\cdot P(d) $$

where $P(d)$ is a hypergeometric distribution, with n-t success states, population size $n$, and $k$ draws.

This comes up for example in this question: Expected number of cards in the stack?

  • 1
    Shouldn't it be $t$ at the top right instead of $n$?2012-05-03
  • 0
    You can try multisum Zeilberger algorithms or Carsten Schneider's sigma package for Mathematica: http://www.risc.jku.at/research/combinat/software/Sigma/index.php2012-05-03
  • 0
    @joriki thanks for catching my typo2012-05-03

2 Answers 2

1

Forget about my other answer; this is actually much simpler than that. Think of the elements below the threshold as above a threshold above all other elements instead. The crucial thing to notice is that the events "the $j$-th element is the lowest of the first $j$ elements" and "at least one of the first $j$ elements is below the threshold" are independent. The event "the $j$-th element is a record low" is simply the intersection of these events, so its probability is their product, $1/j$ times $1-{s\choose j}/{n\choose j}$, and the expected number of record lows is just the sum of these probabilities:

$$\sum_{j=1}^k\frac1j\left(1-\frac{s\choose j}{n\choose j}\right)\;.$$

Simplifying Wolfram|Alpha's take on this leads to the "closed form"

$$H_k+\psi(s-n)-\psi(-n)+\frac{\binom s{k+1}\,_3F_2(1,k+1,k-s+1;k+2,k-n+1;1)}{(k+1)\binom n{k+1}}\;,$$

where $H_k$ is the $k$-th harmonic number, $\psi$ is the digamma function and $F$ is the generalized hypergeometric function.

  • 0
    The expected value of this result with $s$ distributed uniformly from $0$ to $n$ is just $H_{k+1}-1$; see [this answer](http://math.stackexchange.com/a/140969) to the preceding question for an explanation.2012-05-04
  • 0
    The $jth$ element is a record low if $it$ is below the threshold, not if at least one of the first $j$ elements is below the threshold.2012-05-04
  • 0
    @Joe: What I said was that it's a record low if it's the lowest of the first $j$ elements *and* at least one of the first $j$ elements is below the threshold. I don't see how what you wrote contradicts that. If an element is the lowest of the first $j$ elements and at least one of them is below the threshold, then necessarily that element is below the threshold.2012-05-05
2

I'm not sure whether this is the sort of simplification you're looking for, but you can determine the same expected number directly. The $j$-th element has probability

$$\frac1n\binom{n-1}{j-1}^{-1}\sum_{m=1}^d\binom{n-m}{j-1}$$

of being a record low, so the expected number is

$$\frac1n\sum_{j=1}^k\binom{n-1}{j-1}^{-1}\sum_{m=1}^d\binom{n-m}{j-1}\;.$$

Wolfram|Alpha gives a closed form for the inner sum:

$$\sum_{m=1}^d\binom{n-m}{j-1}=\frac1j\left(n\binom{n-1}{j-1}-(n-d)\binom{n-d-1}{j-1}\right)\;.$$

The first term gives the usual expected number of record lows for the case without a threshold, and the second term gives a correction:

$$\sum_{j=1}^k\frac1j-\left(1-\frac dn\right)\sum_{j=1}^k\frac{\binom{n-d-1}{j-1}}{j\binom{n-1}{j-1}}\;.$$

Note that the cases $d=0$ and $d=n$ come out right. Wolfram|Alpha also gives a "closed form" for the correction term, but it seems unlikely to help.

P.S.: I just noticed that the result can actually be considerably simplified. With the number $s=n-d$ of values below the threshold, we have

$$\sum_{j=1}^k\frac1j-\frac sn\sum_{j=1}^k\frac1j\frac{\binom{s-1}{j-1}}{\binom{n-1}{j-1}}=\sum_{j=1}^k\frac1j-\sum_{j=1}^k\frac1j\frac{\binom sj}{\binom nj}\;.$$

That should make us wonder whether there isn't a simpler way to derive that result.

  • 0
    This definitely helps. I don't fully follow the record low probability. ${n -1 \choose j -1}$ is the total number of choices for the first j -1 elements. ${n - m \choose j - 1}$ is the number of choices for the first $j - 1$ elements out of the $n - m$ elements that are below the threshold?2012-05-03
  • 1
    @Joe: $m$ runs over the possible above-threshold values of the $j$-th element, and ${n - m \choose j - 1}$ is the number of choices for the first $j-1$ elements that would make that value a record low.2012-05-03
  • 0
    That makes sense. So we have number of choices that would make element $j$ a record low divided by total number of choices. Then where does the $\frac{1}{n}$ come from?2012-05-03
  • 0
    @Joe: The probability of the $j$-th element to have that particular value.2012-05-03