2
$\begingroup$

When reading about the Coupon collector’s problem (CCP), I just came up with the following problem which I found very curious about, though it seems to have neither relation to the CCP nor practical value.

First, denote $S^{(k)}_n$ as a Stirling number of the second kind, a.k.a a number of ways to partition a set of $n$ labelled objects into $k$ nonempty unlabelled subsets. (This is the definition from the Wikipedia article: http://en.wikipedia.org/wiki/Stirling_number_of_the_second_kind)

The problem: Let $n$ be a fixed positive integer. Find the value of the positive integer $x$ such that the function $f(n,x) = \frac{S^{(n-1)}_{x-1}}{n^x}$ reaches its maximum.

Note: In fact, $n! f(n,x)$ is the probability of that the number of trials needed to collect all of the $n$ different coupons is $x$.

I haven’t had any ideas to solve the problem so far. (Actually, I had one, which was so naive and impractical) However, after several trials, I guess the function with the fixed $n$ will increase, reach its maximum and decrease to asymptotically zero while $x$ increases in the range of $\mathbb{Z^+}$.

Anyone has any ideas to solve the problem, please share. Thank you.

P.S: Sorry for my bad English.

1 Answers 1

2

Consider the condition $f(n,x+1)\gt f(n,x)$ for $f$ to be increasing in $x$. This is

$ \frac{S^{(n-1)}_x}{n^{x+1}}\gt\frac{S^{(n-1)}_{x-1}}{n^x}\;, $

or

$ S^{(n-1)}_x\gt nS^{(n-1)}_{x-1}\;. $

The Stirling numbers of the second kind satisfy the recurrence

$ S_x^{(k)}=kS_{x-1}^{(k)}+S_{x-1}^{(k-1)}\;, $

so with $k=n-1$ we can rearrange the inequality as

$ S^{(n-1)}_x-(n-1)S^{(n-1)}_{x-1}\gt S^{(n-1)}_{x-1} $

and then apply the recurrence to rewrite this as

$ S_{x-1}^{(n-2)}\gt S^{(n-1)}_{x-1}\;. $

This is the condition for the Stirling number of the second kind to be decreasing in $n$. In On Stirling Numbers of the Second Kind (Journal of Combinatorial Theory $7$, $116$$121$ ($1969$)), B. C. Rennie and A. J. Dobson showed that the maximum of $S^{(n)}_x$ with respect to $n$ occurs at $n$ asymptotic to $\frac x{\log x}$. By the above, this also determines the maximum of $f(x,n)$ with respect to $x$. With $n\sim\frac x{\log x}$, we have $x\sim n\log x\sim n\log n$, as we would expect from the expected number of coupons drawn in the coupon collector's problem.

It turns out that this approximation is actually very good, much better than the one for the maximum with respect to $n$, which is still off by a large factor for $n$ in the thousands. It seems that in the present case, rounding $n\log n-1$ to the nearest integer always yields the maximum $\pm1$ for $n\ge4$, and for $n\gtrsim60$ usually yields the exact maximum.

  • 1
    @VincentJ.Ruan: It's not all that weird if you consider that the maximum of $S^{(n)}_x$ with respect to $n$ is only "quotientially" asymptotic to $\frac x{\log x}$ (which is the standard meaning of "asymptotic" and "$\sim$"), and that the convergence of the quotient is quite bad -- so the difference between $n\log x$ and $n\log n$ is of the order of an already existing error. What I find surprising is that $n\log n$ approximates the mode so much better than does the mean $nH_n=n\log n+\gamma n+O(1)$.2016-06-10