
The formal problem is as follows: given a collection of $m$ natural numbers, each equally likely to be any integer from $1$ to $n$ (with $m > n$), what is the expected "effective number of natural numbers" in the collection?

(The "effective number of natural numbers" is calculated by taking the relative frequency of each natural number, squaring it, adding these square terms up, and then taking the reciprocal. It is also known as the "inverse Simpson index" or the "true diversity of order 2".)
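To make the definition concrete, here is a minimal Python sketch of the calculation described above. The function name `inverse_simpson` is just an illustrative choice, not something from a particular library.

```python
from collections import Counter


def inverse_simpson(sample):
    """Effective number of types: 1 / (sum of squared relative frequencies)."""
    counts = Counter(sample)   # how often each value occurs
    total = len(sample)        # m, the size of the collection
    return 1.0 / sum((c / total) ** 2 for c in counts.values())


print(inverse_simpson([1, 2, 3, 4]))  # four equally common values -> 4.0
print(inverse_simpson([1, 1, 1, 1]))  # a single value only -> 1.0
```

The index ranges from $1$ (one value dominates completely) up to the number of distinct values present (all equally common).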

I'm analysing diversity indexes and how to use them in "usage ranking algorithms" that split types into collections whose diversity is "natural" rather than "skewed" (the most skewed distributions possible being, e.g., one where every single type is represented exactly equally, or one where a single organism dominates). I thought this problem would be a good starting point.

If somebody can suggest a better approach, or point me to any resources on the notion of a "natural" or "average" diversity, I would be much obliged.

  • 0
    I would have thought something like $\dfrac{mn}{m+n-1}$ might not be too far away for large $m$ and $n$ if $E[1/ \sum f_i^2] \approx 1 / E[\sum f_i^2]$, where $f_i$ is the relative frequency of the value $i$. For $m$ much bigger than $n$ this approaches $n$, which is what you would get if each relative frequency was exactly $\frac{1}{n}$: you might regard that as perfectly diverse or extremely undiverse in its diversity. (2014-12-02)
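One way to check the suggested approximation is a quick Monte Carlo simulation. The sketch below assumes the $m$ numbers are drawn independently and uniformly from $1$ to $n$ (the question does not state independence explicitly), and the names `simulate_expected_index` and `approximation`, the trial count, and the seed are all illustrative choices.

```python
import random
from collections import Counter


def inverse_simpson(sample):
    """Effective number of types: 1 / (sum of squared relative frequencies)."""
    counts = Counter(sample)
    total = len(sample)
    return 1.0 / sum((c / total) ** 2 for c in counts.values())


def simulate_expected_index(m, n, trials=2000, seed=0):
    """Monte Carlo estimate of the expected index.

    Assumption: the m numbers are independent and uniform on 1..n.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        sample = [rng.randint(1, n) for _ in range(m)]
        total += inverse_simpson(sample)
    return total / trials


def approximation(m, n):
    """The value m*n / (m + n - 1) suggested in the comment."""
    return m * n / (m + n - 1)


m, n = 200, 10
print("simulated:    ", simulate_expected_index(m, n))
print("approximation:", approximation(m, n))
```

Since $x \mapsto 1/x$ is convex, Jensen's inequality gives $E[1/\sum f_i^2] \ge 1/E[\sum f_i^2]$, so the simulated value should sit slightly above the approximation, with the gap shrinking as $m$ grows.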

0 Answers