0
$\begingroup$

I'm optimizing a hash function mapping $M$ items into $N$ bins and I need a criterion for evaluating the quality of the mapping. Denoting the number of items put into bin $i$ by $x_i$, an ideal mapping would make each $x_i$ equal to either $\lfloor {\frac MN} \rfloor$ or $\lceil {\frac MN} \rceil$.

Currently, I'm using $\sum x_i ^ p$ with $p=3$ as my criterion. Are there simple closed form expressions for the expected value and variance assuming uniform random placement?

I might switch to another criterion, so I'm interested in formulas for them, too. I don't care about asymptotic expressions as the typical values are $10 < N < M < 1000$.

1 Answers 1

1

By symmetry and linearity of expectation, the expected value is just $N$ times the expected value of $x_i^p$ for one bin, that is,

$ N\sum_{k=0}^M\binom Mk\left(\frac1N\right)^k\left(1-\frac1N\right)^{M-k}k^3\;, $

which according to Wolfram|Alpha is

$ \frac M{N^2}\left(M^2+3MN+N^2-3M-3N+2\right)\;. $

for $p=3$. You can calculate the variance similarly, by expressing it as an expectation value, but the resulting expression will probably be rather complicated.

If you want to use a different value of $p$, just replace it in the W|A input.

  • 0
    Tha$n$k you for this helpful answer. Now it looks quite easy, $b$ut I could$n$'t hope to come to a usable expression in an acceptable time (and I didn't know that WA is that good).2012-10-04