1
$\begingroup$

I'm going to rewrite my original question to make it a bit clearer:

Assume I have some set $P$ with $||P|| = N$ unique elements. I also have $S$ multisets, $(m_1, ..., m_S)$, of cardinality $L$, consisting of elements in $P$ chosen with uniform probability. We call a multiset, $m_i$, 'distinguishable' if it contains at least $k$ elements, though not necessarily distinct elements, that exist in no other multiset.

What is the probability of all $M$ multisets being 'distinguishable' according to this definition?


While sampling $S$ times with replacement from the set $P$, we can state the probability of never choosing the same element twice as:

Prob( $S$ unique selections from $P$ ) = $\prod \frac{(N - i)}{N}$ for $i = 0$ to $(S - 1)$

Or equivalently, we can calculate the probability that the multiset of $S$ sampled elements contains all unique elements as:

Prob( $S$ unique selections from $P$ ) = $\prod ((1-(\frac{1}{N - i}))^{(S - 1 - i)})$ for $i = 0$ to $(S - 1)$

  • 0
    @Ross, when I accidentally dropped the a set of parentheses in my first writeup, the last term was [1 - (1/(N - i))^0], which, of course, does equal zero.2011-04-13

1 Answers 1

2

This seems like an interesting problem even for $k=1$, which I would solve before attacking the more general version.

Your first calculation for the probability that a multiset has distinct elements is correct, although you are using $P$ rather than $N$ to mean $\#P$.

Your second calculation is unjustified and incorrect. You can compare it with the correct one. Suppose $N=10$ and $\#S=4$. The probability that $4$ draws are distinct is $\frac{10}{10} \times \frac{9}{10} \times \frac{8}{10} \times \frac {7}{10} = \frac{504}{1000} = 0.504$. Your second expression says $(1-\frac{1}{10}^3) \times (1-\frac{1}{9}^2) \times (1-\frac{1}{8}^1) \times (1-\frac{1}{7}^0)$. That last term is $0$, which makes the whole product $0$. If we leave it out, we get $\frac{259}{300} = 0.86333...$ which is quite different from the correct value.

It looks like you use a product like the second one in the rest of your calculations, so whatever error led you to it seems to have propagated, although I am rather suspicious of some of the other steps, too.

  • 0
    My original expression was missing a set of parentheses, and should now be fixed...2011-04-13