3
$\begingroup$

I'm trying to estimate the cumulative number of distinct values when sampling with replacement from a changing population of integers over time. Concretely (and forgive my awful notation here), I'm assuming that there is some initial population ($m_0 = \{0,1,2,...n_0\}$) which experiences a constant rate $d$ at which members are removed from the pool and a constant rate $a$ at which members are added. Which members are removed is random, s.t. the size of a set at some time period can be written as:$|m_t| = |m_{t-1}| * (1 + a - d)$

I know that for some set time period the estimated number of distinct values I see when drawing $s$ samples from a population of size $p$ is one of the solutions to the Birthday Problem, i.e.: $p*(1-(1-1/p)^s)$ At this point, however, I'm a little stuck. I'm tempted to try to go the route of estimating the expected number of set members not seen in a cumulative time period, and I've also been encouraged to try to solve a more incremental version of the above problem as a stepping-stone (e.g. to try and solve the case where after each sample is taken the underlying population increases by one)

At this point I've spent a good couple of hours trying to find any leads on this and haven't had much success, so any and all pointers, help, etc. would be appreciated. Thanks!

  • 0
    @joriki: I've updated the notation around the size of $m_t$; sorry for the confusion, and I've updated the equation to correctly represent the growth/death paradigm.2012-12-06

0 Answers 0