I will be receiving a stream of items, and I know the sample size I need in advance. When an item arrives, I have to decide immediately whether it goes into the sample or not. I will not get a second chance to add or remove that item later. At the end of the stream, I should have a sample of the required size.
I looked at reservoir sampling, which can be distributed. However, its sample is only final once the whole stream has been seen, and a newly arriving item can evict an item that was already in the sample.
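To make the problem concrete, here is a minimal sketch of reservoir sampling as I understand it (the classic "Algorithm R"). The function name `reservoir_sample` and the `rng` parameter are my own choices for illustration, not from any particular library. The line marked "evicts" is exactly the step I cannot perform in my setting.

```python
import random


def reservoir_sample(stream, k, rng=random):
    """Return a uniform random sample of up to k items from an iterable."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # The first k items are always accepted.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the item survives with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                # Evicts an item that had already been accepted earlier.
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Hypothetical usage: sample 5 items from a stream of 100 integers.
    print(reservoir_sample(range(100), 5, random.Random(0)))
```

Note that the contents of `reservoir` keep changing until the stream ends, so no item's membership is settled at the moment it arrives.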
Is there an algorithm which works for my case?