2
$\begingroup$

I have a set $S=\{a_1,a_2,\dots,a_n\}$. The probability with which each of the element is selected is $\{p_1,p_2,\dots,p_n\}$ respectively (where of course $p_1+p_2+\cdots+p_n=1$).

I want to simulate an experiment which does that. However I wish to do that without any libraries (i.e. from first principles).

I'm using the following method:

1) I map the elements on the real number line as follows $X(a_1)=1$; $X(a_2)=2$;$~\dots$;$X(a_n)=n$.

2) Then I calculate the cumulative probability distribution function for each coordinate (i.e $P(x < X)$) as follows: $\mathrm{cdf}(x)= P(a_1) + P(a_2) + \cdots + P(a_i)$ such that $X(a_i) \le x < X(a_{i+1})$ (thus the cdf is a step function).

3) I randomly select a real number $q \in (0,1)$ and calculate the $x$-coordinate where the line $y = q$ intersects the cdf. Since the cdf is a step function with jumps at $1,2,\dots,n$ the point would have an integer $x$-coordinate between $1$ and $n$. Let the $x$-coordinate be $m$.

4) I select that $a_i$ such that $X(a_i) = m$.

My question is does this method simulate the experiment without any bias?

I'm not getting the required results, which is why I'm a bit skeptical.

Any help will be greatly appreciated! Thanks!

  • 0
    A simple way of getting the result you want is to choose $a_i$ if $q$ is in the $i$-th interval in the following list of $n$ intervals: $(0, p_1], (p_1, p_1+p_2], (p_1+p_2, p_1+p_2+p_3], \ldots, (1-p_n, 1]$ where that last $1-p_n$ is shorthand for $p_1+p_2+\cdots+p_{n-1}$2012-05-27

1 Answers 1