3
$\begingroup$

I know there is an easy way to sample standard mixture models if the mixture components can be easily sampled. But, I have a slightly different scenario and I'm wondering if there is an analogous method (or any method at all that is nice). I'll first explain the standard scenario so there is no confusion, and then explain how my problem is different. Sorry if some of this is too remedial, I just want my question to be completely clear.

A standard mixture model has pdf

$f_X(x) = \sum_i w_i f^i_X(x)$

where the $w_i$'s are non-negative and sum to one and the $f^i$'s are pdfs. A pretty important mixture model is mixture of Gaussians, where the $f^i$'s are normal distributions with different means and variances, i.e.,

$f_X(x) = \sum_i w_i \mathcal{N}(\mu_i, \sigma^2_i; x)$

I can easily sample this distribution with two samples. I'll assume there is a latent variable that governs which of the components generates the value of the random variable. Thus, I can choose $i$ with probability $w_i$ and then sample $\mathcal{N}(\mu_i, \sigma^2_i; \cdot)$ to generate my sample $\bar{x}$.

In my case, I have a slightly different mixture model

$f_X(x) = \sum_i w_i f^i_X(x) - \sum_j w_j f^j_X(x)$

Basically, some of the weights are negative so the components $w_j f^j$ subtract from the density at any $x$. Assume that $f_X$ is a pdf, so it is non-negative and integrates to one. Of course, there are methods for generating samples from arbitrary distributions, but it seems that computing the cdf for this one may be hard.

What is the best way to draw a sequence of samples from this distribution?

Thanks in advance for any help you can provide.

  • 0
    One can always assumes $f_X(x)$ by using the $f_X=f_X^+-f_X^-$ decomposition otherwise and adding terms to both sides of the mixture.2018-02-03

1 Answers 1

4

Since $\int f_X^i(x) dx = 1 \forall i$ and $\int f_X^j(x) dx = 1 \forall j,$ it is also required that $\sum_i w_i - \sum_j w_j = 1.$ Let us call $\sum_i w_i = a$, so we know that $\sum w_j = a-1$.

Here is the algorithm:

  1. Pick $i$ based on weights $w_i$.
  2. Sample y from $f_X^i(x)$.
  3. Calculate $p_a = \frac{\sum_i w_i f^i_X(x)- \sum_j w_j f^j_X(x)}{\sum_i w_i f_X^i(x)}$
  4. Sample $u$ from $Uniform(0,1)$.
  5. If $u < p_a$, then emit y, else reject this sample and go back to 1 to try another.

This is called acceptance-rejection sampling. For more information see the wikipedia article.

  • 1
    @RandomGuy, take a look at this handout: http://web.as.uky.edu/statistics/users/viele/sta695s02/rejimp/rejimp.html2011-03-01