
Consider the normal distribution. We know that $$p(x| \mu, \sigma^2) = \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(x-\mu)^{2}}{2 \sigma^{2}}} $$

The kernel is $$ p(x| \mu, \sigma^{2}) \propto e^{-\frac{(x-\mu)^{2}}{2 \sigma^{2}}} $$ omitting the part that isn't a function of $x$. Why write $p(x|\mu, \sigma^{2})$ like this?

  • 2
    The scaling factor can always be recovered because the pdf integrates to $1$, so there is no need to keep track of the scaling constants in front of the pdf. 2012-05-05
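As a rough numerical check of the comment above, the sketch below integrates the kernel and inverts the result to recover the scaling factor. The parameter values, the helper names `kernel` and `integrate`, and the truncation of the integral to $\mu \pm 12\sigma$ are all assumptions made for illustration; only the Python standard library is used.

```python
import math

def kernel(x, mu, sigma2):
    """Unnormalized normal density: only the part that depends on x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2))

def integrate(f, lo, hi, n=100_000):
    """Composite trapezoidal rule on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

# Hypothetical parameter values, chosen only for this example.
mu, sigma2 = 1.5, 4.0
sigma = math.sqrt(sigma2)

# Integrate the kernel over mu +/- 12 sigma; the mass outside is negligible.
area = integrate(lambda x: kernel(x, mu, sigma2), mu - 12 * sigma, mu + 12 * sigma)

# Since the pdf must integrate to 1, the scaling factor is 1 / area.
recovered_constant = 1.0 / area
exact_constant = 1.0 / math.sqrt(2.0 * math.pi * sigma2)

print(recovered_constant, exact_constant)
```

The two printed values should agree to many decimal places, which is the sense in which the kernel alone determines the full density.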

1 Answer


Presumably because one is only interested in comparing the values of $p(x)$ for different arguments $x$. For example, maximum likelihood estimation techniques require maximizing $p(x)$ over $x$, that is, finding $x^*$ such that $p(x^*)\geqslant p(x)$ for every $x$. If $p(x)=cr(x)$ with $c>0$ independent of $x$, this is equivalent to asking that $r(x^*)\geqslant r(x)$ for every $x$.
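The equivalence can be illustrated with a small sketch, here taking $r$ to be the normal kernel from the question and $c = 1/\sqrt{2\pi\sigma^2}$. The parameter values and the search grid are hypothetical, and a brute-force grid search stands in for whatever optimizer one would actually use.

```python
import math

def r(x, mu, sigma2):
    """Kernel: the part of the normal pdf that depends on x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2))

def p(x, mu, sigma2):
    """Full normal pdf, written as c * r(x) with c > 0 independent of x."""
    c = 1.0 / math.sqrt(2.0 * math.pi * sigma2)
    return c * r(x, mu, sigma2)

# Hypothetical parameter values and grid, chosen only for this example.
mu, sigma2 = 1.5, 4.0
grid = [-10.0 + 0.01 * i for i in range(2001)]  # points from -10 to 10

# Maximize the full pdf and the kernel separately over the same grid.
x_star_p = max(grid, key=lambda x: p(x, mu, sigma2))
x_star_r = max(grid, key=lambda x: r(x, mu, sigma2))

print(x_star_p, x_star_r)
```

Both searches select the same grid point, the one closest to $\mu$, because multiplying by a positive constant does not change which argument gives the largest value.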