The whole question is in the title. $p(x)$ is a probability distribution, and $h$ is continuous and monotonic in $p(x)$.
The purpose is to motivate that the "degree of surprise", or the "amount of information", gained on observing a value of a random variable $x$ with distribution $p(x)$ is proportional to $\ln p(x)$. The steps leading up to this are sketched in Bishop's "Pattern Recognition and Machine Learning", exercise 1.28; my question concerns the last step. A sketch of the setup as I understand it is below.
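To make the question self-contained, here is the setup as I read it from the exercise (the constant $c$ is my own notation, not Bishop's). We assume additivity over independent events together with continuity and monotonicity,
$$
h(p\,q) = h(p) + h(q),
$$
and the earlier parts of the exercise establish, first for rational and then by continuity for all real $x > 0$,
$$
h(p^{x}) = x\,h(p).
$$
The final part asks to show that these conditions force
$$
h(p) = c \ln p
$$
for some constant $c$.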
I can't see why this follows from a constructive point of view; maybe it's obvious? (Of course, verifying that $\ln p$ satisfies the required properties is trivial.)
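For completeness, by "trivial" I only mean the easy direction: for independent $x$ and $y$ with $p(x,y) = p(x)\,p(y)$,
$$
c\ln\bigl(p(x)\,p(y)\bigr) = c\ln p(x) + c\ln p(y),
$$
so $h(p) = c\ln p$ is additive for any constant $c$ (taking $c = -1$ gives the usual nonnegative information $-\ln p$). What I am asking about is the converse: why additivity plus continuity and monotonicity admit no other form.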