
I've been trying to understand the motivation for using the Jeffreys prior in Bayesian statistics. Most texts I've read online comment that the Jeffreys prior is "invariant with respect to transformations of the parameters", and then state its definition in terms of the Fisher information matrix without further motivation. However, none of them goes on to show that such a prior is indeed invariant, or even to define properly what is meant by "invariant" in the first place.

I like to understand things by approaching the simplest example first, so I'm interested in the case of a single binomial (Bernoulli) trial, i.e. the case where the support is $\{1,2\}$. In this case the Jeffreys prior is given by $$ \rho(\theta) = \frac{1}{\pi\sqrt{\theta(1-\theta)}}, \qquad\qquad(i) $$ where $\theta$ is the parameterisation given by $p_1 = \theta$, $p_2 = 1-\theta$.
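
As a quick check of $(i)$, here is a sketch of the computation, assuming the usual one-parameter definition $\rho(\theta)\propto\sqrt{I(\theta)}$ with $I$ the Fisher information (the outcome label $x$ is notation introduced only for this check): $$ I(\theta) = \sum_{x\in\{1,2\}} p_x \left(\frac{\partial \log p_x}{\partial\theta}\right)^2 = \theta\cdot\frac{1}{\theta^2} + (1-\theta)\cdot\frac{1}{(1-\theta)^2} = \frac{1}{\theta(1-\theta)}, \qquad \int_0^1 \frac{d\theta}{\sqrt{\theta(1-\theta)}} = \pi, $$ so dividing $\sqrt{I(\theta)}$ by $\pi$ gives the normalised density $(i)$.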

What I would like is to understand the sense in which this is invariant with respect to a coordinate transformation $\theta \to \varphi(\theta)$. To me the term "invariant" would seem to imply something along the lines of $$ \int_{\theta_1}^{\theta_2} \rho(\theta)\, d\theta = \int_{\varphi(\theta_1)}^{\varphi(\theta_2)} \rho(\varphi(\theta))\, d\varphi \qquad\qquad(ii) $$ for any smooth function $\varphi$, but it's easy enough to see that this is not satisfied by the distribution $(i)$ above (and indeed, I doubt there can be any density function that satisfies this kind of invariance for every transformation). So there must be some other sense intended by "invariant" in this context. I would like to understand this sense in the form of a functional equation similar to $(ii)$, so that I can see how it's satisfied by $(i)$.
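
The failure of $(ii)$ for the density $(i)$ can be checked numerically. The following is a minimal sketch, not a proof: it reads the right-hand side of $(ii)$ as the same function $\rho$ integrated between the transformed limits, and it uses the closed-form CDF $\frac{2}{\pi}\arcsin\sqrt{t}$ of $(i)$. The map $\varphi(\theta)=\theta^2$ and the limits $0.25$, $0.5$ are arbitrary choices made only for illustration.

```python
import math

def arcsine_cdf(t):
    """CDF of density (i): integral of 1/(pi*sqrt(s*(1-s))) over s from 0 to t."""
    return (2.0 / math.pi) * math.asin(math.sqrt(t))

def phi(t):
    # Hypothetical reparameterisation, chosen for illustration only.
    return t ** 2

theta1, theta2 = 0.25, 0.5

# Left side of (ii): mass that (i) assigns to [theta1, theta2].
lhs = arcsine_cdf(theta2) - arcsine_cdf(theta1)

# Right side of (ii): the same function rho integrated between phi(theta1) and phi(theta2).
rhs = arcsine_cdf(phi(theta2)) - arcsine_cdf(phi(theta1))

print(f"lhs = {lhs:.6f}, rhs = {rhs:.6f}")
```

The two sides come out at roughly $0.167$ and $0.172$, so $(ii)$ already fails for this simple choice of $\varphi$.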

Progress

As @did points out, the Wikipedia article gives a hint about this, by starting with $$ p(\theta)\propto\sqrt{I(\theta)} $$ and deriving $$ p(\varphi)\propto\sqrt{I(\varphi)} $$ for any smooth function $\varphi(\theta)$. (Note that these equations omit taking the determinant of $I$ because they refer to a single-variable case.) Clearly something is invariant here, and it seems like it shouldn't be too hard to express this invariance as a functional equation. However, the more I try to do this the more confused I get. Partly this is because a lot is left out of the Wikipedia sketch (e.g. are the constants of proportionality the same in the two equations above, or different? Where is the proof of uniqueness?), but mostly it's because it's unclear exactly what's being sought, which is why I wanted to express it as a functional equation in the first place.
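
The two proportionalities above can be compared numerically for the two-outcome model. This is a rough sketch under stated assumptions: the reparameterisation $\theta=\sin^2\varphi$ (that is, $\varphi=\arcsin\sqrt{\theta}$) and the helper `fisher_info` are illustrative choices, and the derivative of the outcome probability is approximated by a central finite difference. The sketch computes $\sqrt{I}$ directly in the $\varphi$ parameterisation and compares it with $\sqrt{I(\theta)}$ multiplied by $|d\theta/d\varphi|$, which is how a density transforms under a change of variables.

```python
import math

def fisher_info(prob_of_outcome_1, param, h=1e-5):
    """Fisher information of a two-outcome model at `param`.

    `prob_of_outcome_1` maps the parameter to p_1, with p_2 = 1 - p_1.
    The derivative of p_1 is approximated by a central finite difference.
    """
    p = prob_of_outcome_1(param)
    dp = (prob_of_outcome_1(param + h) - prob_of_outcome_1(param - h)) / (2 * h)
    # E[(d/dparam log p_x)^2] = dp^2 / p + dp^2 / (1 - p)
    return dp ** 2 / p + dp ** 2 / (1 - p)

def theta_of_phi(phi):
    # Illustrative reparameterisation: theta = sin^2(phi), phi in (0, pi/2).
    return math.sin(phi) ** 2

phi = 0.7                      # arbitrary test point
theta = theta_of_phi(phi)
dtheta_dphi = 2 * math.sin(phi) * math.cos(phi)

info_theta = fisher_info(lambda t: t, theta)   # model written in theta
info_phi = fisher_info(theta_of_phi, phi)      # same model written in phi

# sqrt(I(theta)) transformed as a density under the change of variables ...
transformed = math.sqrt(info_theta) * abs(dtheta_dphi)
# ... versus the square root of the Fisher information computed directly in phi.
direct = math.sqrt(info_phi)

print(transformed, direct)
```

In this example the two numbers agree up to finite-difference error, with no extra factor appearing between them; both are close to $2$, so in the $\varphi$ parameterisation $\sqrt{I(\varphi)}$ is constant on $(0,\pi/2)$.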

To reiterate my question, I understand the above equations from Wikipedia, and I can see that they demonstrate an invariance property of some kind. However, I can't see how to express this invariance property in the form of a functional equation similar to $(ii)$, which is what I'm looking for as an answer to this question. I want to first understand the desired invariance property, and then see that the Jeffreys prior (hopefully uniquely) satisfies it, but the above equations mix up those two steps in a way that I can't see how to separate.

  • 0
    Which part of the question is not dealt with [here](http://en.wikipedia.org/wiki/Jeffreys_prior#Reparameterization)? 2012-10-12
  • 0
    This should be posted as a comment rather than an answer, since it is not an answer. However, the link is helpful. To answer your question, the missing bit is the bit where I said "I'd like to understand this sense [of invariance] in the form of a functional equation similar to (ii), so that I can see how it's satisfied by (i)." Perhaps I can answer this myself now, but if you'd like to post a proper answer detailing it then I'd be happy to award you the bounty. 2012-10-12
  • 0
    Sorry but I absolutely completely do not care the least about bounties and points. My answer is written as it is because yes, I believe *you can answer this (your)self now*. 2012-10-12
  • 2
    Perhaps I can, but it seems not at all trivial to me. 2012-10-12
  • 2
    The comments on this question make no sense if you don't already know that @did's comment was originally an answer, which was deleted by a moderator and made into a comment, and that the following two comments were originally *comments on did's answer*. 2012-10-18
  • 0
    @Nathaniel: The question says that the Jeffreys prior in equation $(i)$ is improper, but I think this is incorrect. It has a finite normalizing constant ($\pi$), which means it is a proper prior, also known as the "arcsine distribution". 2013-09-13
  • 0
    @TylerStreeter huh, you're right. I must have just assumed it was improper without checking. I've changed the question. 2013-09-13

4 Answers