I'm confused by the Bayesian way of calculating the probability $\text{Pr}(\theta \in (\theta_1, \theta_2) | y )$, where $\theta$ is assumed to have a uniform prior distribution on $[0,1]$ and the observation $y$ is generated by a binomial sampling model (this example is taken from the textbook *Bayesian Data Analysis*). Using elementary rules of probability theory, we have
\begin{align} \text{Pr}(\theta \in (\theta_1, \theta_2) | y ) &= \frac{\text{Pr}(\theta \in (\theta_1, \theta_2), y )}{p(y)}\\ &= \frac{\int_{\theta_1}^{\theta_2}p(y|\theta)p(\theta)d\theta}{p(y)} \\ &= \frac{\int_{\theta_1}^{\theta_2}\binom{n}{y}\theta^y(1-\theta)^{n-y}d\theta}{p(y)}. \end{align}
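To make the formula concrete, here is how I evaluate it numerically; this is just a minimal sketch, and the values $n = 10$, $y = 3$, $(\theta_1, \theta_2) = (0.2, 0.6)$ are my own made-up choices, not from the book:

```python
# Numerical sketch of Pr(theta in (theta_1, theta_2) | y) with a uniform
# prior on [0, 1] and a binomial likelihood; the values below are made up.
from scipy.integrate import quad
from scipy.special import comb

n, y = 10, 3
theta_1, theta_2 = 0.2, 0.6

def joint(theta):
    # p(y | theta) * p(theta), with p(theta) = 1 on [0, 1]
    return comb(n, y) * theta**y * (1 - theta)**(n - y)

numerator = quad(joint, theta_1, theta_2)[0]  # Pr(theta in (theta_1, theta_2), y)
p_y = quad(joint, 0, 1)[0]                    # the denominator p(y)

print(numerator / p_y)                        # Pr(theta in (theta_1, theta_2) | y)
```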
I have two questions about the formula above:
- Why does $\text{Pr}(\theta \in (\theta_1, \theta_2), y ) = \int_{\theta_1}^{\theta_2}p(y|\theta)p(\theta)d\theta$ hold? What is wrong with the following way (a numerical comparison of the two expressions is sketched after this list)? \begin{align} \text{Pr}(\theta \in (\theta_1, \theta_2), y ) &= \text{Pr}(y | \theta \in (\theta_1, \theta_2))\text{Pr}(\theta \in (\theta_1, \theta_2))\\ &= \int_{\theta_1}^{\theta_2}p(y|\theta)d\theta \int_{\theta_1}^{\theta_2}p(\theta)d\theta. \end{align}
- How do I calculate the denominator, that is, \begin{align} p(y) = \int_0^1 \binom{n}{y}\theta^y(1-\theta)^{n-y}d\theta? \end{align} (Its numerical value is also computed in the sketch below, so a closed-form answer can be checked against it.)
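As a sanity check on both questions, I compared the two expressions for the joint probability numerically and also computed the denominator numerically (again with my made-up values, so this is only a sketch):

```python
# Numerical check for the two questions above, with the same made-up
# values n = 10, y = 3, (theta_1, theta_2) = (0.2, 0.6) as before.
from scipy.integrate import quad
from scipy.special import comb

n, y = 10, 3
theta_1, theta_2 = 0.2, 0.6

def likelihood(theta):
    # p(y | theta) under the binomial sampling model
    return comb(n, y) * theta**y * (1 - theta)**(n - y)

# Question 1: the book's expression vs. my attempted product form.
# With the uniform prior, p(theta) = 1 on [0, 1].
integral_lik = quad(likelihood, theta_1, theta_2)[0]
joint_book = integral_lik                        # int p(y|theta) p(theta) dtheta
joint_mine = integral_lik * (theta_2 - theta_1)  # int p(y|theta) dtheta * int p(theta) dtheta
print(joint_book, joint_mine)  # these disagree, so the product form cannot be right

# Question 2: numerical value of the denominator p(y), useful for
# checking a closed-form answer against.
p_y = quad(likelihood, 0, 1)[0]
print(p_y)
```

The two joint probabilities disagree, so I can see that my decomposition must be wrong; I just don't understand where the reasoning breaks down.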
Thanks!