
Imagine I have a biased coin such that the probability of heads is $P(H)=p$ and $P(T)=1-p$, with the outcomes of consecutive flips independent given $p$. If $W$ is the event of $p$ taking on a given value, and $F_1$ and $F_2$ are the results of the first and second coin flips, respectively, then the following holds by Bayes' rule (following this Cross Validated post):

$$P(W|F_1,F_2)=\frac{P(F_1,F_2|W)P(W)}{P(F_1,F_2)}=\frac{P(F_2|W,F_1)P(F_1|W)P(W)}{P(F_2|F_1)P(F_1)}.$$

To use this, however, I first need the value of $P(F_2|W,F_1)$. I understand that $P(W)$ is simply our prior, and that $P(F_1|W)$ is the probability of the event $F_1$ given the value of $p$ (e.g. if $F_1$ is tails and $p = 0.2$, then $P(F_1|W)=0.8$). Intuitively, it seems that $P(F_2|W,F_1)$ should always equal $P(F_2|W)$, since $F_1$ and $F_2$ are independent given $p$. Is this the case? How can I derive the value of $P(F_2|W,F_1)$?


Update: I believe I've produced a proof that $F_1$ and $F_2$ being conditionally independent given $W$ implies $P(F_2|W, F_1)=P(F_2|W)$:

Proof: Assuming $P(F_1, W) \neq 0$ (so that the conditional probabilities below are well-defined), $$P(F_2|F_1,W)=\frac{P(F_2,F_1,W)}{P(F_1,W)}=\frac{P(F_2,F_1|W)P(W)}{P(F_1|W)P(W)}=\frac{P(F_2,F_1|W)}{P(F_1|W)}=\frac{P(F_2|W)P(F_1|W)}{P(F_1|W)}$$

which equals $P(F_2|W)$.

Is this proof correct?
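To sanity-check the claim numerically, here is a short Python sketch (the two-point prior on $p$ is my own illustrative choice, not from the post): it builds the joint distribution over $(p, F_1, F_2)$ using conditional independence given $p$, computes $P(F_2 \mid W, F_1)$ from the joint exactly as in the proof, and also shows that $F_1$ and $F_2$ are *not* unconditionally independent.

```python
from itertools import product

# Illustrative two-point prior over the bias p (values chosen for the example).
prior = {0.2: 0.5, 0.8: 0.5}

def flip_prob(p, outcome):
    """P(flip = outcome | p), with outcome in {"H", "T"}."""
    return p if outcome == "H" else 1 - p

# Joint distribution over (p, F1, F2), built via conditional independence given p.
joint = {(p, f1, f2): prior[p] * flip_prob(p, f1) * flip_prob(p, f2)
         for p, f1, f2 in product(prior, "HT", "HT")}

def mass(pred):
    """Total probability of the outcomes (p, f1, f2) satisfying pred."""
    return sum(v for k, v in joint.items() if pred(*k))

# P(F2=H | p=0.2, F1=H), computed from the joint as in the proof:
cond = (mass(lambda p, f1, f2: p == 0.2 and f1 == "H" and f2 == "H")
        / mass(lambda p, f1, f2: p == 0.2 and f1 == "H"))
print(cond)  # ~0.2, i.e. equal to P(F2=H | p=0.2)

# Unconditionally, however, F1 and F2 are dependent:
p_f2 = mass(lambda p, f1, f2: f2 == "H")
p_f2_given_f1 = (mass(lambda p, f1, f2: f1 == "H" and f2 == "H")
                 / mass(lambda p, f1, f2: f1 == "H"))
print(p_f2, p_f2_given_f1)  # ~0.5 vs ~0.68: seeing F1=H makes F2=H more likely
```

The last two lines illustrate why the conditioning on $W$ matters: the flips are exchangeable but not independent once $p$ is unknown, since observing $F_1$ carries information about $p$.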

  • If $p$ has a discrete distribution, then a better approach might be $\Pr(p=p_i \mid F_1,F_2) =\dfrac{\Pr( F_1,F_2 \mid p=p_i)\Pr(p=p_i )}{\Pr(F_1,F_2)} =\dfrac{\Pr( F_1,F_2 \mid p=p_i)\Pr(p=p_i) }{\displaystyle \sum_j \Pr( F_1,F_2 \mid p=p_j)\Pr(p=p_j)}$ where conditional independence given $p$ gives $\Pr( F_1,F_2 \mid p=p_i)=\Pr( F_1 \mid p=p_i)\Pr(F_2 \mid p=p_i)$ (2017-01-05)
  • @Henry Ah, I follow now. I formed the equation as I did above because the example of Bayesian updating I saw used sequential updates: take the prior distribution, multiply by $P(F_1|W)$, renormalize, then multiply by $P(F_2|F_1, W)$, and so on. Is there a straightforward way to calculate $P(F_2|F_1, W)$? (2017-01-05)
  • The conventional, or easiest, way to think about it: start with a prior distribution for your random variable $p$, say $\Pr(p=p_i)$ for all $p_i$; update it with your first observation to get a posterior distribution $\Pr(p=p_i \mid F_1)$; then recycle this as the new prior and update it with your second observation $F_2$ to get the revised posterior $\Pr(p=p_i \mid F_1, F_2)$. (2017-01-05)
  • With independence, the first calculation is $\Pr(p=p_i \mid F_1) =\dfrac{\Pr( F_1 \mid p=p_i)\Pr(p=p_i ) }{\displaystyle \sum_j \Pr( F_1 \mid p=p_j)\Pr(p=p_j)}$ while the second calculation is then $\Pr(p=p_i \mid F_1,F_2) =\dfrac{\Pr( F_2 \mid p=p_i)\Pr(p=p_i \mid F_1) }{\displaystyle \sum_j \Pr( F_2 \mid p=p_j)\Pr(p=p_j \mid F_1)}$ (2017-01-05)
  • @Henry Excellent, thank you so much! That makes far more sense. (2017-01-05)
  • @Henry How might I prove the latter calculation? Under what circumstances does it hold, i.e. is it true iff $F_1$ and $F_2$ are conditionally independent given $p = p_i$? (2017-01-08)
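The sequential scheme described in these comments can be sketched in a few lines of Python (the two-point prior on $p$ is again an illustrative assumption): each step multiplies the current distribution by the likelihood of the new flip and renormalizes, and the result matches updating on both flips at once.

```python
def update(dist, outcome):
    """One Bayesian update of a discrete distribution over p, given one flip."""
    unnorm = {p: w * (p if outcome == "H" else 1 - p) for p, w in dist.items()}
    total = sum(unnorm.values())
    return {p: w / total for p, w in unnorm.items()}

prior = {0.2: 0.5, 0.8: 0.5}          # illustrative two-point prior on p

# Sequential: update on F1 = H, then recycle the posterior and update on F2 = T.
sequential = update(update(prior, "H"), "T")

# Joint: update once on both observations, using P(F1=H, F2=T | p) = p * (1 - p).
unnorm = {p: w * p * (1 - p) for p, w in prior.items()}
total = sum(unnorm.values())
joint_post = {p: w / total for p, w in unnorm.items()}

print(sequential)   # same posterior either way
print(joint_post)
```

That the two agree is exactly the content of the comment thread: conditional independence of the flips given $p$ lets the joint likelihood factor, so the product of per-flip updates equals the single joint update.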

0 Answers