I like doing it with tables, like here.
You first put your hypotheses into the columns

The column entries give the expected distribution of observations under each hypothesis (I guess that O stands for "Observation" in your question), where an observation may take the values (events) heads or tails. Every hypothesis about the coin's fairness is given the prior .5.
The results are read from the third table

where the rows show the distribution of hypotheses given the observation. You say that heads is observed, so read the first row.
We can add more rows

and see that the probability of fair given two observed heads is 1/5, since that is the value in the fair column of the hh row in table 3.
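
The same 1/5 can be checked by hand. This is only a sketch: it assumes the non-fair hypothesis is a coin that always lands heads, which is my reading of the numbers 1/3, 2/3 and 1/5 used in this answer, not something the tables above state in words.

$$
P(\text{fair}\mid hh)
=\frac{0.5\cdot 0.5^2}{0.5\cdot 0.5^2+0.5\cdot 1^2}
=\frac{0.125}{0.625}
=\frac{1}{5}
$$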

In your formula, however, instead of making two observations under the assumption of equal priors .5/.5, you make a single (second) observation with the priors 1/3 and 2/3 that you got from the first experiment. Enter the values

You see, you get .2 again in the last table.
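
To see that the two routes agree, here is a minimal Python sketch. The function name `update` and the biased coin's heads probability of 1 are my assumptions (the latter is inferred from the 1/3, 2/3 and 1/5 values); exact fractions are used so the comparison is not blurred by rounding.

```python
from fractions import Fraction

def update(priors, likelihoods):
    """One Bayes step: weight each likelihood by its prior, then normalize."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

half = Fraction(1, 2)
# P(heads | hypothesis) for [fair, biased]; "biased always lands heads" is an assumption.
p_heads = [half, Fraction(1)]

# Route 1: two separate updates, feeding the first posterior in as the new prior.
after_first = update([half, half], p_heads)
after_second = update(after_first, p_heads)

# Route 2: one update on the combined observation hh with the original .5/.5 priors.
batch = update([half, half], [p * p for p in p_heads])

print(after_first, after_second, batch)
```

Both `after_second` and `batch` come out with 1/5 for the fair coin, which is the .2 in the last table.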
The algorithm first converts the column distributions of the first table into the intermediate, joint distribution by multiplying every entry of the first table by its column weight (Bayesians call this the column or hypothesis "prior"). This step corresponds to the numerator of your formula.
In the second step, it produces the row distributions by dividing the entries of each row by the row total, which corresponds to the denominator of your formula. We can get back to the joint distribution by multiplying the row entries by the row totals (margins).
Effectively, the first table lets you focus on a chosen hypothesis (what the distribution of observations is, given that the column hypothesis is true), the second table gives the joint distribution you should be aware of, and the third table lets you focus on a chosen observation (a row) and consider the hypothesis probabilities given that observation $O_n$.
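
The two steps can be sketched in a few lines of Python. The function name `bayes_tables`, the dictionary layout and the example numbers are illustrative assumptions; in particular, the biased coin is taken to always land heads, which matches the 1/3 and 2/3 values above but is not spelled out in the tables.

```python
from fractions import Fraction

def bayes_tables(likelihood, priors):
    """likelihood[obs][hyp] = P(obs | hyp) (table 1); priors[hyp] = P(hyp)."""
    # Step 1: multiply every entry by its column prior -> joint table (table 2).
    joint = {obs: {h: p * priors[h] for h, p in row.items()}
             for obs, row in likelihood.items()}
    # Row totals (margins) are the probabilities of each observation.
    row_totals = {obs: sum(row.values()) for obs, row in joint.items()}
    # Step 2: divide each row by its total -> P(hyp | obs) (table 3).
    posterior = {obs: {h: v / row_totals[obs] for h, v in row.items()}
                 for obs, row in joint.items()}
    return joint, row_totals, posterior

half = Fraction(1, 2)
# Table 1 for a single toss; the "biased" column is an assumed always-heads coin.
likelihood = {
    "heads": {"fair": half, "biased": Fraction(1)},
    "tails": {"fair": half, "biased": Fraction(0)},
}
priors = {"fair": half, "biased": half}

joint, row_totals, posterior = bayes_tables(likelihood, priors)
print(posterior["heads"])
```

Multiplying each row of `posterior` by its entry in `row_totals` gives `joint` back, which is the return trip to the joint distribution mentioned above.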