The game goes like this:
There are 20 cards to begin with, among which, 12 cards are marked with 1, 3 cards are marked with +2 picks, 2 cards are marked with +3 picks, 2 cards are marked with x2, and the last card is marked with x3.
So, it is: 1,$1,$1,$1,$1,$1,$1,$1,$1,$1,$1,1,+2,+2,+2,+3,+3,x2,x2,x3
A player is allowed to randomly pick 5 cards from the shuffled deck. And if he picks the card such as +2, he is allowed to pick an extra 2 cards from the deck without replacement. And if he gets +2 or +3 again, he picks again, over and over. The overall prize a player can get is the product of cash prize sum and multiplier product.
e.g. a player might get this: 1,$1,$1,$1,+2,x2,x2 So he gets $4 x2 x2 = $16 (multipliers applied multiplicatively)
SO, what's the expected prize he can get?
This game cannot be modeled as a time homogeneous markov chain. Is there any model or any idea to solve this kind of problem rather than do a recursive dynamic programming on a pc?