Suppose two perfectly rational players (with their rationality common knowledge) play the following game. In each round they play a prisoner's-dilemma-type game with payoffs (1,1) for mutual cooperation, (0.01, 0.01) for mutual defection, and (1.01, 0) and (0, 1.01) for the cases where only player 1 or only player 2 defects, respectively. They may talk before every round.
After playing a round they flip a fair coin. If the result is tails, the game ends; if heads, they play another round with payoffs multiplied by $0.9^n$, where $n$ is the number of rounds already played.
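To make the setup concrete, here is a small Monte Carlo sketch of the game (the strategy interface and all names are mine, not part of the question). Under mutual cooperation the expected total payoff per player is $\sum_{n\ge 0}(0.5\cdot 0.9)^n = 1/0.55 \approx 1.82$, and the simulation should land near that value:

```python
import random

# Stage-game payoffs for (my move, opponent's move); C = cooperate, D = defect.
PAYOFF = {("C", "C"): 1.0, ("D", "D"): 0.01, ("D", "C"): 1.01, ("C", "D"): 0.0}

def play(strategy_a, strategy_b, rng):
    """Simulate one game; a strategy maps (own history, opponent history) to a move.

    After each round a fair coin is flipped: tails ends the game,
    heads continues with payoffs scaled by another factor of 0.9.
    """
    history_a, history_b = [], []
    total_a = total_b = 0.0
    n = 0  # rounds already played
    while True:
        a = strategy_a(history_a, history_b)
        b = strategy_b(history_b, history_a)
        scale = 0.9 ** n
        total_a += scale * PAYOFF[(a, b)]
        total_b += scale * PAYOFF[(b, a)]
        history_a.append(a)
        history_b.append(b)
        n += 1
        if rng.random() < 0.5:  # tails: the game is over
            return total_a, total_b

def always_cooperate(mine, theirs):
    return "C"

rng = random.Random(0)
trials = 100_000
avg = sum(play(always_cooperate, always_cooperate, rng)[0] for _ in range(trials)) / trials
print(avg)  # should be close to 1/0.55 ≈ 1.818
```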
I'm confused by the following two seemingly plausible arguments.

Argument 1. The game is clearly equivalent to the following one: flip a coin until you get tails, call the total number of flips $n$, and keep $n$ hidden from the players. The players then play $n$ rounds of the prisoner's dilemma with payoffs as before (that is, scaled by $0.9^{k-1}$ in round $k$). Each player reasons: "This game is nothing more than an iterated prisoner's dilemma where I don't know the number of rounds. With probability 1 that number is finite, and because the payoff is bounded we can ignore the probability-0 case where it is not. Since the iterated prisoner's dilemma with any fixed finite number of rounds has the unique Nash equilibrium 'always defect', my optimal strategy, regardless of $n$, is to defect." Thus both players always defect.
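To spell out the equivalence that Argument 1 invokes: the number of rounds $N$ (the number of flips up to and including the first tails) is geometric, and the expected payoff of any strategy pair is just the average over the finite games. Writing $u_k$ for the stage payoff in round $k$ (my notation):

```latex
\Pr[N = n] = \left(\tfrac{1}{2}\right)^{n}, \quad n \ge 1,
\qquad
\mathbb{E}[\text{payoff}] = \sum_{n=1}^{\infty} \left(\tfrac{1}{2}\right)^{n} \sum_{k=1}^{n} 0.9^{\,k-1}\, u_k .
```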
Argument 2. I tell my opponent that I am going to play a grim-trigger strategy: I cooperate until he defects, after which I defect forever. He replies that he is also playing grim trigger. Given that there is always a 50% chance of a next round worth 0.9 times the current one, defecting on any round gains me 0.01 now but costs me, in expectation, at least $0.99 \times 0.45 > 0.01$ on the next round alone. Thus I never have an incentive to defect, and neither does he, so we always cooperate.
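Argument 2's arithmetic, spelled out with the effective per-round discount factor $\delta = 0.5 \times 0.9 = 0.45$ (my notation):

```latex
\begin{align*}
\text{gain from a single defection} &= 1.01 - 1 = 0.01,\\
\text{expected loss from the next round alone} &\ge \delta\,(1 - 0.01) = 0.45 \times 0.99 = 0.4455,\\
\text{expected total future loss} &\ge 0.99 \sum_{k=1}^{\infty} \delta^{k} = 0.99 \cdot \frac{0.45}{0.55} \approx 0.81 .
\end{align*}
```

The comparison is the same at every stage, since the accumulated $0.9^{n}$ factor scales the gain and the loss equally.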
There is clearly a contradiction between these two arguments. I'm wary of the first argument's claim that if a hidden but well-defined random variable selects which of infinitely many games is played, and all of those games have the same move in their unique Nash equilibrium, then that move is also the Nash equilibrium move of the averaged game. (Though seductive, this reasoning comes close enough to infinity that I don't inherently trust it.) The second argument seems to rely on symmetry assumptions which I'm not sure are valid.
I'm curious about two questions. Which of the above arguments is fallacious, and where exactly is the fallacy? And what is the Nash equilibrium (or what are the Nash equilibria) of this game, and is it unique?