I have a problem that I have approached in two different ways, and the two approaches give me different answers. I have checked the answers against a simulation, so I think I know which one is right, but I'm not certain. Regardless, I don't understand why the two approaches do not agree (and, since at least one of them must be wrong, maybe both are). I would appreciate any insight: Is either method I used correct? If one or both are not, why not?
The problem: Assume there are 4 pairs of independent draws from a random variable $X$ with CDF $F(x) = (1-(1-x)^2)^2$ on $(0,1)$.
Now, take the minimum of each pair. Then take the maximum of the four minimums. Call that variable $X_m$. Now, take the other draw from the same pair as $X_m$. Call it $X_p$. Then, average $X_m$ and $X_p$. Call the random variable represented by that average $Y$. I am looking for $\mathbb{E}Y$.
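For reference, the experiment described above can be sketched as a short Monte Carlo simulation. This is only a minimal sketch, assuming the 8 draws are independent and using NumPy; the function names `sample_x` and `simulate_y` are made up for illustration. Draws from $F$ are generated by inverting the CDF: solving $u = (1-(1-x)^2)^2$ gives $x = 1 - \sqrt{1-\sqrt{u}}$.

```python
import numpy as np

def sample_x(rng, size):
    """Draw from F(x) = (1 - (1 - x)^2)^2 on (0, 1) by inverting the CDF."""
    u = rng.random(size)
    # Solving u = (1 - (1 - x)^2)^2 for x gives x = 1 - sqrt(1 - sqrt(u)).
    return 1.0 - np.sqrt(1.0 - np.sqrt(u))

def simulate_y(n_trials=500_000, seed=0):
    """Return arrays (x_m, x_p, y), one entry per simulated trial."""
    rng = np.random.default_rng(seed)
    draws = sample_x(rng, (n_trials, 4, 2))  # trials x pairs x draws per pair
    mins = draws.min(axis=2)                 # minimum of each pair
    maxs = draws.max(axis=2)                 # the other draw of each pair
    best = mins.argmax(axis=1)               # pair holding the largest minimum
    rows = np.arange(n_trials)
    x_m = mins[rows, best]
    x_p = maxs[rows, best]
    return x_m, x_p, 0.5 * (x_m + x_p)

if __name__ == "__main__":
    x_m, x_p, y = simulate_y()
    print("estimated E[X_m]:", x_m.mean())
    print("estimated E[X_p]:", x_p.mean())
    print("estimated E[Y]:  ", y.mean())
```

With 500,000 trials the standard error of the estimate of $\mathbb{E}Y$ is roughly `y.std() / np.sqrt(len(y))`, which gives a sense of how many decimal places of agreement are meaningful.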
Here is how I have approached it:
Method 1: $X_m$ is the max of 4 draws from the min of 2 draws from $X$. Thus, $F_m(x) = (1-(1-(1-(1-x)^2)^2)^2)^4$. Now, $X_p > X_m$ by definition. So, $F_p(x) = P[X_p < x \mid X_p > X_m] = \int_0^x \frac{F(x)-F(y)}{1-F(y)} f_m(y) \, dy$, where $f_m$ is the density of $X_m$. Then, I average $f_m(x)$ and $f_p(x)$ to get $f_y(x)$. This way, $\mathbb{E}Y = \int_0^1 y f_y(y) \, dy = 0.61278$.
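The Method 1 integral can be evaluated numerically as a cross-check. The sketch below is not the original calculation: it takes a shortcut, using the fact that the mean of the averaged density $\tfrac12(f_m + f_p)$ equals $\tfrac12(\mathbb{E}X_m + \mathbb{E}X_p)$, and it assumes, as Method 1 does, that given $X_m = y$ the partner $X_p$ is distributed as $X$ conditioned on $X > y$. The helper names (`F_m`, `mean_above`, `method1`) are illustrative. Writing $a = 1-y$, one has $1 - F(y) = 2a^2 - a^4$ and $\int_y^1 (1-F(t))\,dt = \tfrac23 a^3 - \tfrac15 a^5$, so $\mathbb{E}[X \mid X > y] = y + \dfrac{a(2/3 - a^2/5)}{2 - a^2}$.

```python
import numpy as np

def F(x):
    """CDF of X on (0, 1)."""
    return (1.0 - (1.0 - x) ** 2) ** 2

def F_m(x):
    """CDF of X_m, the max of four pair-minimums: (1 - (1 - F)^2)^4."""
    return (1.0 - (1.0 - F(x)) ** 2) ** 4

def mean_above(y):
    """E[X | X > y], using the closed form in a = 1 - y derived in the text."""
    a = 1.0 - y
    return y + a * (2.0 / 3.0 - a ** 2 / 5.0) / (2.0 - a ** 2)

def method1(n_grid=200_001):
    """Return (E[X_m], E[X_p], E[Y]) under the Method 1 assumptions."""
    grid = np.linspace(0.0, 1.0, n_grid)
    mid = 0.5 * (grid[:-1] + grid[1:])
    d_fm = np.diff(F_m(grid))                # probability that X_m falls in each cell
    e_xm = np.sum(mid * d_fm)                # E[X_m]
    e_xp = np.sum(mean_above(mid) * d_fm)    # E[X_p] = E[ E[X | X > X_m] ]
    return e_xm, e_xp, 0.5 * (e_xm + e_xp)

if __name__ == "__main__":
    e_xm, e_xp, e_y = method1()
    print("E[X_m] =", e_xm, " E[X_p] =", e_xp, " E[Y] =", e_y)
```

Summing against increments of $F_m$ avoids having to differentiate $F_m$ by hand to get $f_m$.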
Method 2: Given that there are 8 original draws from $X$, they can be labeled as 8 order statistics. Without changing the pairings, label the smallest value as $X_1$, the next smallest as $X_2$, etc. Now, $X_m$ by definition can only be one of $X_4$ to $X_7$, and $X_p$ can only be one of $X_5$ to $X_8$. Working out all possible ways the order statistics could have been randomly paired, we see that all values $X_4$ to $X_8$ are equally likely to occur in the set $\{X_m, X_p\}$. Thus, $Y$ is the average of the 4th through 8th marginal order-statistic distributions of 8 draws from $X$.
To get $f_y(y)$, I use the standard formula for the density of an order statistic: $f_y(y) = \frac{1}{5}\sum_{k=4}^{8}\frac{8!}{(k-1)!(8-k)!}F(y)^{k-1}(1-F(y))^{8-k}f(y)$. Then, $\mathbb{E}Y = \int_0^1 y f_y(y) \, dy = 0.590928$.
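The claim about which order statistics end up in $\{X_m, X_p\}$ can be checked empirically by tallying ranks. The following is a minimal sketch with a made-up function name (`rank_tally`); it draws uniform values rather than values from $F$, on the assumption that only the ordering matters for ranks, which holds for any continuous distribution.

```python
import numpy as np

def rank_tally(n_trials=200_000, seed=0):
    """Return (freq_m, freq_p): empirical rank frequencies of X_m and X_p.

    Entry k-1 of each array is the fraction of trials in which the variable
    was the k-th smallest of the 8 draws.
    """
    rng = np.random.default_rng(seed)
    flat = rng.random((n_trials, 8))
    # Double argsort turns values into ranks 0..7; add 1 for ranks 1..8.
    ranks = flat.argsort(axis=1).argsort(axis=1) + 1
    ranks = ranks.reshape(n_trials, 4, 2)    # columns (0,1), (2,3), ... form the pairs
    rank_min = ranks.min(axis=2)             # rank of each pair's minimum
    rank_max = ranks.max(axis=2)             # rank of each pair's other draw
    best = rank_min.argmax(axis=1)           # pair holding the largest minimum
    rows = np.arange(n_trials)
    r_m = rank_min[rows, best]
    r_p = rank_max[rows, best]
    freq_m = np.bincount(r_m, minlength=9)[1:] / n_trials
    freq_p = np.bincount(r_p, minlength=9)[1:] / n_trials
    return freq_m, freq_p

if __name__ == "__main__":
    freq_m, freq_p = rank_tally()
    for k in range(1, 9):
        in_set = freq_m[k - 1] + freq_p[k - 1]
        print(f"rank {k}: P(X_m)={freq_m[k-1]:.4f}  P(X_p)={freq_p[k-1]:.4f}  P(in set)={in_set:.4f}")
```

If the equal-likelihood claim holds, the last column should be close to $2/5$ for each of the ranks 4 through 8.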
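The Method 2 number can be reproduced numerically in the same spirit. This is a sketch under the assumption of 8 i.i.d. draws from $F$; instead of integrating the density formula directly, it uses the equivalent order-statistic CDF $F_{(k)}(x) = \sum_{j=k}^{8}\binom{8}{j}F(x)^j(1-F(x))^{8-j}$ and sums against its increments. The function names are illustrative.

```python
import numpy as np
from math import comb

def F(x):
    """CDF of X on (0, 1)."""
    return (1.0 - (1.0 - x) ** 2) ** 2

def order_stat_cdf(x, k, n=8):
    """CDF of the k-th smallest of n i.i.d. draws from F."""
    p = F(x)
    return sum(comb(n, j) * p ** j * (1.0 - p) ** (n - j) for j in range(k, n + 1))

def order_stat_mean(k, n=8, n_grid=200_001):
    """E[X_(k)] approximated by summing cell midpoints against CDF increments."""
    grid = np.linspace(0.0, 1.0, n_grid)
    mid = 0.5 * (grid[:-1] + grid[1:])
    return float(np.sum(mid * np.diff(order_stat_cdf(grid, k, n))))

def method2():
    """Return the means of X_(4)..X_(8) and their equally weighted average."""
    means = [order_stat_mean(k) for k in range(4, 9)]
    return means, sum(means) / 5.0

if __name__ == "__main__":
    means, avg = method2()
    for k, m in zip(range(4, 9), means):
        print(f"E[X_({k})] = {m:.6f}")
    print("equally weighted average:", avg)
```

Printing the individual means makes it easy to re-weight them with different probabilities if the equal weights turn out not to be justified.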
I created a simulation in Excel, where the average $y$ over 500,000 trials agrees to four decimal places with one of the answers above. I do not know, however, whether that means the method is correct (it depends partly on how good Excel's rnd() function is). Can anyone tell me why these methods do not produce the same result, which of them (if either) is correct, and where my reasoning errors are?