0

I want to come up with at least the expectation, and at best, the cdf, for a variable $Z$ that I think of as the result of a process and am not quite sure how to translate into equations.

Let $F(x) = x$, where $F(x)$ is the cdf defined over $(0,1)$ of random variable $X$ (uniform distribution).

Now let $G(y) = f1(F(y))$ where $f1$ in this case is just some function (I figure I don't need to write out the whole thing), with $Y\sim G(y)$, [EDIT: and $G$ is a valid cdf ($f1$ is a specific function that preserves cdf properties, I'm just using $f1$ for shorthand so I don't complicate the question with a long complicated function)].

Similarly, $H(w) = f2(F(w))$ and $W\sim H(w)$, [EDIT: and $H$ is a valid cdf ($f2$ is a specific function that preserves cdf properties, I'm just using $f2$ for shorthand so I don't complicate the question with a long complicated function)]

Now, I want to define a random variable $Z$ such that $Z$ is a weighted average of either a draw from $G$ (with probability $a$) or a draw from $H$ that is strictly greater than whatever the draw from $G$ was (with probability $1-a$).

So, in other words, I want to write out the distribution of $Z$ where $Z$ looks something like (I know this is not quite correct form) $Z= a Y + (1-a)(W|W>Y)$

Can anyone help me out?

Thanks so much!

  • 1
    What does $Y \sim G(y)$ mean in this instance? Is $G(\cdot)$ supposed to be the cumulative distribution function of $Y$? And if so, how can you be sure that $G$ _has_ the necessary properties? $F$ does by definition, but note, by the way, that $F(x) = 1$ for $x \geq 1$ and $F(x) = 0$ for $x < 0$, which you didn't include in your definition. But for a _generic_ function $f1$ (your word, not mine), why should we believe that $G(y)=f1(F(y))$ is a valid cdf? So something else is going on that you are not telling us about. 2012-02-14
  • 0
    @Dilip: I edited the question text, hopefully that clarifies things? Let me know if it still doesn't. 2012-02-14
  • 0
    @DilipSarwate: I think perhaps the word "generic" has a technical definition I wasn't familiar with? I just meant that I was using the notation f1 and f2 to replace longer, more complicated functions that I'm actually using, and yes, they preserve cdf properties. Hopefully the edits help. Thanks! 2012-02-14
  • 0
    @DilipSarwate : If $f1$ is an increasing function from $[0,1]$ to $[0,1]$ and $f1(x)$ approaches $0$ as $x$ approaches $0$ and $1$ as $x$ approaches $1$, then $G$ would be a cdf. However, I wonder if Jand intended $f_1$ rather than $f1$? 2012-02-14
  • 0
    @MichaelHardy Is there some specific significance for either $f1$ or $f_1$? I was just picking some random name... I could have called it "MyFunction" or "Q" or "Abracadabra"... I hope I didn't inadvertently choose something that has some conventional meaning. 2012-02-14
  • 0
    @MichaelHardy Yes, for suitable choices of $f1$ or $f_1$, we do get a CDF out of what Jand is doing. I just wanted *him* to say that he had considered the issue and was certain that $f1(F)$ or $f_1(F)$ was in fact a CDF. And no, Jand, neither $f1$ nor $f_1$ has a specific meaning to me, nor, I suspect, to Michael. 2012-02-15
  • 0
    @MichaelHardy If neither $f_1$ nor $f1$ has a specific meaning, then why did you write in your first comment that maybe I "intended $f_1$ rather than $f1$"? 2012-02-15
  • 0
    @DilipSarwate, Actually, I am a *her*, and thank you for making me be clearer on the issue. 2012-02-15

2 Answers

1

Call $f_Y$ the probability density function of $Y$, $f_W$ the probability density function of $W$, and $F_W$ the cumulative distribution function of $W$. Then the probability density function $f_Z$ of $Z$ is $$ f_Z(z)=af_Y(z)+(1-a)f_W(z)\int_{-\infty}^z\frac{f_Y(y)\mathrm dy}{1-F_W(y)}. $$ Equivalently, the cumulative distribution function $F_Z$ of $Z$ is $$ F_Z(z)=F_Y(z)-(1-a)(1-F_W(z))\int_{-\infty}^z\frac{f_Y(y)\mathrm dy}{1-F_W(y)}. $$
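As a rough numerical sanity check of the two formulas above, here is a minimal Python sketch. It rests on assumptions that are not in the question: $Y$, $W$ and the coin flip are taken to be independent, and the concrete choices $G(y)=y^2$ and $H(w)=w$ on $(0,1)$ stand in for the unspecified $f1$ and $f2$. The function names (`cdf_z`, `sample_z`, `mean_from_cdf`) are hypothetical. With these choices the integral has the closed form $\int_0^z \frac{2y}{1-y}\,\mathrm dy = 2(-z-\ln(1-z))$.

```python
import math
import random


def cdf_z(z, a):
    """F_Z(z) from the answer, specialised to the ASSUMED example
    G(y) = y^2 and H(w) = w on (0, 1)."""
    if z <= 0.0:
        return 0.0
    if z >= 1.0:
        return 1.0
    # Closed form of the integral of f_Y(y) / (1 - F_W(y)) = 2y / (1 - y) over (0, z).
    integral = 2.0 * (-z - math.log(1.0 - z))
    return z * z - (1.0 - a) * (1.0 - z) * integral


def sample_z(a, rng):
    """One draw of Z: Y with probability a, otherwise W conditioned on W > Y."""
    y = math.sqrt(rng.random())  # inverse-CDF draw from G(y) = y^2
    if rng.random() < a:
        return y
    # A uniform W conditioned on W > y is uniform on (y, 1).
    return y + (1.0 - y) * rng.random()


def empirical_cdf(samples, z):
    """Fraction of samples that are <= z."""
    return sum(s <= z for s in samples) / len(samples)


def mean_from_cdf(a, steps=10_000):
    """E[Z] as the integral of 1 - F_Z over (0, 1), using the midpoint rule."""
    h = 1.0 / steps
    return sum(1.0 - cdf_z((k + 0.5) * h, a) for k in range(steps)) * h


rng = random.Random(0)  # fixed seed so the run is reproducible
a = 0.3
samples = [sample_z(a, rng) for _ in range(100_000)]

# Simulated value next to the formula's value, for the CDF at 0.5 and for the mean.
print(empirical_cdf(samples, 0.5), cdf_z(0.5, a))
print(sum(samples) / len(samples), mean_from_cdf(a))
```

In each printed pair the simulated value and the formula's value should agree up to Monte Carlo noise. The same sketch can be adapted to other choices of $G$ and $H$ by replacing the inverse-CDF draw, the conditional draw of $W$, and the closed-form integral.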

  • 0
    Thanks, Didier! Just checking -- there's supposed to be an $a$ in the first term of the cdf equation, and since I am only defining the variables over $(0,1)$, I can change the integral range to be $0$ to $z$, right? And then the expectation of $Z$ would be $\int_0^1 (1-F_Z(z))dz$, correct? Thanks for being so helpful! 2012-02-14
  • 1
    If $Y$ and $W$ are almost surely in $(0,1)$, then yes, $f_Y$ and $f_Z$ are zero outside of $(0,1)$. // The first term on the RHS of the CDF equation **HAS NO FACTOR** $a$. 2012-02-14
  • 0
    I am so confused! Why isn't there a factor $a$? 2012-02-14
  • 0
    Because when one integrates the first displayed equation on $(-\infty,z)$ (something you surely did by now), one obtains the second displayed equation with no factor $a$. 2012-02-25
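For readers who want the integration in the last comment spelled out, here is a sketch, assuming $Y$ and $W$ are independent and have densities. Integrating the second term of $f_Z$ over $(-\infty,z)$ and swapping the order of integration gives

$$ \int_{-\infty}^z f_W(t)\int_{-\infty}^t\frac{f_Y(y)\,\mathrm dy}{1-F_W(y)}\,\mathrm dt=\int_{-\infty}^z f_Y(y)\,\frac{F_W(z)-F_W(y)}{1-F_W(y)}\,\mathrm dy=F_Y(z)-(1-F_W(z))\int_{-\infty}^z\frac{f_Y(y)\,\mathrm dy}{1-F_W(y)}, $$

where the last step uses $\frac{F_W(z)-F_W(y)}{1-F_W(y)}=1-\frac{1-F_W(z)}{1-F_W(y)}$. Therefore

$$ F_Z(z)=aF_Y(z)+(1-a)\left[F_Y(z)-(1-F_W(z))\int_{-\infty}^z\frac{f_Y(y)\,\mathrm dy}{1-F_W(y)}\right]=F_Y(z)-(1-a)(1-F_W(z))\int_{-\infty}^z\frac{f_Y(y)\,\mathrm dy}{1-F_W(y)}. $$

The two $F_Y(z)$ contributions, with weights $a$ and $1-a$, add up to a single $F_Y(z)$, which is why no factor $a$ survives in the first term.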
0

My apologies to Jand for referring to her as a male. Comments are restricted in length and I failed to write them in a gender-neutral way.

The question raised still seems very confusing to me. There is a totally useless random variable $X$ whose sole purpose seems to be to introduce $F$ which is the identity function from $(0,1)$ to $(0,1)$ since $F(x)=x$ for $x \in (0,1)$. Then, $G(y) = f1(F(y)) = f1(y)$ as far as I can tell, or at least for $y \in (0,1)$, where we are assured that $G$ is a CDF. Similarly for $H$. Now $Y$ and $W$ are defined as random variables with cumulative distribution functions $G$ and $H$ respectively. Their joint distribution is not specified but let us assume that they are independent. From these is constructed a new random variable $Z$, a function of $(Y,W)$ about which it is said that

$Z$ is a weighted average of either a draw from $G$ (with probability $a$) or a draw from $H$ that is strictly greater than whatever the draw from $G$ was (with probability $1−a$).

My interpretation is that there is yet another random variable $V$ which is Bernoulli$(a)$ and presumably independent of $Y$ and $W$, and we have

$$Z = Z(Y,W,V) = \begin{cases} Y, &\text{if}~ V = 1,\\ W, &\text{if}~ V = 0, ~\text{and}~W > Y,\\ ?? &\text{if}~ V = 0, ~\text{and}~W \leq Y. \end{cases}$$ In words, what happens if our biased coin came down Tails (this happens with probability $(1-a)$), meaning that we are supposed to use $W$, the draw from $H$, if possible, but using $W$ is not permitted because $W$ is not strictly larger than the draw $Y$ from $G$? Are $W$ and $Y$ drawn again and again till we end up with a $W > Y$?

  • 0
    No worries on the gender thing, I only felt compelled to correct since it was italicized, even though I know the reason for the emphasis was irrelevant to gender :) As for the question, let me see if I can try to explain more clearly. 2012-02-15
  • 0
    The situation is this: There is a set of $n$ variables. One variable is a draw from $G$. That variable is, by the definition of the process I'm using, the minimum of the set. The other $n-1$ variables are draws from $H$, but since we know that by definition the draw from $G$ is the minimum of the set, all the $n-1$ draws from $H$ are strictly greater than the first draw from $G$. The final variable $Z$ is drawn randomly from the $n$ options in the set (so $a = \frac{1}{n}$). Does this help? You are right that $F$ is probably superfluous information. 2012-02-15
  • 0
    I am trying to explain this as clearly as I can but I lack the appropriate training and terminology. Thank you for helping me figure this out. 2012-02-15