
I need somebody to help me understand the following concepts:

Assume $X$ and $Y$ are random variables (r.v.'s). What do the following represent?

  1. $P(X+Y|Y)$, what is this?
  2. $P(X+Y|Y=y)$, my understanding is that it is a r.v.
  3. $P(X+Y=s|Y)$, my understanding is that it is a r.v.
  4. $P(X+Y=s|Y=y)$, my understanding is that it is a number.

How do they relate to each other?

Also, in the case where $Y$ is a continuous r.v., consider $P(X+Y|Y=y)$; here $P(Y=y)$ is always zero. How can this conditioning be thought of?

Thanks a lot.

Edit

I haven't seen anyone answer this part of my question: in the case where $Y$ is a continuous r.v., consider $P(X+Y|Y=y)$; here $P(Y=y)$ is always zero. How can this conditioning be thought of? To give another example (in addition to the one I gave in the comments about the uniform distribution), consider standard Brownian motion. $\Pr(B_t\ge a|B_s=b)$ is clearly sensible and different from $\Pr(B_t\ge a)$ for $t\ge s$, yet $\Pr(B_s=b)$ is zero.

Also, in books on Markov chains the Markov property is stated as $\Pr(C_t|C_{t-1}, \ldots, C_1)=\Pr(C_t|C_{t-1})$. So there is the notation $P(X)$, or for that matter $P(X|Y)$. Is this notation short for $P(X=\text{any value}|Y=\text{any value})$?

  • @littleEinstein I thought you might be confused. If you weren't, I apologise. (2011-12-21)

2 Answers


In probability, we call outcomes $\omega$ (omega), and the set of all outcomes $\Omega$ (big omega). $P$, a probability measure, is a function that maps subsets of $\Omega$ into $[0,1]$; it must always be a function from a set into $[0,1]$. A random variable, by contrast, is a function on $\Omega$, not a subset of it, so what we really mean by $P(X)$ must be defined more carefully.

Probability is defined over events, not random variables. So rather than saying $P(X)$, you should be thinking of things like $P($coin lands heads$)$ or $P($it rains tomorrow$)$. In this case, $X$ refers to either a coin toss or the weather, and if we ever write $P(X)$, it's really just short for $P(X=x)$.

But $X=x$ is not a subset of $\Omega$ either, and what we really mean is $P(\{ \omega \in \Omega |X(\omega)=x\}),$ the probability measure of the set of all outcomes $\omega$ which the function $X$ maps to $x$. Since that is quite a long thing to write, people refer to it as $P(X)$ for short, understanding that it may take several different values depending on which value $X$ takes. $X$, in turn, depends on which value $\omega$ takes. Now to conditioning.
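This preimage definition can be sketched concretely. Below is a minimal illustration of my own (not from the answer): $\Omega$ is the set of ordered pairs of fair coin flips, $X$ counts heads, and $P(X=x)$ is then literally the measure of the preimage set. All names are assumptions.

```python
from fractions import Fraction
from itertools import product

# Toy sample space: ordered pairs of fair coin flips.
Omega = list(product("HT", repeat=2))           # [('H','H'), ('H','T'), ...]
P = {omega: Fraction(1, 4) for omega in Omega}  # uniform measure on Omega

# A random variable is a function on Omega: here, the number of heads.
X = lambda omega: omega.count("H")

def prob_X_equals(x):
    # P(X = x) := P({omega in Omega : X(omega) = x}),
    # i.e. the measure of the preimage set.
    return sum(P[omega] for omega in Omega if X(omega) == x)

print(prob_X_equals(1))  # 1/2
```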

Your intuition is correct: $P(X=x|Y)$ is a random variable, and $P(X=x|Y=y)$ is a number (replacing $X$ by $X+Y$ doesn't really change anything).
So consider tossing a die; then we can call an outcome a number between 1 and 6, that is, $\omega_1=1,\omega_2=2,\ldots,\omega_6=6$, and no others. If it's a fair die, then we'll have $P(\omega_i)=1/6$. For example, we can have $X=1$ if $\omega$ is even, and $X=0$ otherwise. Or $Y=2$ if $\omega$ is prime, $Y=1$ if $\omega$ is composite, and $Y=0$ otherwise. By definition
$P(X=x):=P($set of all $\omega$ which make $X=x)$, and
$P(X=x|Y=y):=P($set of all $\omega$ which make $X=x$ and $Y=y)/P(Y=y)$.

$$P(X=1|Y=2)=\frac{P(\{2,4,6\}\cap\{2,3,5\})}{P(\{2,3,5\})}=\frac{1/6}{1/2}=1/3$$
$$P(X=1|Y=1)=\frac{P(\{2,4,6\}\cap\{4,6\})}{P(\{4,6\})}=1$$
$$P(X=1|Y=0)=\frac{P(\{2,4,6\}\cap\{1\})}{P(\{1\})}=0$$
As you can see, $W=P(X=x|Y)$ is a random variable depending on what $Y$ equals, and we have
$$P(W=1/3)=P(Y=2)=P(\{2,3,5\})=1/2,$$
$$P(W=1)=P(Y=1)=P(\{4,6\})=1/3,$$
$$P(W=0)=P(Y=0)=P(\{1\})=1/6.$$
As you can calculate, $\mathbb{E}(W) = \sum_w w\,P(W=w) = P(X=x)$.

That is to say, the expectation of the conditional probability of $X$ is just the probability of $X$, for each value that $X$ takes.
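The die example above can be checked mechanically. Here is a minimal sketch using exact arithmetic via `fractions` (the function names are my own):

```python
from fractions import Fraction

# The die example: Omega = {1,...,6} with the uniform measure.
Omega = range(1, 7)
P = lambda A: Fraction(sum(1 for w in Omega if w in A), 6)

X = lambda w: 1 if w % 2 == 0 else 0   # X = 1 iff omega is even
def Y(w):                              # Y: 2 if prime, 1 if composite, 0 otherwise
    if w in (2, 3, 5): return 2
    if w in (4, 6):    return 1
    return 0

def cond(x, y):
    # P(X = x | Y = y) = P({X = x} and {Y = y}) / P({Y = y})
    A = {w for w in Omega if X(w) == x}
    B = {w for w in Omega if Y(w) == y}
    return P(A & B) / P(B)

# W = P(X=1 | Y) is a random variable; its expectation is P(X=1).
EW = sum(cond(1, y) * P({w for w in Omega if Y(w) == y}) for y in (0, 1, 2))
print(cond(1, 2), cond(1, 1), cond(1, 0), EW)   # 1/3 1 0 1/2
```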

  • @jalopezp: Did you see my edit? What is your thought on that? (2011-12-20)

Assuming that $X$ and $Y$ are numerical random variables with known distributions, and that $X$ takes discrete values, both (1) and (2) are shorthands for (3) and (4) that are potentially confusing and probably meaningless when considered carefully. Even $P(X)$ on its own is not particularly meaningful.

For (3), if it is not a shorthand for (4), you are correct that $P(X+Y=s|Y)$ is a random variable if $s$ is known: it is a function of $Y$, which is a random variable. It is a random function of $s$ when $s$ is unknown.

For (4), you are also correct that $P(X+Y=s|Y=y)$ is a number if $s$ and $y$ are known: it is a probability, so it must lie in the interval $[0,1]$. It is a function of $s$ and $y$ when they are unknown.
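To make (3) and (4) concrete, here is a small sketch under an extra assumption not in the answer: $X$ and $Y$ are independent fair dice. Then $P(X+Y=s|Y=y)$ collapses to $P(X=s-y)$: a number once $s$ and $y$ are fixed, and a function of $y$ (hence a random variable in $Y$) when only $s$ is fixed.

```python
from fractions import Fraction

# Assumption (mine, not the answer's): X and Y are independent fair dice.
pX = {k: Fraction(1, 6) for k in range(1, 7)}   # distribution of X

def cond_sum(s, y):
    # Under independence, P(X + Y = s | Y = y) = P(X = s - y).
    return pX.get(s - y, Fraction(0))

# For fixed s, the conditional probability is a function of y,
# i.e. a random variable through Y:
s = 4
table = {y: cond_sum(s, y) for y in range(1, 7)}
print(table)   # nonzero only where s - y lands in {1,...,6}
```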

  • The first part of your addition, about conditioning on a particular value of a continuous distribution, is an important but different question. The second part of your question is about another shorthand: $\Pr(C_t|C_{t-1}, \ldots, C_1)=\Pr(C_t|C_{t-1})$ is saying $\Pr(C_t=c_t|C_{t-1}=c_{t-1}, \ldots, C_1=c_1)=\Pr(C_t=c_t|C_{t-1}=c_{t-1})$ for all possible $c_i$ in the supports of the $C_i$. (2011-12-20)