I am having trouble with the notion of a random variable. Here is how I understand it so far:

When we talk about random variables, we work only with their CDFs or PDFs. We can do this because there is a theorem stating that for any CDF (satisfying some properties) we can construct a probability space and define a random variable on that space whose CDF is the same as the one we started with. Is that correct?

But what if we have some specific probability space (maybe based on an experiment or something else) $(\Omega, \mathcal{F}, P)$? It isn't even always possible to define a specific random variable on it, right?

So, the thing that I don't understand is: what is more realistic, or what happens more often? (a) We observe something that we try to interpret as a probability space, and then we try to define a random variable on it; or (b) we see some events or results that follow some law, declare that these are the values of a normally distributed random variable, and are not interested in its probability space. Or maybe something else?

I know that I am being very unclear, but there is something in these notions that I can't capture and that is hard for me even to formulate.

Another question that I have just thought of: is it possible to sum, for example, a normally distributed r.v. and a uniformly distributed r.v.? I am sure that the answer is yes, but to do so we would have to define a probability space and define both variables on that space. How do we do it?

Thanks!

  • Well, leaving things as they are without signalling more clearly that you are waiting for an answer to your second question (whose answer is unequivocally "yes", by the way) is kind of strange to me, but I guess this is not explicitly contrary to any official guideline of the site. (2013-03-01)

1 Answer


I think I understand your confusion! First, ignoring real-world data, the measure-theoretic theory of random variables is this:

A random variable on a probability space $(\Omega,\mathcal{F},P)$ is a real-valued function that is measurable with respect to the sigma-field $\mathcal{F}$. This is the definition I am familiar with, and it says nothing about CDFs or PDFs. The distribution function is then defined as $F(x)=P[X\leq x]$.
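For concreteness, here is about the simplest example of this definition (my illustration, not part of your question): a single fair coin toss, $\Omega=\{H,T\}$, $\mathcal{F}=2^{\Omega}$, $P(\{H\})=P(\{T\})=\tfrac{1}{2}$, with $X(H)=1$ and $X(T)=0$. Measurability is automatic because $\mathcal{F}$ contains every subset of $\Omega$, and the distribution function is the step function
$$F(x)=\begin{cases}0, & x<0,\\ \tfrac{1}{2}, & 0\leq x<1,\\ 1, & x\geq 1.\end{cases}$$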

So every random variable has a distribution function, since $F$ is built directly out of $X$ and $P$. You are correct about the converse:

If $F$ is non-decreasing and right-continuous, with $\lim_{x\to-\infty}F(x)=0$ and $\lim_{x\to\infty}F(x)=1$, then there exists, on some probability space, a random variable $X$ for which $F(x)=P[X\leq x]$.
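The standard proof is constructive, and the construction itself answers the "where does the probability space come from" part of your question. Take $\Omega=(0,1)$ with the Borel sigma-field and Lebesgue measure $P$, and define the quantile function
$$X(\omega)=\inf\{x\in\mathbb{R}:F(x)\geq\omega\}.$$
Right-continuity of $F$ gives $X(\omega)\leq x \iff \omega\leq F(x)$, so $P[X\leq x]=P\big((0,F(x)]\big)=F(x)$, as required.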

In terms of reconciling real-world data with the machinery of measure-theoretic probability: the measure theory is the nuts and bolts that allow the mathematics to work. However, it is perfectly fine to say "my random variable $X$ is normally distributed" and proceed from there without ever mentioning a probability space. Many applications of probability theory need only the workings of the PDF and/or CDF, so there is no need to keep restating the large amount of theory that justifies such statements.
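To make the "work directly with the CDF" point concrete, here is a minimal Python sketch of my own (using numpy and scipy, neither of which appears in the question): it computes a probability straight from the CDF, and then simulates via exactly the quantile-function construction from the theorem above, with uniform draws on $(0,1)$ playing the role of the points $\omega\in\Omega$.

```python
import numpy as np
from scipy.stats import norm

# Probability computed straight from the CDF: P[a < X <= b] for X ~ N(0, 1).
# No probability space appears anywhere in this calculation.
a, b = -1.0, 1.0
print(norm.cdf(b) - norm.cdf(a))               # ~0.6827

# The existence theorem in action: Omega = (0, 1) with Lebesgue measure,
# X(omega) = inf{x : F(x) >= omega}, i.e. the quantile function norm.ppf.
rng = np.random.default_rng(seed=0)
omega = rng.uniform(0.0, 1.0, size=100_000)    # "points" of Omega
x = norm.ppf(omega)                            # the values X(omega)
print(np.mean((x > a) & (x <= b)))             # empirical estimate, ~0.6827
```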

So, for most applied purposes anyway, people try to specify a distribution function that is a plausible model for their data, rather than trying to specify a probability space; the probability space sits several steps further back in the theory.

  • @AndréCaldas Thank you, André, for pointing this out to me. I have not yet covered this material, but a quick glance at the theorems I have on convolution suggests that the theorems do not care about the probability space the two r.v.s are defined on. Thus two r.v.s can be summed regardless of the probability space they are defined on? This would then address the second question of grozhd. (2012-09-24)
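For the record, here is how the sum in grozhd's second question is made rigorous (this is the standard product-space construction, spelling out what the convolution theorems quietly assume; it is not part of the comment above). If $X\sim N(0,1)$ lives on $(\Omega_1,\mathcal{F}_1,P_1)$ and $Y\sim U(0,1)$ on $(\Omega_2,\mathcal{F}_2,P_2)$, form the product space $(\Omega_1\times\Omega_2,\ \mathcal{F}_1\otimes\mathcal{F}_2,\ P_1\times P_2)$ and set $\tilde X(\omega_1,\omega_2)=X(\omega_1)$ and $\tilde Y(\omega_1,\omega_2)=Y(\omega_2)$. These copies have the same distributions as $X$ and $Y$, are independent by construction, and are defined on one common space, so $\tilde X+\tilde Y$ makes sense. Its density is the convolution
$$f_{X+Y}(z)=\int_0^1\varphi(z-y)\,dy=\Phi(z)-\Phi(z-1),$$
where $\varphi$ and $\Phi$ denote the standard normal PDF and CDF.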