
Let $X$ be a random variable that follows, say, the binomial distribution. Then $P(X = x)$ is the probability of getting $x$ successful trials in $n$ total trials.
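For concreteness, here is a tiny Python sketch of what I mean; the values $n = 10$ and $p = 0.3$ are arbitrary choices of mine, using SciPy's `binom.pmf`:

```python
# Sketch: tabulate P(X = x) for a Binomial(n, p) random variable.
# n = 10 and p = 0.3 are arbitrary illustration values.
from scipy.stats import binom

n, p = 10, 0.3
for x in range(n + 1):
    print(f"P(X = {x}) = {binom.pmf(x, n, p):.4f}")
```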

Then I saw a notation for the mean of random variables that made me feel sceptical about my understanding of all the notation I have learned. So here is my understanding of the notation:

When it says the expectation of $X$, $E(X)$, does it mean that over a long run, $E(X)$ is the likely number of successful trials? In other words, is the expected value of $X$ the number of successful trials we would expect in the long run?

When it says the variance of $X$, $Var(X)$, does it mean how spread out the probabilities of the successful trials are? That is, how far apart the probabilities of the different numbers of successes are from one another?

Now, here's the confusing part. I see a notation like this: $\overline{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, and it is called the mean of all the random variables. But it doesn't seem to make sense to me. $X$ is the random variable, and its value is the number of successful trials. So is the average of $X$ something like the average number of successful trials?
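Here is a rough simulation of how I picture $\overline{X}$ (the Binomial(10, 0.3) distribution and the sample size of 20 are arbitrary choices of mine):

```python
# Sketch: X_1, ..., X_20 are i.i.d. Binomial(10, 0.3) draws, and
# X-bar is their ordinary average. All the numbers are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
sample = rng.binomial(10, 0.3, size=20)   # X_1, ..., X_20
x_bar = sample.mean()                     # X-bar: the average number of successes
print(sample)
print(x_bar)                              # changes from one sample to the next
```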

Does it then mean $\overline { X } =E(X)$?

Then, there is also the expectation of the mean of all the random variables, $E(\overline{X})$. Does this represent the average of the average of all the random variables, which would mean $E(\overline{X}) = E(E(X))$? At this point I can't understand what it means intuitively. What does it mean here to take the average of the average of all the random variables?

Similarly, $Var(\overline { X } )$ is also a confusing term to me. Since $\overline { X } $ is just the average value, what spread does it have?

What is the intuitive meaning of $\overline{X}$, the mean of all the random variables, and what does it add to the meaning of $E(\overline{X})$ and $Var(\overline{X})$?

  • 0
    To rephrase the previous comments a bit, $E(X)$ is the average *with respect to the probability measure*, and $\text{Var}(X)$ is the average square of the deviation from $E(X)$ *with respect to the probability measure*. A probability measure is something that, in a sense, tells you how to take meaningful averages. (2012-02-20)

2 Answers

5

You have a vast population of people who have different heights, and you choose one at random. That person's height is $X$. $E(X)$ is the average height of everyone in the population. Then you pick $20$ people at random. Their heights are $X_1,\ldots,X_{20}$. Their average height is $\bar{X} = (X_1+\cdots+X_{20})/20$. That is a random variable, because if you pick another set of $20$ it has a different value; thus it varies randomly. The expected value $E(\bar{X})$ is the same as the expected value $E(X)$. But the variance $\operatorname{var}(\bar{X})$ is smaller than the variance $\operatorname{var}(X)$, because on average one sample of $20$ differs less from another sample of $20$ than one individual differs from another individual.
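If it helps, here is a rough simulation of this picture (the Normal(170, 10) height distribution is made up purely for illustration):

```python
# Sketch: compare the spread of individual heights with the spread of
# 20-person sample means. The Normal(170, 10) population is made up.
import numpy as np

rng = np.random.default_rng(42)

individuals = rng.normal(170, 10, size=10_000)                      # many draws of X
sample_means = rng.normal(170, 10, size=(10_000, 20)).mean(axis=1)  # many draws of X-bar

print(individuals.mean(), sample_means.mean())  # both are near E(X) = 170
print(individuals.var(), sample_means.var())    # var(X-bar) is about var(X)/20
```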

  • 1
    Just to be explicit: $\bar{X}$ is based on only $20$ values, but $E(\bar{X})$ is based on all possible samples of $20$ from the whole population. (2012-02-20)
3

You should think of $E(X)$ and $Var(X)$ (when they exist) as parameters which describe the distribution of the random variable $X$. They are numbers. They are not random variables. So, although you can write $E(E(X))$, it's not very useful since the expected value of a constant is just the constant. So you have $E(E(X)) = E(X)$.

Sometimes it is helpful to call $E(X)$ the "population average."

Now, you should think of $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ as simply a function of $n$ random variables, albeit a very useful one. You shouldn't equate it in your mind with the population average $E(X)$ (a number). Rather $\bar{X}_n$ is a random variable.

Given some outcome $\omega$, you observe the sample values $x_1 = X_1(\omega), \dots, x_n = X_n(\omega)$ and you can compute the so-called "sample average," which is a particular realization of $\bar{X}_n$, namely

$\bar{X}_n(\omega) = \frac{1}{n}\sum_{i=1}^n X_i(\omega) = \frac{1}{n}\sum_{i=1}^n x_i$

Now, this sample average is a number, but it could be quite different from the population average, the number $E(X)$. However, the Law of Large Numbers says that if $n$ is large enough, then a realization of $\bar{X}_n$ will, with high probability, be close to $E(X)$.
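If you want to see this numerically, here is a quick sketch (Bernoulli draws with $p = 0.3$; the parameters are arbitrary):

```python
# Sketch: running averages of Bernoulli(0.3) draws drift toward
# E(X) = 0.3 as n grows (Law of Large Numbers).
import numpy as np

rng = np.random.default_rng(1)
x = rng.binomial(1, 0.3, size=100_000)                  # X_1, X_2, ...
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)  # X-bar_1, X-bar_2, ...
for n in (10, 100, 1_000, 100_000):
    print(n, running_mean[n - 1])                       # approaches 0.3
```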

And, yes, you can take the expected value of the random variable $\bar{X}_n$, as you suggest, and you will simply have

$E(\bar{X}_n) = \frac{1}{n}\sum_{i=1}^n E(X_i) = E(X)$

(The last equality assumes the $X_i$ all come from a distribution with mean $E(X)$.)

As for the variance, no, $Var(X)$ does not measure the spread between successes in repeated trials. Rather, it is a measure of the spread of the possible values of the function $X$ about its mean $E(X)$. If the variance is larger (smaller), that means there is a larger (smaller) chance that $X$ will take on a value far from $E(X)$. In your example with a single trial, the possible values are $0$ and $1$, with probabilities $(1-p)$ and $p$, say. The variance would be much bigger if you chose, say, $0$ and $100$ as the possible values for $X$.
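To put rough numbers on that last remark (taking $p = 0.3$ as an arbitrary choice):

```python
# Sketch: same probabilities, different value sets. Variance measures spread
# about the mean, so stretching the values inflates it enormously.
p = 0.3
for values in ([0, 1], [0, 100]):
    mean = values[0] * (1 - p) + values[1] * p
    var = (values[0] - mean) ** 2 * (1 - p) + (values[1] - mean) ** 2 * p
    print(values, var)   # [0, 1] -> 0.21, [0, 100] -> 2100.0
```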

Finally, note that the variance of $\bar{X}_n$ gets smaller as $n$ gets larger. This is telling you that the larger $n$ is, the more likely the observed value of $\bar{X}_n$ will be close to $E(X)$.
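The standard computation behind this, assuming the $X_i$ are independent with common variance $Var(X)$, is

$Var(\bar{X}_n) = Var\left(\frac{1}{n}\sum_{i=1}^n X_i\right) = \frac{1}{n^2}\sum_{i=1}^n Var(X_i) = \frac{Var(X)}{n},$

which shrinks to $0$ as $n \to \infty$.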

  • 1
    @MichaelHardy I disagree. Even if we're speaking of a single distribution, if $E(X)$ exists, I see no problem with calling it a parameter of the distribution of $X$. I suppose the term could be defined very narrowly, reserved for only those parameters which uniquely identify a distribution within a parametrized family, but that is too narrow, imho. I prefer to take parameter to mean "feature of a distribution." (For example, I would say the moments of a chi-square are parameters of the distribution, even though they are not *the* parameter we use to identify the distribution.) (2012-03-24)