
I am looking for some insight into a problem:

Consider a group of $T$ persons, and let $a_1, a_2, \ldots, a_T$ denote the heights of these $T$ persons. Suppose that $n$ persons are selected from this group at random without replacement, and let $X$ denote the sum of the heights of these $n$ persons. I'm supposed to find the mean and the variance of $X$.

This is in the section on hypergeometric distributions in my probability textbook, so I am trying to see how to turn this into an "urn" problem. Normally, I think one would split $X = X_1 + \cdots + X_T$, where $X_i = 1$ if person $i$ is chosen and $0$ otherwise, so that in this problem $\Pr(X_i = 1) = \frac{n}{T}$. But I am confused because each $X_i$ is "weighted" by the height. I was having trouble finding other useful examples online.

EDIT: Here is how the book explains hypergeometric distributions:

Assume that $n$ balls are selected at random without replacement from a box containing $A$ red balls and $B$ blue balls, and let $X$ be the number of red balls selected. The expected number of red balls is $E(X) = \frac{nA}{A+B}$ and the variance is $Var(X) = \frac{nAB}{(A+B)^2}\cdot\frac{A+B-n}{A+B-1}$.

I am trying to understand how to use these formulas for this particular problem. I am pretty sure I can calculate each straight from the definitions.
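Computing "straight from the definitions" can be sketched as a brute-force enumeration over every equally likely subset of size $n$. The snippet below is only an illustrative check, not part of the textbook: the function name `mean_and_variance` and the sample values of `A`, `B`, `n` are made up here, and heights are assumed to be integers so that exact fractions can be used. Encoding a red ball as height 1 and a blue ball as height 0 turns the urn problem into a special case of the height problem.

```python
from fractions import Fraction
from itertools import combinations


def mean_and_variance(heights, n):
    """Mean and variance of the sum of n heights drawn without replacement.

    Enumerates every subset of n positions (each equally likely), so it is
    only practical for small groups. Assumes integer heights.
    """
    sums = [sum(subset) for subset in combinations(heights, n)]
    count = len(sums)
    mean = Fraction(sum(sums), count)
    second_moment = Fraction(sum(s * s for s in sums), count)
    return mean, second_moment - mean ** 2


# Urn special case: A red balls (height 1) and B blue balls (height 0).
A, B, n = 3, 4, 2  # hypothetical sample values
urn = [1] * A + [0] * B
mean, var = mean_and_variance(urn, n)

# The book's hypergeometric formulas for the same urn.
book_mean = Fraction(n * A, A + B)
book_var = Fraction(n * A * B, (A + B) ** 2) * Fraction(A + B - n, A + B - 1)

print(mean == book_mean, var == book_var)  # prints: True True
```

Replacing `urn` with a list of actual heights gives the mean and variance for the weighted problem, which is handy for checking any closed form derived by hand.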

  • 0
    This is a problem from "Probability and Statistics" by DeGroot and Schervish, 5.3.8. 2011-03-23

2 Answers

1

It seems to me that the mean is very intuitive and just what you'd expect. So try starting with:
$X = \sum_{i=1}^T a_i X_i$, where $X_i$ is the indicator that person $i$ is selected.
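To fill in the step for the mean: taking $X_i$ to be the indicator that person $i$ is selected, each person is equally likely to be among the $n$ chosen, so $\Pr(X_i = 1) = \frac{n}{T}$. The symbol $\bar a$ below is shorthand introduced here for the average height of the whole group. Linearity of expectation then gives

$$E(X) = \sum_{i=1}^T a_i\,E(X_i) = \frac{n}{T}\sum_{i=1}^T a_i = n\bar a, \qquad \bar a = \frac{1}{T}\sum_{i=1}^T a_i.$$

As a sanity check, setting $a_i = 1$ for $A$ "red" persons and $a_i = 0$ for the other $B = T - A$ recovers the hypergeometric mean $\frac{nA}{A+B}$.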

  • 0
    Haha thanks... Yikes. I was hoping for a more elegant answer, but I guess the authors just wanted us to get our hands (very) dirty. 2011-03-23
2

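To compute the variance, you need to compute $E(X^2)$; the rest is simple. For simplicity I'll sketch this in the case when $n = 3$, writing $X_1, X_2, X_3$ for the heights of the three selected persons, so that $X = X_1 + X_2 + X_3$. So you have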

$ E(X^2) = E((X_1 + X_2 + X_3)^2) = E(X_1^2 + X_2^2 + X_3^2 + 2X_1 X_2 + 2X_1 X_3 + 2X_2 X_3)$.

By linearity of expectation this is

$ E(X_1^2) + E(X_2^2) + E(X_3^2) + 2E(X_1 X_2) + 2E(X_1 X_3) + 2E(X_2 X_3) $

and some of these terms are equal to each other by symmetry (the three $E(X_j^2)$ terms are equal, and so are the three $E(X_j X_k)$ terms); then you can write each of them in terms of the $a_i$. This works similarly (but is a bit harder to notate) for larger $n$.
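For general $n$, the same symmetry argument can be sketched as follows. Here $X_j$ is the height of the $j$-th person selected, and $\bar a$ and $\sigma^2$ are shorthand introduced for this sketch (they are not in the original problem) for the mean and variance of the heights over the whole group. By symmetry, every $X_j$ is equally likely to be any of the $a_i$, and every pair $(X_j, X_k)$ with $j \neq k$ is equally likely to be any ordered pair of distinct persons, so

$$E(X_1^2) = \frac{1}{T}\sum_{i=1}^T a_i^2, \qquad E(X_1 X_2) = \frac{1}{T(T-1)}\sum_{i \neq j} a_i a_j = \frac{1}{T(T-1)}\left[\Big(\sum_{i=1}^T a_i\Big)^2 - \sum_{i=1}^T a_i^2\right].$$

There are $n$ squared terms and $n(n-1)$ ordered cross terms, hence

$$E(X^2) = n\,E(X_1^2) + n(n-1)\,E(X_1 X_2).$$

Subtracting $E(X)^2 = (n\bar a)^2$ and simplifying should give

$$Var(X) = E(X^2) - (n\bar a)^2 = n\,\sigma^2\cdot\frac{T-n}{T-1}, \qquad \sigma^2 = \frac{1}{T}\sum_{i=1}^T (a_i - \bar a)^2.$$

With $a_i \in \{0, 1\}$, $A$ ones and $B = T - A$ zeros, $\sigma^2 = \frac{AB}{(A+B)^2}$, and this reduces to the hypergeometric variance $\frac{nAB}{(A+B)^2}\cdot\frac{A+B-n}{A+B-1}$.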