
I am looking for some insight into a problem:

Consider a group of $T$ persons, and let $a_1, a_2, \ldots, a_T$ denote the heights of these $T$ persons. Suppose that $n$ persons are selected from this group at random without replacement, and let $X$ denote the sum of the heights of these $n$ persons. I'm supposed to find the mean and the variance of $X$.

This is in the hypergeometric distribution section of my probability textbook, so I am trying to see how to turn this into an "urn" problem. Normally, I think one would write $X = X_1+\cdots+X_T$, where $X_i = 1$ if person $i$ is chosen and $0$ otherwise, so that $Pr(X_i = 1) = \frac{n}{T}$, but I am confused because each $X_i$ is "weighted" by the height. I had trouble finding other useful examples online.

EDIT: Here is how the book explains hypergeometric distributions:

Assume that $n$ balls are selected at random without replacement from a box containing $A$ red balls and $B$ blue balls. The expected number of red balls is $E(X) = \frac{nA}{A+B}$ and the variance is $Var(X) = \frac{nAB}{(A+B)^2}\cdot\frac{A+B-n}{A+B-1}$

I am trying to understand how to use these formulas for this particular problem. I am pretty sure I can calculate each straight from the definitions.

  • This is a problem from "Probability and Statistics" by DeGroot and Schervish, 5.3.8. 2011-03-23

2 Answers

1

It seems to me that the mean is very intuitive and just what you'd expect. So try starting with:
$$X = \sum_{i=1}^T a_i\;X_i$$
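
A short worked version of that step, assuming $X_i$ is the indicator that person $i$ is among the $n$ selected (as in the question) and writing $\bar a$ for the average height of the whole group (notation introduced here, not in the book's statement). Each person is equally likely to end up in the sample, so $Pr(X_i = 1) = \frac{n}{T}$, and linearity of expectation gives

$$E(X) = \sum_{i=1}^T a_i\,E(X_i) = \frac{n}{T}\sum_{i=1}^T a_i = n\,\bar a, \qquad \bar a = \frac{1}{T}\sum_{i=1}^T a_i.$$

Linearity does not need the $X_i$ to be independent, which is why sampling without replacement causes no trouble for the mean.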

  • It seems to me that will give us $\frac{a_1 + \cdots + a_T}{T}$, the average height of everyone, so we would just multiply by $n$ to get $E(X)$. Is this correct? 2011-03-23
  • @Tyler: I didn't do the math, but it's hard for me to imagine any other result. Now, do you have a formula for the variance? 2011-03-23
  • Well, I have $Var(X) = E((X-\mu)^2) = E(X^2) - E(X)^2$, but I am trying to use one specifically for hypergeometric distributions, and I'm not sure how to apply it. The text explains this distribution as: given $A$ red balls and $B$ blue balls, select $n$ at random without replacement. I feel like my problem is "Select $n$ balls at random without replacement, where each ball has a number, then add up the numbers." I am having trouble figuring out how to go between the two concepts. 2011-03-23
  • @Tyler: so it boils down to calculating $E(X^2) = E\left(\sum_i\sum_j a_i a_j X_i X_j\right)$. Hmmm. Seems like the math might be similar to how your textbook did the calculation for the hypergeometric variance. 2011-03-23
  • Haha thanks... Yikes. I was hoping for a more elegant answer, but I guess the authors just wanted us to get our hands (very) dirty. 2011-03-23
2

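To compute the variance, you need to compute $E(X^2)$; the rest is simple. For simplicity I'll sketch this in the case when $n = 3$, writing $X_1, X_2, X_3$ for the heights of the three people selected, so that $X = X_1 + X_2 + X_3$. So you have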

$$ E(X^2) = E((X_1 + X_2 + X_3)^2) = E(X_1^2 + X_2^2 + X_3^2 + 2X_1 X_2 + 2X_1 X_3 + 2X_2 X_3). $$

By linearity of expectation this is

$$ E(X_1^2) + E(X_2^2) + E(X_3^2) + 2E(X_1 X_2) + 2E(X_1 X_3) + 2E(X_2 X_3) $$

and some of these terms are equal to each other by symmetry; then you can write each of them in terms of the $a_i$. This works similarly (but is a bit harder to notate) for larger $n$.
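
For general $n$, the same symmetry argument can be carried through as follows. This is a sketch rather than the book's own derivation; $X_k$ here is the height of the $k$th person drawn, and $\bar a$ and $\sigma^2$ (the mean and variance of the heights over the whole group) are notation introduced here. By symmetry, $E(X_k^2) = \frac{1}{T}\sum_i a_i^2$ for every $k$ and $E(X_k X_l) = \frac{1}{T(T-1)}\sum_{i \ne j} a_i a_j$ for every $k \ne l$, so

$$E(X^2) = n\,E(X_1^2) + n(n-1)\,E(X_1 X_2), \qquad Var(X) = E(X^2) - (n\bar a)^2 = n\,\sigma^2\,\frac{T-n}{T-1}, \qquad \sigma^2 = \frac{1}{T}\sum_{i=1}^T (a_i - \bar a)^2.$$

Taking $a_i = 1$ for $A$ of the persons and $a_i = 0$ for the other $B = T - A$ gives $\sigma^2 = \frac{AB}{(A+B)^2}$, which recovers the book's hypergeometric variance. A small brute-force check in Python, with made-up heights and hypothetical function names, compares the closed form against enumeration of every equally likely subset:

```python
from fractions import Fraction
from itertools import combinations


def exact_mean_var(heights, n):
    """Enumerate every size-n subset (all equally likely) and
    return the exact mean and variance of the subset sum."""
    sums = [sum(c) for c in combinations(heights, n)]
    count = len(sums)
    mean = Fraction(sum(sums), count)
    second_moment = Fraction(sum(s * s for s in sums), count)
    return mean, second_moment - mean ** 2


def formula_mean_var(heights, n):
    """Closed form: E(X) = n * a_bar, Var(X) = n * sigma^2 * (T - n) / (T - 1).
    Requires at least two heights (T > 1)."""
    T = len(heights)
    a_bar = Fraction(sum(heights), T)
    sigma2 = sum((a - a_bar) ** 2 for a in heights) / T
    return n * a_bar, n * sigma2 * Fraction(T - n, T - 1)


# Made-up integer heights in cm, purely for illustration.
heights = [150, 160, 165, 172, 180, 191]
n = 3
print(exact_mean_var(heights, n))
print(formula_mean_var(heights, n))
```

Exact rational arithmetic via `Fraction` is used so the two results can be compared for equality instead of up to rounding error. The enumeration is only practical for small $T$.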