Consider a set $\Omega$ with $N$ distinct members, and a function $f$ defined on $\Omega$ that takes the values 0 and 1 such that $ \frac{1}{N} \sum_{x \in \Omega } f(x)=p$. For a subset $S \subseteq \Omega$ of size $n$, define the sample proportion $p(S)= \frac{1}{n} \sum_{x\in S} f(x)$. If each subset of size $n$ is chosen with equal probability, calculate the expectation and standard deviation of the random variable $p(S)$.
It's fairly easy to prove that the expectation is $p$, but for the standard deviation I have no idea. I think there is an easy way to compute this but I can't find it. - 2012-06-12
2 Answers
It helps to introduce indicator random variables here. For each $x\in\Omega$, let $Z_x$ be the indicator random variable that takes the value 1 if $x\in S$, and value 0 otherwise.
We can express $p(S)={1\over n}\sum_{x\in\Omega} Z_x\cdot f(x),$ where the sum is no longer over the random set $S$. Since all points are equally likely to be elements of $S$, it is not hard to calculate $\mathbb{E}(Z_x)={n\over N},\quad \text{Var}(Z_x)={n\over N}\left({1-{n\over N}}\right), \quad \text{cov}(Z_x,Z_y)={-n\over N^2} {N-n \over N-1}\text{ for }x\neq y.$
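For readers who want the step behind "it is not hard to calculate", here is one way the three quantities can be derived by counting subsets (a sketch, not necessarily the route the answer had in mind). A fixed point $x$ lies in $\binom{N-1}{n-1}$ of the $\binom{N}{n}$ equally likely subsets, so $\mathbb{E}(Z_x)=\binom{N-1}{n-1}\big/\binom{N}{n}={n\over N}$. Since $Z_x^2=Z_x$, we get $\text{Var}(Z_x)={n\over N}-\left({n\over N}\right)^2={n\over N}\left(1-{n\over N}\right)$. For $x\neq y$, both points lie in $\binom{N-2}{n-2}$ subsets, so $\mathbb{E}(Z_xZ_y)={n(n-1)\over N(N-1)}$ and $\text{cov}(Z_x,Z_y)={n(n-1)\over N(N-1)}-{n^2\over N^2}={n\,(n-N)\over N^2(N-1)}={-n\over N^2}{N-n\over N-1}.$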
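Using linearity of expectation, and bilinearity of covariance, after some calculation we get $\mathbb{E}(p(S))={1\over N}\sum_{x\in\Omega} f(x),$ and $\text{Var}(p(S))={1\over n} {N-n \over N-1} \left[{1\over N}\sum_{x\in\Omega} f(x)^2- \left( {1\over N}\sum_{x\in\Omega} f(x)\right)^2\right].$
In the question $f$ takes only the values 0 and 1, so $f(x)^2=f(x)$ and the bracket equals $p-p^2=p(1-p)$. Hence $\mathbb{E}(p(S))=p$ and the standard deviation is $\sqrt{{p(1-p)\over n}\,{N-n\over N-1}}.$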
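As a sanity check on the variance formula above, the following sketch enumerates every size-$n$ subset of a small population and compares the exact mean and variance of the sample proportion with the closed form. The population `values`, the sample size `n`, and the function names `exact_moments` and `formula_moments` are illustrative assumptions, not part of the original answer.

```python
from fractions import Fraction
from itertools import combinations


def exact_moments(values, n):
    """Exact mean and variance of the sample mean over all size-n subsets."""
    N = len(values)
    # Each size-n subset of indices is equally likely, so average over all of them.
    props = [Fraction(sum(values[i] for i in S), n)
             for S in combinations(range(N), n)]
    mean = sum(props, Fraction(0)) / len(props)
    var = sum((q - mean) ** 2 for q in props) / len(props)
    return mean, var


def formula_moments(values, n):
    """Mean and variance from the closed form (1/n) * (N-n)/(N-1) * [mean(f^2) - mean(f)^2]."""
    N = len(values)
    first = Fraction(sum(values), N)
    second = Fraction(sum(v * v for v in values), N)
    var = Fraction(1, n) * Fraction(N - n, N - 1) * (second - first ** 2)
    return first, var


# Hypothetical 0/1 population with N = 8 and p = 3/8, sampled with n = 3.
values = [1, 1, 1, 0, 0, 0, 0, 0]
n = 3
print(exact_moments(values, n))    # brute-force enumeration
print(formula_moments(values, n))  # closed form; should agree exactly
```

Exact rational arithmetic via `Fraction` is used so the two results can be compared for equality rather than up to floating-point error; taking the square root of the variance gives the standard deviation.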
@Tom No problem! It is not that obvious. - 2014-03-09
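I think the answer is (writing $\bar{p}$ for the sample proportion):
a) $E[\bar{p}] = p$,
b) $\text{SD}[\bar{p}] = \frac{\sqrt{p(1-p)}}{\sqrt{n}}$. Note that this is the standard deviation rather than the variance, and it applies to sampling with replacement; when a subset is drawn without replacement, as in the question, it is multiplied by $\sqrt{\frac{N-n}{N-1}}$.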
I believe the answer can be found on page 10 of http://math.arizona.edu/~faris/stat.pdf
~JD