
We have a random generator that produces independent random bits with $P(x=1) = P$ and $P(x=0)=1-P$.

Given $N$ such independent random bits, we estimate $P$ by $\hat{P} = N_1/(N_0+N_1)$, where $N_0$ is the number of $0$'s and $N_1$ is the number of $1$'s. The expected value of $\hat{P}$ is easily shown to be $P$.

a. What is the expected value of the estimated entropy, defined as follows: $\hat{H}=-[ \hat{P} \log(\hat{P}) + (1-\hat{P}) \log(1-\hat{P}) ]$?

b. If we take $M$ independent sets of $N$ random bits as above and each time estimate the entropy using the equation above, what is the expected value of the smallest estimated entropy among those $M$ sets?

Thanks, MG

P.S. If the solution to the integral for general $P$ is too complicated, a solution to the special case $P=1/2$ would be appreciated as well.
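For concreteness, here is a minimal Monte Carlo sketch of what parts (a) and (b) ask for (Python; the values of `P`, `N`, `M`, and `trials` are arbitrary choices, and natural logs are used since the base only rescales $H$):

```python
import numpy as np

rng = np.random.default_rng(0)
P, N, M, trials = 0.5, 100, 10, 20000   # hypothetical parameters

def entropy_hat(p):
    """Plug-in entropy estimate (natural logs), with 0*log(0) taken as 0."""
    h = np.zeros_like(p)
    for q in (p, 1.0 - p):
        m = q > 0
        h[m] -= q[m] * np.log(q[m])
    return h

# (a) mean of H_hat over many independent N-bit samples
p_hat = rng.binomial(N, P, size=trials) / N
print("mean H_hat:", entropy_hat(p_hat).mean())          # slightly below log(2)

# (b) mean of the smallest H_hat among M independent sets of N bits
p_hat_M = rng.binomial(N, P, size=(trials, M)) / N
min_H = entropy_hat(p_hat_M.ravel()).reshape(trials, M).min(axis=1)
print("mean min H_hat:", min_H.mean())
```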

  • In fact, I am suggesting the above for a better estimation of $H$ given $N_0$, $N_1$; the integral is with respect to $p$. (2011-03-07)

1 Answer


As noted in the comments, there is no simple closed-form expression for $\langle \hat H_N\rangle$, but for large $N$ one can approximate it via a $\chi^2$-type limit.

To wit, write $N_1=pN+v\sqrt{N}\,Z_N$ with $v^2=p(1-p)$, where $\langle Z_N\rangle=0$ and $\langle Z_N^2\rangle=1$ for every $N$, and where $Z_N$ converges in distribution to a standard normal random variable as $N\to+\infty$.

Hence $\hat P_N=p+U_N$ and $1-\hat P_N=q-U_N$ with $q=1-p$ and $U_N=vZ_N/\sqrt{N}$, so
$$\hat H_N=-(p+U_N)\log(p+U_N)-(q-U_N)\log(q-U_N).$$
Using the expansions
$$\log(p+U_N)=\log(p)+U_N/p-U_N^2/(2p^2)+o(U_N^2)$$
and
$$\log(q-U_N)=\log(q)-U_N/q-U_N^2/(2q^2)+o(U_N^2),$$
one gets
$$\hat H_N=-p\log(p)-q\log(q)+U_N\log(q/p)-\frac{U_N^2}{2pq}+o(U_N^2).$$
Since $\langle U_N\rangle=0$, $\langle U_N^2\rangle=v^2/N$ and $v^2=pq$, one gets
$$\langle \hat H_N\rangle=-p\log(p)-q\log(q)-\frac{1}{2N}+o(1/N).$$
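As a sanity check (not part of the derivation above), $\langle\hat H_N\rangle$ can also be computed exactly by summing over the Binomial$(N,p)$ distribution of $N_1$ and compared with the asymptotic $-p\log p-q\log q-1/(2N)$. A minimal sketch in Python, with $p$ and $N$ chosen arbitrarily:

```python
import numpy as np
from scipy.stats import binom

def exact_mean_H(p, N):
    """Exact <H_hat>: average h(k/N) over k ~ Binomial(N, p), natural logs."""
    k = np.arange(N + 1)
    ph = k / N
    # 0*log(0) = 0 by convention at the endpoints ph = 0 and ph = 1
    h = np.where(ph > 0, -ph * np.log(np.where(ph > 0, ph, 1.0)), 0.0) \
      + np.where(ph < 1, -(1 - ph) * np.log(np.where(ph < 1, 1 - ph, 1.0)), 0.0)
    return float(np.sum(binom.pmf(k, N, p) * h))

p, N = 0.3, 200                                  # arbitrary test values
H = -(p * np.log(p) + (1 - p) * np.log(1 - p))
print("exact     :", exact_mean_H(p, N))
print("asymptotic:", H - 1 / (2 * N))            # H(p) - 1/(2N)
```

The two printed values should agree to within the $o(1/N)$ error of the expansion.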