Conditional Entropy is less than entropy

Let $p_{ij}=P(X=i,Y=j)$ be the joint distribution, $p_i=P(X=i)=\sum_j p_{ij}$ and $q_j=P(Y=j)=\sum_i p_{ij}$ be the marginal distributions, and $p_{i|j}=\frac{p_{ij}}{q_j}$ be the conditional distribution. Then the conditional entropy is defined as $E[h(X|Y)]=-\sum_j \sum_i p_{i|j} \ln p_{i|j} q_j$. Show that $E[h(X|Y)]\leq h(X)$, where $h(X)$ is the entropy of $X$. I'm at a loss as to even where to begin.
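For what it's worth, a quick numerical sketch (an arbitrary $2\times 2$ joint table chosen purely for illustration, assuming NumPy) suggests the inequality does hold, but I don't see how to prove it in general:

```python
import numpy as np

# Arbitrary 2x2 joint distribution p_{ij} = P(X=i, Y=j); rows index i, columns index j
p = np.array([[0.30, 0.10],
              [0.15, 0.45]])

p_i = p.sum(axis=1)   # marginal of X
q_j = p.sum(axis=0)   # marginal of Y
p_cond = p / q_j      # p_{i|j} = p_{ij} / q_j (each column divided by q_j)

h_X = -np.sum(p_i * np.log(p_i))                                    # h(X)
h_X_given_Y = -np.sum(q_j * (p_cond * np.log(p_cond)).sum(axis=0))  # E[h(X|Y)]

print(h_X_given_Y, "<=", h_X)   # roughly 0.547 <= 0.673
```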
probability
It is called "information can’t hurt". – 2017-08-30
1 Answer
First, you give an incorrect definition for $\mathbb{E}(h(X|Y))$. It should instead read as follows: $ \mathbb{E}(h(X|Y)) = - \sum_j q_j \sum_i p_{i|j} \ln p_{i|j} $
Define the mutual information $I(X, Y) = h(X) - \mathbb{E}(h(X|Y))$. Using $p_{i|j} q_j = p_{ij}$ and $p_i = \sum_j p_{ij}$, $ I(X,Y) = - \sum_i p_i \ln p_i + \sum_{j,i} p_{i|j} q_j \ln p_{i|j} = - \sum_{i,j} p_{ij} \ln p_i + \sum_{j,i} p_{ij} \ln p_{i|j}, $ hence $ I(X,Y) = \sum_{i,j} p_{ij} \ln \frac{p_{ij}}{p_i q_j}. $ The function $\varphi(x) = x \ln x$ is convex for $x>0$. Thus, using Jensen's inequality:
$ I(X,Y) = \sum_{i,j} p_i q_j \cdot \varphi\left( \frac{p_{ij}}{p_i q_j} \right) \ge \varphi\left( \sum_{i,j} p_i q_j \cdot \frac{p_{ij}}{p_i q_j} \right) = \varphi(1) = 0, $ which is exactly $\mathbb{E}(h(X|Y)) \le h(X)$.
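As a sanity check, here is a minimal numerical sketch of this argument (assuming NumPy, with a randomly generated joint distribution whose alphabet sizes are chosen arbitrarily). It verifies that the two expressions for $I(X,Y)$ agree and are nonnegative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random joint distribution p_{ij} on a 3x4 alphabet
p = rng.random((3, 4))
p /= p.sum()

p_i = p.sum(axis=1)   # marginal of X
q_j = p.sum(axis=0)   # marginal of Y

# I(X,Y) = sum_{i,j} p_{ij} ln( p_{ij} / (p_i q_j) )
mutual_info = np.sum(p * np.log(p / np.outer(p_i, q_j)))

# h(X) - E[h(X|Y)] computed directly from the definitions
h_X = -np.sum(p_i * np.log(p_i))
p_cond = p / q_j                                                    # p_{i|j}
h_X_given_Y = -np.sum(q_j * (p_cond * np.log(p_cond)).sum(axis=0))

print(mutual_info, h_X - h_X_given_Y)   # equal up to floating-point error, and >= 0
```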
It is noteworthy that in order to invoke Jensen's inequality, the coefficients must sum up to 1. This is true in this case since $\sum_{i,j} p_i q_j = \sum_i p_i \sum_j q_j = 1$. – 2013-11-21