According to the Wikipedia article on conditional entropy, $\sum_{x,y} p(x,y)\log p(x)=\sum_x p(x)\log p(x)$. Can someone please explain how?
Probability and Entropy
3 Answers
$$
\begin{align}
\sum_{x,y} p(x,y)\log p(x) &= \sum_x\sum_y p(x,y)\log p(x) \\
&= \sum_x\left(\sum_y p(x,y)\right)\log p(x) \\
&= \sum_x p(x)\log p(x)
\end{align}
$$
Think of using $y$ as an index variable in the joint distribution – you sum them all out and are left with only the marginal on $x$. – 2012-08-26
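A quick numerical sketch of the identity (not part of the original answer; it assumes NumPy and natural logarithms, and uses an arbitrary random joint distribution):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random 3x4 joint distribution p(x, y), normalized so it sums to 1.
joint = rng.random((3, 4))
joint /= joint.sum()

# Marginal p(x): sum the joint over all values of y.
p_x = joint.sum(axis=1)

# Left-hand side: sum over (x, y) of p(x, y) * log p(x).
lhs = sum(joint[x, y] * np.log(p_x[x])
          for x in range(joint.shape[0])
          for y in range(joint.shape[1]))

# Right-hand side: sum over x of p(x) * log p(x).
rhs = sum(p_x[x] * np.log(p_x[x]) for x in range(joint.shape[0]))

print(lhs, rhs)  # the two values agree up to floating-point error
```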
I find the notation irritating: the same letter, $p$, is used to refer to two or more different functions.
I could write an algebraic proof, but I wonder if a concrete example might not shed more light. Suppose we have
$$
\begin{align}
P(X=0\ \&\ Y=0) = 1/10 & & P(X=0\ \&\ Y=1) = 3/10 \\
P(X=1\ \&\ Y=0) = 2/10 & & P(X=1\ \&\ Y=1) = 4/10
\end{align}
$$
Then $P(X=0)=4/10$ and $P(X=1)=6/10$. So the first sum above is
$$
\underbrace{(1/10)\log (4/10) + (3/10)\log(4/10)}_{X=0} + \underbrace{(2/10)\log(6/10) + (4/10)\log(6/10)}_{X=1}.
$$
The second sum above is
$$
(4/10)\log(4/10) + (6/10)\log(6/10).
$$
So it's really just the distributive law.
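The same arithmetic as a quick script (a sketch, not from the answer; it uses natural logarithms, though any base works):

```python
import math

# Joint distribution from the example above: p[(x, y)].
p = {(0, 0): 0.1, (0, 1): 0.3,
     (1, 0): 0.2, (1, 1): 0.4}

# Marginals: P(X=0) = 0.4 and P(X=1) = 0.6.
p_x = {0: p[(0, 0)] + p[(0, 1)], 1: p[(1, 0)] + p[(1, 1)]}

# First sum: sum over (x, y) of p(x, y) * log p(x).
first = sum(p[(x, y)] * math.log(p_x[x]) for (x, y) in p)

# Second sum: sum over x of p(x) * log p(x).
second = sum(p_x[x] * math.log(p_x[x]) for x in p_x)

print(first, second)  # both equal 0.4*log(0.4) + 0.6*log(0.6)
```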
In the article, $p(x,y)$ is summed (or integrated, in the continuous case) over all possible values of $y$ for a fixed value of $x$:
$\sum_{y\in\mathcal Y} p(x,y)= p(x)$
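For instance, with the joint distribution from the example in the other answer (a sketch assuming NumPy; rows indexed by $x$, columns by $y$):

```python
import numpy as np

# Joint distribution p(x, y) from the 2x2 example above.
joint = np.array([[0.1, 0.3],
                  [0.2, 0.4]])

# Summing over y for each fixed x recovers the marginal p(x).
print(joint.sum(axis=1))  # [0.4, 0.6]
```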