
According to the Wikipedia article on conditional entropy, $\sum_{x,y} p(x,y)\log p(x)=\sum_x p(x)\log p(x)$. Can someone please explain how?

3 Answers


$\sum_{x,y} p(x,y)\log p(x)=\sum_x\sum_yp(x,y)\log p(x)=\sum_x\left(\sum_yp(x,y)\right)\log p(x)=\sum_x p(x)\log p(x)$

  • Think of using $y$ as an index variable in the joint distribution: you sum them all out and are left with only the marginal on $x$.
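As a quick numerical sanity check of the chain of equalities above (an illustrative sketch, not part of the original answer), one can build an arbitrary joint table, sum out $y$ to get the marginal, and compare the two sums; the table shape and random values here are made up.

```python
import numpy as np

# Illustrative check: for an arbitrary joint distribution p(x, y),
# summing out y gives the marginal p(x), so the two sums coincide.
rng = np.random.default_rng(0)
p_xy = rng.random((3, 4))      # arbitrary non-negative 3x4 table (made up)
p_xy /= p_xy.sum()             # normalize so it is a joint distribution

p_x = p_xy.sum(axis=1)         # marginal: p(x) = sum_y p(x, y)

lhs = np.sum(p_xy * np.log(p_x)[:, None])  # sum_{x,y} p(x,y) log p(x)
rhs = np.sum(p_x * np.log(p_x))            # sum_x   p(x)    log p(x)

print(lhs, rhs)                # equal up to floating-point rounding
```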

I find the notation irritating: the same letter, $p$, is used to refer to two or more different functions.

I could write an algebraic proof, but I wonder if a concrete example might not shed more light. Suppose we have
\begin{align}
P(X=0\ \&\ Y=0) = 1/10 & & P(X=0\ \&\ Y=1) = 3/10 \\ \\
P(X=1\ \&\ Y=0) = 2/10 & & P(X=1\ \&\ Y=1) = 4/10
\end{align}
Then $P(X=0)=4/10$ and $P(X=1)=6/10$. So the first sum above is
$$
\underbrace{(1/10)\log (4/10) + (3/10)\log(4/10)} + \underbrace{(2/10)\log(6/10) + (4/10)\log(6/10)}.
$$
The second sum above is
$$
(4/10)\log(4/10) + (6/10)\log(6/10).
$$
So it's really just the distributive law.
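As a quick check of the arithmetic, here is a short script (added for illustration, not part of the answer) that recomputes both sums from the joint table above; the natural logarithm is an arbitrary choice, since the identity holds in any base.

```python
import math

# Joint distribution from the example above.
p_xy = {(0, 0): 1/10, (0, 1): 3/10,
        (1, 0): 2/10, (1, 1): 4/10}

# Marginal of X: P(X=0) = 4/10, P(X=1) = 6/10.
p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}

first_sum  = sum(p * math.log(p_x[x]) for (x, _), p in p_xy.items())
second_sum = sum(px * math.log(px) for px in p_x.values())

print(first_sum, second_sum)   # both print the same value (about -0.673)
```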


In the article, $p(x,y)$ is summed (or integrated) over all possible values of $y$ for a fixed value of $x$:

$\sum_{y\in\mathcal Y} p(x,y)= p(x)$