The definition of mutual information between two discrete random variables $X,Y$ is given by $$I(X,Y) = \sum_{y\in Y}\sum_{x\in X}p(x,y)\log\left(\frac{p(x,y)}{p(x)p(y)}\right),$$ where $p(x,y)$ is the joint probability distribution function of $X$ and $Y$, and $p(x),p(y)$ are the marginal probability distribution functions of $X$ and $Y$, respectively.
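For concreteness, here is a small sketch of that double sum evaluated numerically. The joint distribution `p_xy` below is made up for illustration; everything else follows the formula term by term, with the convention that terms where $p(x,y)=0$ contribute nothing.

```python
import numpy as np

# Hypothetical joint distribution p(x, y) for binary X, Y;
# rows index x, columns index y. Values chosen for illustration only.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

p_x = p_xy.sum(axis=1)  # marginal p(x)
p_y = p_xy.sum(axis=0)  # marginal p(y)

# I(X,Y) = sum_{x,y} p(x,y) * log( p(x,y) / (p(x) p(y)) ), in nats
I = 0.0
for i in range(p_xy.shape[0]):
    for j in range(p_xy.shape[1]):
        if p_xy[i, j] > 0:  # skip zero-probability terms
            I += p_xy[i, j] * np.log(p_xy[i, j] / (p_x[i] * p_y[j]))

print(I)  # about 0.1927 nats
```

Using $\log_2$ instead of the natural log would give the answer in bits; the choice of base only rescales $I(X,Y)$.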
So, my question is: how do I maximize $I(X,Y)$? It's not like I can just take the derivative of that double sum and set it equal to $0$... can I?