2
$\begingroup$

The goal is to maximize the following function: \begin{align} K_p(q) = q\log \frac{q}{p} + (1-q)\log \frac{1-q}{1-p} \end{align} where \begin{align} 0 \leq q \leq 1 \end{align} and $p \in (0,0.5)$ and is some constant. In other words, I'm maximizing the KL divergence in terms of one Bernoulli variable with success parameter $q$ with regard to another Bernoulli with success parameter $p$. The way I've set it up, I can see without formal calculations that it's maximized when $q=1$.

However, I'd like to use the KKT conditions to solve this problem formally and I'm running into problems doing that.

First, I set up the Lagrangian to minimize with the KKT conditions: \begin{align} \mathcal{L}(q,\mu,\lambda) & = -K_p(q) - \mu(1-q) - \lambda q \newline \textrm{where} \quad & \mu, \lambda \geq 0 \newline & 0 \leq q \leq 1 \newline & \mu(1-q) = 0 \newline & \lambda q = 0 \end{align} I then set the derivative in terms of $q$ to zero and plug it back in and get the following dual form: \begin{align} D(\mu,\lambda) & = -\mu + \log(1-p+pe^{\mu-\lambda}) \newline & = \log(e^{-\mu}-pe^{-\mu}+pe^{-\lambda}) \end{align} which I'm supposed to maximize.

But again, I run into the problem where the maximum of the dual is attained when $\mu,\lambda=0$ (and I got these values again by looking at the equation because setting derivatives to zero was infeasible). Which, according to the KKT conditions, means that q is neither zero nor one. But that can't be right, because the maximum of $K$ is attained when $q=1$.

And my obvious question is: what am I doing wrong? And how is this done correctly?

2 Answers 2

1

Is there any specific reason, you need to use KKT conditions to solve this problem? I have written a formal way of solving it below.

$f(q) = q \log \left( \frac{q}{p} \right) + \left( 1-q \right) \log \left( \frac{1-q}{1-p} \right)$ $\frac{\partial f}{\partial q} = \log \left( \frac{q}{p} \right) + q \times \frac{p}{q} \times \frac{1}{p} - \log \left( \frac{1-q}{1-p} \right) + \left( 1-q \right) \times \frac{1-p}{1-q} \times \frac{-1}{1-p} = \log \left( \frac{q}{p} \frac{1-p}{1-q} \right)$

Now $q = p$ gives the minimum of $0$ since $\frac{\partial^2 f}{\partial q^2} = \frac{1}{q} + \frac{1}{1-q} = \frac{1}{q (1-q)} > 0$

Note that when $q>p$, f'(q) > 0 and when $q, f'(q) <0. This means that the function starts at $q=0$ decreases to a minimum value of $0$ at $q=p$ and then increases and ends at $q=1$. So the maximum can occur either at $q=1$ or $q=0$.

$q=0$ gives us $f(0) = - \log(1-p)$ and $q=1$ gives us $f(1) = - \log(p)$.

So if $p \in (0,0.5]$, then the maximum occurs at $q=1$ with the maximum being $- \log(p)$ and if $p \in [0.5,1)$, then the maximum occurs at $q=0$ with the maximum being $- \log(1-p)$

  • 0
    @Sivaram Ambikasaran Yes, but in this case $g = 1-q \geq 0$ so I need to put a negative in front of $(1-q)$ to flip the inequality.2011-02-27
0

Setting the derivative of $\mathcal{L}$ to zero does not guarantee that you obtain the minimum of it. I believe $\mathcal{L}$ is a concave function of $q$ at least for some $p,\lambda,\mu$.

And also, the function $K_p$ to maximize is a convex function; therefore, solving the dual problem may not give you an optimal solution to the primal problem.

  • 0
    @JasonMond To find KKT points with inequalities, you can consider all possible cases, that is, for your problem (i) q=1, lambda=0, (ii) q=0, mu=0. For each case, you can determine other variables and see if they all satisfy all other KKT conditions. If all conditions are met, then you obtained a KKT point.2011-03-01