Problem statement:
The entropy of a probability vector $ p = (p_1, \dots, p_n)^T $ (so $ \sum\limits_{i=1}^{n} p_i = 1 $ and $ p_i \geq 0 $) is defined as $ H(p)= - \sum\limits_{i=1}^{n} p_i \log{p_i} $, where $ 0\log{0} = 0 $ by convention.
What are the largest and smallest possible entropies of a probability vector?
Discussion
I can maximise the expression using Lagrange multipliers without a problem. It is equally easy to prove that the smallest entropy of any probability vector is 0: each term $-p_i \log p_i$ is non-negative, so $H(p) \geq 0$, and a vector such as $p = (1, 0, \dots, 0)^T$ attains $H(p) = 0$.
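As a quick numerical sanity check of this "easy" argument (a sketch of my own; the helper `entropy` is not from any particular library):

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a probability vector, with the convention 0*log(0) = 0."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]                 # drop zero entries, implementing 0*log 0 = 0
    return -np.sum(nz * np.log(nz))

n = 4
uniform = np.full(n, 1 / n)            # candidate maximiser
degenerate = np.array([1.0, 0, 0, 0])  # candidate minimiser

print(entropy(uniform))     # log 4 ≈ 1.3863
print(entropy(degenerate))  # 0.0
```

The uniform vector attains $\log n$ and the degenerate vector attains $0$, matching the claims above.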
However, I would very much like to prove that the minimum value is 0 using Lagrange multipliers (the fact that I can't seems to me to suggest that there is something that I haven't understood).
I set up the Lagrangian $ L(p, \lambda) = - \sum p_i\log{p_i} + \lambda(\sum p_i - 1) $, and differentiating to find stationary points gives the system $ {\frac{\partial{L}}{\partial{p_i}}}= -\log{p_i} - 1 + \lambda = 0 $. The obvious solution gives me my maximum point, but I am expecting another solution for the minimum.
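For concreteness, the stationary-point computation I have in mind (my own working, so worth checking) is:

```latex
\[
-\log p_i - 1 + \lambda = 0
\;\Longrightarrow\;
p_i = e^{\lambda - 1} \quad \text{for every } i,
\]
so all components are equal, and the constraint $\sum_{i=1}^{n} p_i = 1$
forces $p_i = 1/n$. This single stationary point gives
\[
H\!\left(\tfrac{1}{n}, \dots, \tfrac{1}{n}\right) = \log n,
\]
the maximum -- and no other stationary point appears.
```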
- Am I misusing Lagrange multipliers?
- Is there another solution to my system that I am missing (maybe involving the "convention" about $ 0\log{0} = 0$)?
- Do I need to worry about complementary slackness and the inequality constraints $p_i \geq 0$ to find the minimum?
- Is finding this minimum beyond the scope of this method for some reason, and my lecturers are in fact expecting me to resort to the "easy" method described above?
Any hints or answers are welcome, thank you in advance.