
Some days ago I came across this question. You may notice that I tried to answer it while really not being sure of what I was saying (feel free to "-1" my post!). Not being satisfied with my answer, I played around with this problem and came to an odd conclusion. I'm therefore asking you to check what's wrong in my proof. I repeat the setting here:

Let $N \in \mathbb{N},\ N \geq 1$, and let $x_1,\dots,x_n$ with $x_i \geq 1$ and $\sum_i x_i = N$. Note also that $n$ may vary, with $1 \leq n \leq N$.

(In the original post the OP possibly specified $x_i \in \mathbb{N}$, but I relax this constraint here.)

We are interested in the following quantity, for which I managed to extract a lower and an upper bound: $$\sum_{i=1}^n \log \frac{N}{x_i}$$

  • Lower bound: $$ \sum_{i=1}^n \log \frac{N}{x_i} = n\log N - \sum_{i=1}^n \log x_i \overset{(*)}{\geq} n\log N -n\log\left(\frac1n \sum_{i=1}^n x_i \right) = \\ n \log N - n\log\frac{N}n = n\log n $$

  • Upper bound: $$ \sum_{i=1}^n \log \frac{N}{x_i} \overset{(*)}{\leq} n \log\left(\frac1n \sum_{i=1}^n \frac{N}{x_i} \right) = n\log N -n\log n + n\log\left(\sum_{i=1}^n \frac{1}{x_i} \right) \overset{(**)}{\leq} \\ n\log N -n\log n + n\log\left(\sum_{i=1}^n \frac{n}{N} \right) = n\log N -n\log n + n\log\left(\frac{n^2}{N}\right) = \\ n\log N -n\log n + 2n\log n - n\log N = n \log n $$

$(*)$ Jensen's inequality, precisely $n \log\left(\frac1n \sum_i x_i \right) \geq \sum_i \log x_i$

$(**)$ Setting $x_i = N/n$, which maximizes the sum.

This would therefore prove that $\sum_{i=1}^n \log \frac{N}{x_i} = n \log n$, which sounds totally wrong to me; however, I'm not sure where the proof fails. Could you help me? :)


EDIT: After seeing the comments, it is now clear that $(**)$ is wrong. The choice that maximizes the sum, I think, is $x_1 = N-n+1,\ x_2 = \dots = x_n = 1$. Remember that the $x_i$ must be $\geq 1$ in the statement.
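This can be checked numerically; here is a minimal sketch (with hypothetical values $N = 12$, $n = 3$) comparing the balanced choice $x_i = N/n$ with the extreme choice above:

```python
import math

# Hypothetical values: N = 12, n = 3.
N, n = 12, 3

def s(xs):
    """Compute sum_i log(N / x_i) for a feasible point."""
    assert all(x >= 1 for x in xs) and abs(sum(xs) - N) < 1e-9
    return sum(math.log(N / x) for x in xs)

balanced = [N / n] * n                 # x_i = N/n: gives exactly n*log(n)
extreme = [N - n + 1] + [1] * (n - 1)  # x = (N-n+1, 1, ..., 1)

print(s(balanced))  # 3*log(3) ≈ 3.296
print(s(extreme))   # ≈ 5.152, strictly larger
```

Since two feasible points give different values, the quantity cannot equal $n\log n$ identically; the balanced choice attains the lower bound, not the upper one.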

  • 1
    Setting $x_i = N/n$ actually minimizes the sum. If, for example, $x_0 \rightarrow 0$, the sum is unbounded. (2017-02-23)
  • 0
    Damn, you're right, this setting minimizes the sum... However, we cannot say that the sum goes to infinity, since I enforced in the definition that $x_i \geq 1$. (2017-02-24)

2 Answers

2

Let's first examine this problem: Given $A$ with $0 < A \le 1/n$, find the extrema of $$L = -\sum_{i=1}^n \log x_i$$ when $A \le x_i$ for all $i$ and $\sum_i x_i = 1$.

Since $L$ and the constraints are symmetric in their treatment of the $x_i$, any permutation of their values has no effect on $L$. In particular, we can restrict our attention to the region where $x_n \ge x_i$ for $i < n$. Any extremum that occurs elsewhere must also occur in this region.

In the special case where $A = 1/n$, there is only one point within the region of interest: $(A, A, \dots, A)$, which is therefore both the maximum and the minimum. So assume that $A < 1/n$; then the largest coordinate satisfies $x_n > A$.

Now $x_n = 1 - \sum_{i<n} x_i$, so on this region we can view $L$ as a function of the independent variables $x_1, \dots, x_{n-1}$, with $$\frac{\partial L}{\partial x_i} = \frac{1}{x_n} - \frac{1}{x_i}.$$

If $x_i < x_n$, then $x_i$ can be increased; but because the partial derivative is negative in that case, this causes $L$ to decrease. Conversely, if $x_n > x_i > A$, then decreasing $x_i$ will cause $L$ to increase (by the second-derivative test, this is also true when $x_n = x_i$).

Hence, the maximum of $L$ occurs when all of the $x_i$ with $i < n$ are at their lower bound $A$, i.e. at the point $(A, \dots, A, 1-(n-1)A)$, and the minimum occurs at the center point where $x_i = 1/n$ for all $i$. This gives $$\max L = -(n-1)\log A - \log\bigl(1-(n-1)A\bigr), \qquad \min L = n\log n.$$
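A randomized sanity check of these extrema (a Python sketch with hypothetical values $n = 4$, $A = 0.1$), confirming that no sampled feasible point beats either claimed extremum:

```python
import math
import random

# Hypothetical values: n = 4 coordinates, lower bound A = 0.1 (so A < 1/n).
n, A = 4, 0.1

def L(xs):
    return -sum(math.log(x) for x in xs)

def random_feasible():
    """Random point with x_i >= A and sum(x_i) = 1."""
    slack = 1 - n * A
    cuts = sorted(random.uniform(0, slack) for _ in range(n - 1))
    parts = [b - a for a, b in zip([0] + cuts, cuts + [slack])]
    return [A + p for p in parts]

corner = [A] * (n - 1) + [1 - (n - 1) * A]  # claimed maximizer
center = [1 / n] * n                        # claimed minimizer

random.seed(0)
values = [L(random_feasible()) for _ in range(10_000)]
print(L(center) <= min(values))  # no sample below the claimed minimum
print(max(values) <= L(corner))  # no sample above the claimed maximum
```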

Now your problem is to find the extrema of $$K = -\sum_i \log\left(\frac{x_i}N\right)$$ given $$1\le x_i, \quad \sum_i x_i = N$$ Make the substitution $y_i = \frac{x_i}N$, and we get $$K = -\sum_i \log y_i$$ given $$1/N \le y_i,\quad \sum_i y_i = 1$$ By the above work (with $A = 1/N$), the extrema are $$\max K = n\log N - \log (N+1-n)$$ $$\min K = n\log n$$ If we allow $n$ to vary from $1$ to $N$, then $$\max K = N\log N$$$$\min K = 0$$
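The closed forms can be sanity-checked directly; a small Python sketch with hypothetical values $N = 12$, $n = 5$:

```python
import math

# Hypothetical values: N = 12, n = 5.
N, n = 12, 5

def K(xs):
    return -sum(math.log(x / N) for x in xs)

max_point = [N - n + 1] + [1] * (n - 1)  # all but one x_i at the bound 1
min_point = [N / n] * n                  # all coordinates equal

print(abs(K(max_point) - (n * math.log(N) - math.log(N + 1 - n))) < 1e-9)  # True
print(abs(K(min_point) - n * math.log(n)) < 1e-9)                          # True
```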

  • 0
    Thank you for this detailed answer, sir! (2017-02-24)
  • 1
    I was just looking at how to rewrite it to make it clearer. The boundary of the region is a simplex, and the extrema fall on the center line where all coordinates are equal. It should be possible to express that more nicely. (2017-02-24)
  • 1
    @Zubzub - I've reworked the answer to make it cleaner. I also noticed that I had originally done $\sum_i x_i \le N$, while your problem was for $= N$. So I've corrected that, which simplified things considerably. (2017-02-24)
  • 0
    That matches what I found later on my side. Thank you for your time! (2017-02-25)
0

Let $y_i = x_i/N$ and let $u$ be the uniform distribution $u_i = 1/n$ over the $n$ indices; then $$\sum_{i=1}^n \log(N/x_i) = n\log n+n\sum_{i=1}^n \frac1n\log\left(\frac{1/n}{y_i}\right) =n\log n+n\,D_{KL}\left(u\,\big\|\,y\right),$$ where $D_{KL}$ is the Kullback–Leibler divergence.

We then get the lower bound $n\log n$ directly from divergences being non-negative, and we can get an upper bound from the various Rényi divergences, such as $D_{KL}\le D_\infty=\log\max_i \frac{1/n}{y_i}=\log\frac{N}{n\min_i x_i}$, which gives $\sum_i \log(N/x_i)\le n\log\frac{N}{\min_i x_i}$.
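With $y_i = x_i/N$ and $u$ the uniform distribution over the $n$ indices, the identity $\sum_i \log(N/x_i) = n\log n + n\,D_{KL}(u\,\|\,y)$ and both bounds can be verified numerically; a minimal Python sketch on a hypothetical feasible point:

```python
import math

# Hypothetical feasible point: x_i >= 1, sum x_i = N.
N = 12.0
xs = [5.0, 3.5, 2.0, 1.5]
n = len(xs)
ys = [x / N for x in xs]

lhs = sum(math.log(N / x) for x in xs)

# KL divergence between the uniform distribution (1/n) and y.
dkl = sum((1 / n) * math.log((1 / n) / y) for y in ys)

print(abs(lhs - (n * math.log(n) + n * dkl)) < 1e-9)  # the identity holds
print(lhs >= n * math.log(n))                         # lower bound
print(lhs <= n * math.log(N / min(xs)))               # upper bound via D_infinity
```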