
I have a sequence $E_n$ (built from two binomial distributions) and would like to prove that it converges to zero, where $0 \lt p, q \lt 1$:

$ E_{n} =\frac{1}{2} \sum _{i=0}^{n}\min\left[\binom{n}{i}\ q^{i} (1-q)^{n-i} ,\binom{n}{i}\ p^{i} (1-p)^{n-i} \right] $

Note that it can be easily shown that $E_{n+1} \le E_n$.
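For what it's worth, a quick numerical sketch (the values $p = 0.3$, $q = 0.7$ below are just an illustrative choice) that evaluates $E_n$ directly suggests the convergence:

```python
from math import comb

def E(n, p, q):
    """E_n = (1/2) * sum_i C(n,i) * min(q^i (1-q)^(n-i), p^i (1-p)^(n-i))."""
    return 0.5 * sum(
        comb(n, i) * min(q**i * (1 - q)**(n - i), p**i * (1 - p)**(n - i))
        for i in range(n + 1)
    )

p, q = 0.3, 0.7  # illustrative choice with p != q
for n in (1, 5, 10, 50, 100):
    print(n, E(n, p, q))
# the printed values decrease toward 0; for p == q the sum collapses to 1/2 instead
```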

  • @joriki: Indeed it's not true for $p=q$, but it's true for $p\neq q$. (2011-06-30)

2 Answers


This can also be done in an elementary fashion. Since the minimum of two nonnegative quantities is bounded by their geometric mean, you have

$ E_n \leq \frac{1}{2}\sum_{i=0}^n \binom{n}{i}\, p^{\frac{i}{2}} q^{\frac{i}{2}} (1-p)^{\frac{n-i}{2}} (1-q)^{\frac{n-i}{2}} $

Next, use the fact that the geometric mean is bounded by the arithmetic mean, with equality iff the two quantities are equal. In particular, $p^{\frac{1}{2}} q^{\frac{1}{2}} = r\,\frac{p+q}{2}$ for some $0 < r < 1$, and $(1-p)^{\frac{1}{2}}(1-q)^{\frac{1}{2}} = s\left(1 - \frac{p+q}{2}\right)$ for some $0 < s < 1$; note we assume $p \neq q$ here. So you get

$ E_n \leq \frac{1}{2}\sum_{i=0}^n \binom{n}{i}\, r^i s^{n-i} \left(\frac{p+q}{2}\right)^i \left(1 - \frac{p+q}{2}\right)^{n-i} $

Next, observe that $r^i s^{n-i} \leq t^n$, where $0 < t = \max(r,s) < 1$. So you have

$ E_n \leq \frac{1}{2} t^n \sum_{i=0}^n \binom{n}{i} \left(\frac{p+q}{2}\right)^i \left(1 - \frac{p+q}{2}\right)^{n-i} $

By the binomial theorem the sum is just $1$, so you get

$ E_n \leq \frac{1}{2} t^n $

Since $0 < t < 1$, you get $\lim_{n\rightarrow \infty} E_n = 0$ as needed.
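To check the final bound numerically, here is a minimal sketch (again with the illustrative values $p = 0.3$, $q = 0.7$) that computes $r$, $s$, $t$ as above and verifies $E_n \le \frac{1}{2} t^n$:

```python
from math import comb, sqrt

def E(n, p, q):
    return 0.5 * sum(
        comb(n, i) * min(q**i * (1 - q)**(n - i), p**i * (1 - p)**(n - i))
        for i in range(n + 1)
    )

p, q = 0.3, 0.7                        # illustrative values, p != q
m = (p + q) / 2
r = sqrt(p * q) / m                    # sqrt(pq) = r * (p+q)/2, so 0 < r < 1
s = sqrt((1 - p) * (1 - q)) / (1 - m)  # sqrt((1-p)(1-q)) = s * (1 - (p+q)/2), 0 < s < 1
t = max(r, s)
for n in (5, 20, 80):
    bound = 0.5 * t**n
    assert E(n, p, q) <= bound + 1e-12
    print(n, E(n, p, q), bound)
```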

  • Thanks much, @Zarrax! Interestingly, the rate of decrease (convergence to $0$ as $n \to \infty$) of $E_n$ is not only a function of $|p-q|$, but also of $p+q$. (2011-07-01)

This is true if $p \ne q$. A probabilistic approach:

Let $X$ be a random variable distributed as a Binomial$(n,p)$ normalized (divided by $n$, so that its range is in $[0,1]$), or equivalently, $X = \frac{1}{n} \sum_{i=1}^n B_i$ where the $B_i$ are i.i.d. Bernoulli with parameter $p$.

Let $Y$ be an analogous r.v. from another Binomial $(n,q)$ ($q \ne p$).

Then, informally, $E_n$ is the "area" below the smaller of the two probability mass functions, and it can be expressed as the sum of two "tails". It only remains to show that each tail starts at some strictly positive distance $\delta$ from the corresponding mean (here is where the condition $q \ne p$ is required), and that it is indeed a tail (it does not include the mean). Using Chebyshev's inequality, and showing that the variances of $X$ and $Y$ tend to zero, the result is proved.
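One concrete way to instantiate this is to split the sum at the midpoint $(p+q)/2$, so $\delta = (q-p)/2$: since the minimum of the two probability mass functions is bounded by each of them, $E_n$ is at most half of the lower tail of Binomial$(n,q)$ plus the upper tail of Binomial$(n,p)$, and Chebyshev bounds that by $\frac{p(1-p)+q(1-q)}{2n\delta^2}$. A small sketch (illustrative values $p = 0.3$, $q = 0.7$, assuming $p < q$):

```python
from math import comb

def pmf(n, k, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def E(n, p, q):
    return 0.5 * sum(min(pmf(n, i, q), pmf(n, i, p)) for i in range(n + 1))

p, q = 0.3, 0.7            # illustrative values with p < q
c = (p + q) / 2            # split point between the two means
delta = (q - p) / 2        # distance from each mean to the split point
for n in (10, 100, 1000):
    # min(., .) is <= each pmf, so split the sum at c into two binomial tails
    upper_tail_p = sum(pmf(n, i, p) for i in range(n + 1) if i / n > c)
    lower_tail_q = sum(pmf(n, i, q) for i in range(n + 1) if i / n <= c)
    tail_bound = 0.5 * (upper_tail_p + lower_tail_q)   # E_n <= tail_bound
    chebyshev = (p * (1 - p) + q * (1 - q)) / (2 * n * delta**2)
    print(n, E(n, p, q), tail_bound, chebyshev)
```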

BTW: $E_n$ can be thought of as the probability of error in a classical two-class Bayesian classification problem, where the features are multidimensional i.i.d. Bernoulli and we are required to show that the probability of error tends to zero as the dimension tends to infinity. Duda & Hart has a very similar exercise in chapter 2 (page 78).
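As a sanity check on this interpretation, here is a small Monte Carlo sketch (illustrative values $p = 0.3$, $q = 0.7$, $n = 10$; the helper `bayes_error_mc` is made up for this example) comparing the simulated error of the equal-prior likelihood-ratio classifier with $E_n$:

```python
import random
from math import comb, log

def E(n, p, q):
    return 0.5 * sum(
        comb(n, i) * min(q**i * (1 - q)**(n - i), p**i * (1 - p)**(n - i))
        for i in range(n + 1)
    )

def bayes_error_mc(n, p, q, trials=100_000, seed=0):
    """Estimate the error of the maximum-likelihood (equal priors) classifier."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        is_class_p = rng.random() < 0.5                  # pick a class with prior 1/2
        theta = p if is_class_p else q
        k = sum(rng.random() < theta for _ in range(n))  # successes among n features
        # decide the class whose likelihood theta^k (1-theta)^(n-k) is larger
        ll_p = k * log(p) + (n - k) * log(1 - p)
        ll_q = k * log(q) + (n - k) * log(1 - q)
        errors += (ll_p >= ll_q) != is_class_p
    return errors / trials

p, q, n = 0.3, 0.7, 10  # illustrative values
print(E(n, p, q), bayes_error_mc(n, p, q))  # the two numbers should roughly agree
```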

  • Also, if you drop the i.i.d. assumption and only assume independence, so that we have a two-class classification problem with multidimensional independent Bernoulli features, then the probability of error does not always tend to zero as the dimension tends to infinity, unless $|q_i - p_i| > \epsilon$ for some fixed $\epsilon > 0$ and all the marginal Bernoullis. (2011-07-06)