16
$\begingroup$

I have tried something to solve the series $$\binom{n}{1}^2+2\binom{n}{2}^2 + 3\binom{n}{3}^2 + 4\binom{n}{4}^2+\cdots + n\binom{n}{n}^2.$$ My approach is : $$(1+x)^n=\binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n.$$ Differentiating the above equation $$n(1+x)^{n-1} = \binom{n}{1} + \binom{n}{2}x + \cdots + n\binom{n}{n}x^{n-1}$$

Also, $$ \left(1+\frac{1}{x}\right)^n =\binom{n}{0} + \binom{n}{1}\frac{1}{x} + \binom{n}{2}\left(\frac{1}{x}\right)^2 + \cdots + \binom{n}{n}\left(\frac{1}{x}\right)^n$$ Multiplying above two equation I get, $$\begin{align*} &{n(1+x)^{n-1}\left(1 + \frac{1}{x}\right)^n}\\ &\quad= \left( \binom{n}{1}^2 + 2\binom{n}{2}^2 + 3\binom{n}{3}^2 + 4\binom{n}{4}^2 + \cdots + n\binom{n}{n}^2\right)\left(\frac{1}{x}\right) + \text{other terms} \end{align*}$$

So I can say that coefficient of $\frac{1}{x}$ in expansion of $n(1+x)^{n-1}(1+\frac{1}{x})^n$ will give me the required answer.

Am I doing it correct,please correct me if I'm wrong ?

If I'm right,please tell me how to calculate the coefficient of $\frac{1}{x}$ ?

Based on the answers,I tried to implement the things in a C++ code.

I tried implementing the code using extended euclidean algorithm so that the problem of truncated division can be eliminated but still not abled to figure out why am I getting wrong answer for n>=3. This is my updated code : http://pastebin.com/imS6rdWs I'll be thankful if anyone can help me to figure out what's wrong with this code.

Thanks.

Solution:

Finally abled to solve the problem.Thanks to all those people who spent their precious time for my problem.Thanks a lot.This is my updated code :

http://pastebin.com/WQ9LRy6F

  • 1
    There is the [relationship](http://math.stackexchange.com/questions/27916/squared-binomial-coefficient) $$\sum_{k=0}^n \binom{n}{k}^2 x^k=(1-x)^n P_n\left(\frac{1+x}{1-x}\right)$$ where $P_n(z)$ is the Legendre polynomial...2012-03-19
  • 3
    Looks a bit cumbersome that way. A hint: consider also the sum $\sum_{k=0}^n (n-k)\binom{n}{k}^2$. (I expanded this comment in my answer).2012-03-19
  • 0
    @J.M. :Please explain it in detail.I'm not getting it.2012-03-19
  • 3
    Yet again we see the promiscuous use of the word "solve". "Evaluate" is an appropriate word here; "solve" is not. One solves equations; one solves problems. One evaluates expressions.2012-03-19
  • 0
    @code_hacker: the two sums are equal, and are therefore equal to the means of the two ways to write it.2012-03-19

4 Answers 4

2

First we address the overflow issue. Note that $10^9+7$ is relatively prime to all the numbers that come up in a naive calculation of $\binom{2n}{n}$. Indeed $10^9+7$ happens to be prime. So when we are calculating, we can always reduce modulo $10^9+7$ as often as necessary to prevent overflow.

Now to the identity. We have $n$ boys and $n$ girls. We want to choose $n$ people. The number of ways this can be done is $\binom{2n}{n}$. But we can choose $0$ boys and $n$ girls, or $1$ boy and $n-1$ girls, and so on. We can choose $k$ boys and $n-k$ girls in $\binom{n}{k}\binom{n}{n-k}$ ways, or equivalently in $\binom{n}{k}^2$ ways. This gives the standard derivation of the identity $$\binom{2n}{n}=\sum_{k=0}^n \binom{n}{k}^2.$$ Note now that the $\binom{2n}{n}$ choices are all equally likely. The expected number of boys is, by symmetry, equal to $\frac{n}{2}.$ But the probability that there are $k$ boys is $\frac{\binom{n}{k}^2}{\binom{2n}{n}}$, and therefore the expected number of boys is $$\sum_{k=0}^n k\frac{\binom{n}{k}^2}{\binom{2n}{n}}.$$ The term corresponding to $k=0$ is $0$, so can be omitted, and we get $$\sum_{k=1}^n k\frac{\binom{n}{k}^2}{\binom{2n}{n}}=\frac{n}{2},$$ which is essentially our identity.

Remark: There is a very nice book on bijective proofs called Proofs that Really Count. A title that so far doesn't seem to have been used is Mean Proofs.

  • 0
    This is the way to do things, and will definitely run fast enough. (You will have to implement a function that computes multiplicative inverse mod p, but that isn't terrible and runs very quickly.)2012-03-19
  • 0
    I have implemented the things in a C++ code.This is the link to the code http://pastebin.com/qxQsZyz2 but it's giving me wrong answer when I tried to submit it on online judge.Can any one please tell me where am I doing wrong ?2012-03-19
  • 0
    You're dividing by $i$ on line $17$, which is a truncated division. See Thomas's comment and my post about computing inverses modulo $m$ (my post) or mod (your OP notation).2012-03-19
  • 0
    Nice expected value argument!2012-03-20
17

First recall that the coefficient of $x^n$ in $(1+x)^n(1+x)^n=(1+x)^{2n}$ implies $$ \begin{align} \sum_{k=0}^n\binom{n}{k}^2 &=\sum_{k=0}^n\binom{n}{k}\binom{n}{n-k}\\ &=\binom{2n}{n}\tag{1} \end{align} $$ and then note that $$ \begin{align} \sum_{k=0}^nk\binom{n}{k}^2 &=\sum_{k=0}^nk\binom{n}{n-k}^2\\ &=\sum_{k=0}^n(n-k)\binom{n}{k}^2\tag{2} \end{align} $$ Adding the first and last parts of $(2)$ yields $$ \begin{align} 2\sum_{k=0}^nk\binom{n}{k}^2 &=n\sum_{k=0}^n\binom{n}{k}^2\\ &=n\binom{2n}{n}\tag{3} \end{align} $$ Therefore, $$ \sum_{k=0}^nk\binom{n}{k}^2=\frac{n}{2}\binom{2n}{n}\tag{4} $$

  • 0
    It looks great but I have one problem with this result.Actually I need to write a code in C to calculate S(result) where n can be as large as 10^6.Although I'm asked to print S % mod as final output where mod=10^9+7 but still calculating S which is derived will definitely overflow.So Can you help me on that part ?2012-03-19
  • 1
    @code_hacker: This identity might help $$ \begin{align} \binom{2n}{n} &=\frac{2n(2n-1)}{n^2}\binom{2(n-1)}{n-1}\\ &=\frac{4n-2}{n}\binom{2(n-1)}{n-1} \end{align} $$2012-03-19
  • 0
    @code_hacker, [this may be useful](http://pweb.nju.edu.cn/zwsun/PRCM-talk.pdf).2012-03-20
7

This kind of looks like you want to appeal to Vandermonde's convolution, or at least the method you'd use to prove it. It can be applied directly as follows:

Let $S = \sum\limits_{k=0}^n k\binom{n}{k}^2$ be the sum we want to compute. Note that $S = \sum\limits_{k=0}^n (n-k) \binom{n}{n-k}^2 = \sum\limits_{k=0}^n (n-k) \binom{n}{k}^2$. Therefore $2S = n\sum\limits_{k=0}^n \binom{n}{k}^2 = n \binom{2n}{n}$. Then

$$S = \frac{n}{2}\binom{2n}{n}.$$

  • 0
    It looks great but I have one problem with this result.Actually I need to write a code in C to calculate S(result) where n can be as large as 10^6.Although I'm asked to print S % mod as final output where mod=10^9+7 but still calculating S which is derived will definitely overflow.So Can you help me on that part ?2012-03-19
7

Building on the excellent work of @thomas-belulovich & @robjohn, but addressing your subsequently revealed goal....

You want $$ T(n) =S(n)-m\left\lfloor\frac{S(n)}m\right\rfloor =S(n)\text{ mod }m \qquad\text{for}\qquad S(n) =\frac{n}{2}{2n\choose n}. $$ Note that $$ \frac{S(n)}{S(n-1)} =\frac{n}{n-1}\cdot\frac{2n(2n-1)}{n^2} =2\frac{2n-1}{n-1}. $$ A brute force approach to this might be to calculate $$ R(k)=2(2k-1)(k-1)^{-1}\pmod m $$ for each $1 and then multiply modulo $m$ termwise to obtain $$ T(n)=S(1)\prod_{k=2}^nR(k). $$ Calculating each $R(k)=(4k-2)x$ requires $O(\log k)$ divisions using the extended Euclidean algorithm to find an $x$ so that $$ (k-1)x+my=1. $$

Extended Euclidean Algorithm: Given nonzero $a,b\in\mathbb{Z}$, find $x,y\in\mathbb{Z}$ so that $ax+by=\text{gcd}(a,b)$. (Adapted from David Joyner's book.)

  1. Set $\overline{u}=\langle{u_0,u_1,u_2}\rangle$ $\leftarrow\langle{a,1,0}\rangle$ and $\overline{v}\leftarrow\langle{b,0,1}\rangle$ (vectors in $\mathbb{Z}^3$). Set $\overline{v}\leftarrow-\overline{v}$ if $b<0$.

  2. While $v_0 \ne 0$, repeat:

  3. $\qquad$ Calculate the quotient $q=\text{quo}(u_0,v_0)=u_0-v_0\left\lfloor\frac{u_0}{v_0}\right\rfloor$.

  4. $\qquad$ Set $\quad\overline{w}=\overline{u}-q\overline{v},\quad$ then $\quad\overline{u}=\overline{v},\quad$ and then $\quad\overline{v}=\overline{w}.\quad$

  5. Return $a=u_0$, $x=u_1$, and $y=u_2$.

This is doable in not too many lines of C code, and it works as long as $m=10^9+7$ has no (prime) factors $p\le n$ (in fact, this $m$ is prime). If you needed a more efficient algorithm and $m$ were a composite product of distinct primes (which it isn't), you could use the prime factorization of $m$ and a nice fact about binomial coefficients modulo primes [Lukas 1878] to separately calculate residues $$ {a\choose b}\equiv{[a/p]\choose[b/p]}{a\mod p\choose b\mod p}\pmod p $$ modulo each factor $p$, and then recover $T(n)$ using the Chinese Remainder Theorem.

Here's a few pre-computed values: $$\matrix{ k& n=10^k &{2n\choose n}\text{ mod }m &T(n) \\0& 1 &2 &1 \\1& 10 &184756 &923780 \\2& 100 &407336795 &366839610 \\3& 1000 &72475738 &237868748 \\4& 10000 &703593270 &966325381 \\5& 100000 &879467333 &366342189 \\6& 1000000 &192151600 &799327475 }$$

If you want to find an efficient method to solve this problem, you'll probably want to look at the works of Donald Knuth, Andrew Granville, and Richard Stanley, who also has compiled outstanding lists of bijective proof problems and characterizations of the Catalan numbers $C_n=\frac1{n+1}{2n\choose n}$, to which our $S(n)$ are closely related since $S(n)={n+1\choose2}C_n$.


One might be tempted to try to shorten the computation using Wilson's theorem, $$ p\text{ prime} \quad\implies\quad (p-1)!\equiv-1\pmod p $$ Morley's (1895) congruence, $$ p=2n+1\text{ prime} \quad\implies\quad {2n\choose n}\equiv(-1)^n4^{2n}\pmod{p^3} $$ and/or Kummer's theorem using enough "landmark" primes near $n$ and $2n$, with extended Euclidean algorithm for inverses and the CRT (there's the catch!) to get the final result. For example, here are some prime pairs $q_i=2p_i+1$ near $n=10^6$ and $2n$: $$ \matrix{ i & p_i-n & q_i-2n \\ 1 & -251 & -501 \\ 2 & -191 & -381 \\ 3 & 151 & 303 \\ 4 & 193 & 387 \\ } $$ From Wilson's theorem for odd primes $q$, grouping $(q-1)!=(1\cdots\frac{q-1}2)(\frac{q+1}2\cdots q-1)$ and noting that the second term is $(-1)^\frac{q-1}2$ times the first term modulo $q$, we find that $$ \left(\left(\tfrac{q-1}{2}\right)!\right)^2 \equiv(-1)^{\frac{q+1}2} \pmod q $$ so that for prime pairs $q_i=2p_i+1$, $$ {2p_i\choose p_i}\equiv(-1)^{p_i}=-1\pmod{q_i}. $$ Thus we can compute (using the extended Euclidean algorithm for the inverses of $k$ modulo $q_i$) $$ {2n\choose n}\equiv-\prod_{k=p_i+1}^n\frac{4k-2}{k} \pmod{q_i} $$ for $i=1,2$ above. However, we cannot use the CRT to get $T(n)$. We would have to have enough prime pairs to reconstruct $S(n)$, and then compute its residue at the end. Since the central binomial coefficient is approximately $${2n\choose n}\approx\frac{4^n}{\sqrt{\pi n}}\left(1-\frac1{nc_n}\right),\qquad c_n\in(8,9)$$ we would need about 96 thousand pairs (p,q), making this approach infeasible.

  • 0
    Based on the recursive formula : S(n)/S(n-1) = (4n-2)/(n-1) why shouldn't I pre compute S(n) for all n upto 10^6.Can I do like that ?2012-03-19
  • 0
    Of course, you can precompute these if it suits your needs. Please not I made some corrections to the ratio $S(n)/S(n-1)$ and hence also (my) $R(n)$.2012-03-19
  • 0
    Wow I learnt more from this answer. My +1 for this.2012-03-19
  • 0
    @bgins : I tried to find the multiplicative inverse using extended euclidean algorithm but it's giving me wrong answer for n>=3 In fact the corresponding x(multiplicative inverse) at every step is very large.Can you please tell me where am I doing wrong ? One thing I'm sure of is that my extended_euclid() function is correct as I have tested and implemented it on the description and example of wikipedia.Thanks in advance. Link to the updated code : http://pastebin.com/imS6rdWs2012-03-20
  • 0
    You have one or two issues left. Your xgcd needs to use signed types since you are subtracting. Also, as a general purpose routine, it's not returning inverses when the gcd is greater than one (which shouldn't be a problem here since $m$ is prime and $0). Do you really need this in C++? Why not try [sage](http://www.sagemath.org/) or [sage online](http://sagenb.org/)? Type xgcd? or inverse_mod? for example (and `binomial(2*n,n) % m` for a direct, costly, arbitrary precision calculation). Although right now the public server seems to be loaded down.2012-03-20
  • 0
    @bgins : I think problem is much more than expected.I modified my code and now using only long long data types but still getting absurd results for n>=3.Here is my updated code: http://pastebin.com/FsaZrxe1 One more thing,when I'm calculating x(multiplicative inverse) it sometimes comes out to be negative.why is it coming negative ? Can you please explain the idea of this x in calulcations.On getting x,should I simply multiply it with (4k-2) and then take modulo of that.Is it right ? It might be irritating for you and I'm sorry for that.But I'm really stuck with this code.Please help me.2012-03-20
  • 0
    I got the c++ code working and it's [quite fast](http://pastebin.com/8G2Tp7nW). You just need to reduce mod $m$ sufficiently often, as @AndreNicolas recommended. Was this homework for math or computer science, or something else?2012-03-20
  • 0
    @code_hacker: so your background is computer science? or something else? The extended euclidean algorithm is a beautiful algorithm from elementary number theory. An example: gcd$(3,5)=2\cdot3-1\cdot5$. In general, when $a,b>0$, then $x$ & $y$ must have opposite signs. Because of the subtractions, it can produce negative numbers. But, miraculously as it may seem, $x$ and $y$ are never too large in magnitude, so a single addition by $m$ will "correct" a negative sign. It is generally necessary to reduce mod $m$ after every significant operation in order to avoid overflow, as mentioned before.2012-03-20
  • 0
    @bgins:Thanks finally abled to solve it.Overflow was occuring earlier which need to be reduced using mod m.Thanks for the help.2012-03-21