A recent discussion, which may be found here, examined the problem of counting the acyclic digraphs on $n$ labelled nodes having $k$ edges and indegree and outdegree at most one. It was established that the bivariate mixed generating function of this class $\mathcal{G}$ of graphs, where $z$ marks the nodes (exponentially) and $u$ marks the edges, is
$ G(z, u) = \exp \left(\frac{z}{1-uz} \right).$
This immediately implies that the expected number of edges of a random graph from $\mathcal{G}$ is
$\epsilon_n = [z^n]\left. \frac{d}{du} G(z,u) \right|_{u=1}.$
Evaluating this quantity we obtain
$ \left. \exp \left(\frac{z}{1-uz} \right) z (-1)\frac{1}{(1-uz)^2} (-z)\right|_{u=1} = \left. \exp \left(\frac{z}{1-uz} \right) \frac{z^2}{(1-uz)^2} \right|_{u=1} = \exp \left(\frac{z}{1-z}\right) \left(\frac{z}{1-z}\right)^2.$
Continuing with this calculation we find
$ \epsilon_n = [z^n] \sum_{m\ge 0} \frac{1}{m!} \left(\frac{z}{1-z}\right)^{m+2} = \sum_{m=0}^{n-2} \frac{1}{m!} [z^n] \left(\frac{z}{1-z}\right)^{m+2} = \sum_{m=0}^{n-2} \frac{1}{m!} [z^{n-m-2}] \left(\frac{1}{1-z}\right)^{m+2} = \sum_{m=0}^{n-2} \frac{1}{m!} \binom{n-m-2+m+1}{m+1} = \sum_{m=0}^{n-2} \frac{1}{m!} \binom{n-1}{m+1}.$
This closed form is actually quite nice, but it does not answer the most obvious question for this problem, namely: is there an asymptotic expansion of $\epsilon_n$, and if so, what is the first term?
Asymptotics of the expected number of edges of a random acyclic digraph with indegree and outdegree at most one
-
It is also important to note that Joriki pointed out a mistake in the above analysis and you should read his post as well as my reply correcting the mistake. – 2012-12-26
5 Answers
What you calculated is not the expected number of edges but the total number of edges of all graphs in $\mathcal G$ on $n$ nodes. To get the expected number of edges you'd have to divide by the number of these graphs.
Another approach for approximating the expected number of edges is to find the mode of the number of graphs as a function of the number of edges and estimate by how much the mean might deviate from the mode. The number
$ q_{nk}=\binom nk\binom{n-1}kk!=\frac{n!(n-1)!}{k!(n-k)!(n-1-k)!} $
of graphs on $n$ nodes with $k$ edges is at a maximum when the new factor $k$ being added to the denominator is equal to the factors $(n-k)(n-1-k)$ being removed:
$k=(n-k)(n-1-k)\;,$ that is, $k=n\pm\sqrt n\;.$
To see how close the mean is to the mode at $k=n-\sqrt n$, we can approximate the logarithm of the number of graphs using Stirling's approximation:
$ \begin{align} \log q_{nk}\approx&n\log n+(n-1)\log(n-1)-k\log k \\ &-(n-k)\log(n-k)-(n-1-k)\log(n-1-k)-k \end{align} $
and thus
$ \begin{align} \frac{\mathrm d\log q_{nk}}{\mathrm dk} &\approx-\log k+\log(n-k)+\log(n-1-k)\;,\\ \frac{\mathrm d^2\log q_{nk}}{\mathrm dk^2} &\approx-\frac1k-\frac1{n-k}-\frac1{n-1-k}\;. \end{align} $
Setting the first derivative to zero yields $k=(n-k)(n-1-k)$ again as expected, and substituting $k=n-\sqrt n$ into the second derivative yields approximately $-2/\sqrt n$. Thus the number of graphs as a function of $k$ can be approximated by a Gaussian with width of order $\sqrt[4]n$, so the difference between the maximum and the mean should be at most $O(\sqrt[4]n)$. In fact numerical results suggest a much smaller difference, apparently $O(\log\log n)$ and perhaps even $O(1)$:
$ \begin{array}{r|r|r|c}
\log_2n&\langle k\rangle&n-\sqrt n&\langle k\rangle-(n-\sqrt n)\\\hline
0&0.0000&0.0000&0.0000\\
1&0.6667&0.5858&0.0809\\
2&2.1370&2.0000&0.1370\\
3&5.3441&5.1716&0.1725\\
4&12.1957&12.0000&0.1957\\
5&26.5548&26.3431&0.2116\\
6&56.2228&56.0000&0.2228\\
7&116.9171&116.6863&0.2308\\
8&240.2364&240.0000&0.2364\\
9&489.6129&489.3726&0.2404\\
10&992.2432&992.0000&0.2432\\
11&2002.9903&2002.7452&0.2452\\
12&4032.2466&4032.0000&0.2466\\
13&8101.7379&8101.4903&0.2476
\end{array} $
Here's the code to produce those numbers. It might be interesting to find the next term in the expansion analytically.
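Since the linked code is not reproduced here, the following is a minimal C sketch (my own reconstruction along the same lines, not necessarily the linked program) that recomputes the table: it evaluates $\langle k\rangle = \sum_k k\,q_{nk} \big/ \sum_k q_{nk}$ using the ratio $q_{n,k+1}/q_{nk} = (n-k)(n-1-k)/(k+1)$, and rescales periodically because the $q_{nk}$ themselves overflow long doubles for large $n$.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Mean number of edges <k> from q_{nk} = C(n,k) C(n-1,k) k!, compared with
   the mode estimate n - sqrt(n), for n = 2^0, ..., 2^13.  Uses the ratio
   q_{n,k+1}/q_{nk} = (n-k)(n-1-k)/(k+1) so no factorials are computed. */
int main(void)
{
    int e;

    for (e = 0; e <= 13; e++) {
        long long n = 1LL << e;
        long double q = 1;               /* q_{n,0} = 1 */
        long double sum = q, ksum = 0;
        long long k;

        for (k = 0; k < n; k++) {
            q *= (long double)(n - k) * (long double)(n - 1 - k)
                 / (long double)(k + 1);                 /* q_{n,k+1} */
            sum  += q;
            ksum += q * (long double)(k + 1);

            if (sum > 1e300L) {          /* rescale to avoid overflow;   */
                q    /= sum;             /* only the ratio ksum/sum      */
                ksum /= sum;             /* matters for the mean         */
                sum   = 1;
            }
        }

        long double mean = ksum / sum;
        long double mode = (long double)n - sqrtl((long double)n);
        printf("%2d %14.4Lf %14.4Lf %8.4Lf\n", e, mean, mode, mean - mode);
    }

    return 0;
}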
-
I congratulate you on this calculation and I am glad to see that the value $n-\sqrt{n} + \frac{1}{4}$ confirms my results for the closed-form expression. – 2012-12-26
Thanks Joriki for this important pointer correcting my mistake. In fact the number of graphs in $\mathcal{G}$ is given by
$ n! [z^n] G(z, 1). $
But we have
$ [z^n] \exp \left( \frac{z}{1-z} \right) = [z^n] \sum_{m\ge 0} \frac{1}{m!} \left( \frac{z}{1-z} \right)^m, $
which is
$ \sum_{m=1}^n \frac{1}{m!} [z^n] \left( \frac{z}{1-z} \right)^m = \sum_{m=1}^n \frac{1}{m!} [z^{n-m}] \left( \frac{1}{1-z} \right)^m = \sum_{m=1}^n \frac{1}{m!} \binom{n-m+m-1}{m-1} = \sum_{m=1}^n \frac{1}{m!} \binom{n-1}{m-1}.$
It follows that the value $\epsilon_n$ is given by
$\epsilon_n = \frac{n! \sum_{m=0}^{n-2} \frac{1}{m!} \binom{n-1}{m+1}}{n! \sum_{m=1}^n \frac{1}{m!} \binom{n-1}{m-1}} = \frac{\sum_{m=0}^{n-2} \frac{1}{m!} \binom{n-1}{m+1}}{\sum_{m=1}^n \frac{1}{m!} \binom{n-1}{m-1}} = \frac{\sum_{m=2}^n \frac{1}{(m-2)!} \binom{n-1}{m-1}}{\sum_{m=1}^n \frac{1}{m!} \binom{n-1}{m-1}}.$
It would appear that this implies $\epsilon_n \sim n$, answering my original question. This even pins down the coefficient of $n$, and it means that small components having $o(n)$ edges do not contribute in the limit. I will wait a few hours to give you a chance to prove this last asymptotic result, which does not look all that difficult.
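As a quick sanity check (easy to verify by hand), take $n=3$: the denominator is $\sum_{m=1}^{3} \frac{1}{m!} \binom{2}{m-1} = 1 + 1 + \frac{1}{6} = \frac{13}{6}$, so there are $3! \cdot \frac{13}{6} = 13$ graphs, and the numerator is $\sum_{m=0}^{1} \frac{1}{m!} \binom{2}{m+1} = 2 + 1 = 3$, giving $\epsilon_3 = \frac{3}{13/6} = \frac{18}{13} \approx 1.3846$. A direct count (one empty graph, six single-edge graphs, six directed paths on two edges) confirms the total of $13$ graphs and the mean $\frac{0 \cdot 1 + 1 \cdot 6 + 2 \cdot 6}{13} = \frac{18}{13}$.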
-
I calculated up to $n=2^{15}$. I will reread everything you have written (reading my material carefully you will see that it confirms your work) and check back in a few hours. My point is that, now that we have closed formulas for the expectation and the variance, we should be able to get the first few terms of the asymptotic expansions of these two statistics. The generating function encodes information about all factorial moments; therefore, while being more difficult to handle, it is usually superior to an approximation of the first term only. – 2012-12-25
We can extend this calculation from the expected value to the variance. To do so we compute
$E[X(X-1)] = \frac{ [z^n] \left. \left(\frac{d}{du}\right)^2 G(z, u) \right|_{u=1}}{[z^n] G(z, 1)},$
where $X$ is the random variable representing the number of edges. Now
$\left. \left(\frac{d}{du}\right)^2 G(z, u) \right|_{u=1} = \left.\frac{d}{du}\left[\exp\left(\frac{z}{1-uz}\right) \left(\frac{z}{1-uz}\right)^2\right] \right|_{u=1} = \left.\exp\left(\frac{z}{1-uz}\right) \left(\frac{z}{1-uz}\right)^4 + 2 \exp\left(\frac{z}{1-uz}\right) \left(\frac{z}{1-uz}\right)^3 \right|_{u=1} = \exp\left(\frac{z}{1-z}\right) \left(\frac{z}{1-z}\right)^4 + 2 \exp\left(\frac{z}{1-z}\right) \left(\frac{z}{1-z}\right)^3.$
Now expand the two terms in turn. For the left term,
$[z^n] \exp\left(\frac{z}{1-z}\right) \left(\frac{z}{1-z}\right)^4 = [z^n] \sum_{m\ge 0} \frac{1}{m!} \left( \frac{z}{1-z}\right)^{m+4} = [z^n] \sum_{m=0}^{n-4} \frac{1}{m!} \left( \frac{z}{1-z}\right)^{m+4},$
or
$ \sum_{m=0}^{n-4} \frac{1}{m!} [z^{n-m-4}] \left( \frac{1}{1-z}\right)^{m+4} = \sum_{m=0}^{n-4} \frac{1}{m!} \binom{n-m-4+m+3}{m+3} = \sum_{m=0}^{n-4} \frac{1}{m!} \binom{n-1}{m+3}.$
For the right term,
$2 [z^n] \exp\left(\frac{z}{1-z}\right) \left(\frac{z}{1-z}\right)^3 = 2 [z^n] \sum_{m\ge 0} \frac{1}{m!} \left( \frac{z}{1-z}\right)^{m+3} = 2 [z^n] \sum_{m=0}^{n-3} \frac{1}{m!} \left( \frac{z}{1-z}\right)^{m+3},$
or
$ 2 \sum_{m=0}^{n-3} \frac{1}{m!} [z^{n-m-3}] \left( \frac{1}{1-z}\right)^{m+3} = 2 \sum_{m=0}^{n-3} \frac{1}{m!} \binom{n-m-3+m+2}{m+2} = 2 \sum_{m=0}^{n-3} \frac{1}{m!} \binom{n-1}{m+2}.$
It follows that
$E[X(X-1)] = \frac{\sum_{m=0}^{n-4} \frac{1}{m!} \binom{n-1}{m+3} + 2 \sum_{m=0}^{n-3} \frac{1}{m!} \binom{n-1}{m+2}}{\sum_{m=1}^n \frac{1}{m!} \binom{n-1}{m-1}}.$
The question now becomes: what is the asymptotic expansion of this quotient of sums? That would put the variance within reach, given that $E[X^2] = E[X(X-1)] + E[X].$
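As a quick check of the bookkeeping, for $n=3$ the first sum is empty and the second contributes $2 \binom{2}{2} = 2$, so the formula gives $E[X(X-1)] = \frac{2}{13/6} = \frac{12}{13}$, which agrees with the direct count $\frac{2 \cdot 1 \cdot 6}{13}$ over the six two-edge graphs.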
The following admittedly quite naive program makes it possible to calculate $\epsilon_n$ for $n$ as large as $300000.$ The value that we obtain for the constant term of the expansion is $0.2496.$ This would suggest the conjecture that $\epsilon_n \sim n - \sqrt{n} + \frac{1}{4}.$
The code follows.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* term(n, m) computes binom(n-1, m-1)/m!, the summand appearing in both
   the numerator and the denominator of eps_n. */
long double term(int n, int m)
{
    long double v = (long double)1;

    while (m > 1) {
        v *= (long double)((n - 1) - (m - 2));
        v /= (long double)(m - 1);
        v /= (long double)(m);
        m--;
    }

    return v;
}

int main(int argc, char **argv)
{
    int n;

    if (argc != 2 || sscanf(argv[1], "%d", &n) != 1 || n < 1) {
        fprintf(stderr, "expecting a positive integer\n");
        exit(-1);
    }

    long double p = 0, q = 1;    /* q starts at the m = 1 term, which is 1 */
    int m;

    for (m = 2; m <= n; m++) {
        if (n >= 10 && m % (n / 10) == 0)    /* progress indicator */
            printf("%d%%\n", m * 100 / n);

        long double t = term(n, m);
        p += t * (long double)(m) * (long double)(m - 1);
        q += t;
    }

    long double eps = p / q;
    printf("%.18Le %.18Le %.18Le\n", p, q,
           eps - ((long double)n - sqrtl((long double)n)));
    exit(0);
}
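For reference, saving this as, say, eps.c, it compiles with something like cc -O2 -o eps eps.c -lm (the -lm is needed for sqrtl) and is run as ./eps 300000; the last number printed is $\epsilon_n - (n - \sqrt{n})$.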
-
It would be much more efficient to re-code this without the binomial coefficients, because each term in the two sums for a given $m$ is a multiple of the one that precedes it. This gives a constant number of multiplications per term, as opposed to $m$ multiplications per term, and might be useful in the study of the asymptotics of the variance. – 2012-12-26
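To make the suggestion concrete, here is a minimal sketch (my own, under the same long double setup) that computes $\epsilon_n$ using the recurrence $t_{m+1} = t_m \cdot \frac{n-m}{m(m+1)}$ for $t_m = \frac{1}{m!}\binom{n-1}{m-1}$, so each term costs a constant number of operations.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char **argv)
{
    int n;

    if (argc != 2 || sscanf(argv[1], "%d", &n) != 1 || n < 1) {
        fprintf(stderr, "expecting a positive integer\n");
        exit(-1);
    }

    /* t_m = binom(n-1, m-1)/m!; start with t_1 = 1 */
    long double t = 1;
    long double p = 0;    /* numerator:   sum over m of m(m-1) t_m */
    long double q = t;    /* denominator: sum over m of t_m        */
    int m;

    for (m = 1; m < n; m++) {
        /* t_{m+1} = t_m (n-m) / (m (m+1)) */
        t *= (long double)(n - m);
        t /= (long double)m * (long double)(m + 1);
        p += t * (long double)(m + 1) * (long double)m;
        q += t;
    }

    printf("%.18Le\n", p / q - ((long double)n - sqrtl((long double)n)));
    return 0;
}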
The following program can be used to calculate $E[X(X-1)]$ from the closed form as a quotient of sums. It would appear to support the conjecture that $ E[X(X-1)] \sim n^2 - 2 n^{3/2} + \frac{1}{2} n.$
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* term(n, m) computes binom(n-1, m-1)/m!, the common summand of the
   numerator and the denominator. */
long double term(int n, int m)
{
    long double v = (long double)1;

    while (m > 1) {
        v *= (long double)((n - 1) - (m - 2));
        v /= (long double)(m - 1);
        v /= (long double)(m);
        m--;
    }

    return v;
}

int main(int argc, char **argv)
{
    int n;

    if (argc != 2 || sscanf(argv[1], "%d", &n) != 1 || n < 1) {
        fprintf(stderr, "expecting a positive integer\n");
        exit(-1);
    }

    long double p = 0, q = 0, r = 0;
    int m;

    for (m = 1; m <= n; m++) {
        if (n >= 10 && m % (n / 10) == 0)    /* progress indicator */
            printf("%d%%\n", m * 100 / n);

        long double t = term(n, m);
        p += t * (long double)(m) * (long double)(m - 1)
               * (long double)(m - 2) * (long double)(m - 3);
        q += 2 * t * (long double)(m) * (long double)(m - 1)
                   * (long double)(m - 2);
        r += t;
    }

    long double eps = (p + q) / r;
    printf("%.18Le %.18Le %.18Le\n", p, q,
           (eps - (((long double)n) * ((long double)n)
                   - (long double)2 * (long double)n
                     * sqrtl((long double)n))) / (long double)n);
    exit(0);
}
-
This gives for the variance $ Var[X] = E[X(X-1)] + E[X] - E[X]^2 \sim \frac{1}{2} \sqrt{n} - \frac{1}{2}.$ – 2012-12-27