12

Is it possible to simplify the following function: $$p(n, k) = \sum_{\substack{x_1 + x_2 + \cdots + x_k = n \\\ x_i \in \mathbb{N}}} x_1^{x_1} x_2^{x_2} \cdots x_k^{x_k},$$ or, at least, find a good (easy to compute) approximation of $\log{(p(n, k))}$?
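For small parameters, $p(n, k)$ can be tabulated directly by enumerating weak compositions; here is a minimal brute-force Python sketch (assuming the convention $0^0 = 1$, which Python's `0 ** 0` already follows):

```python
from itertools import product
from math import prod

def p(n, k):
    # Sum x_1^x_1 * ... * x_k^x_k over all weak compositions of n into
    # k non-negative parts, with the convention 0^0 = 1.
    return sum(
        prod(x_i ** x_i for x_i in x)
        for x in product(range(n + 1), repeat=k)
        if sum(x) == n
    )
```

This is exponential in $k$, so it is only useful for checking conjectured closed forms or approximations on small cases.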


Here is my motivation. Consider a probabilistic node $A$ with $k$ output arrows $a_1, \cdots, a_k$, and assume that the probability of passing through the arrow $a_i$ is $p_i$. Then the probability of consecutively moving through arrows $x = \langle a_{i_1}, \cdots, a_{i_n} \rangle$ is $$ p^A(x) = \prod_{j = 1}^n p_{i_j} = \prod_{i = 1}^k p_i^{s_i}, $$ where $s_i$ is the number of occurrences of $a_i$ in $x$. So given a sample $x$ and a probabilistic node $A$, the optimal length of a code describing $x$ is about $\log(\frac{1}{p^A(x)})$, and the shortest code is achieved for $A$ having probabilities $p_1 = \frac{s_1}{n}, \cdots, p_k = \frac{s_k}{n}$.
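The claim that the empirical frequencies $p_i = s_i / n$ give the shortest code is the usual maximum-likelihood property of the multinomial model; a quick numerical sanity check in Python, using a hypothetical count vector `s`:

```python
import random
from math import prod

# Hypothetical occurrence counts s_1, ..., s_k for a sample of length n.
s = [3, 1, 2]
n = sum(s)

def likelihood(p_vec):
    # p^A(x) = prod_i p_i^{s_i}  (0^0 = 1 covers unused arrows).
    return prod(p_i ** s_i for p_i, s_i in zip(p_vec, s))

empirical = [s_i / n for s_i in s]
best = likelihood(empirical)

# No other probability vector should achieve a higher likelihood,
# i.e. a shorter code length log(1 / p^A(x)).
random.seed(0)
for _ in range(1000):
    w = [random.random() for _ in s]
    total = sum(w)
    assert likelihood([w_i / total for w_i in w]) <= best
```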

Now, let us assume that we do not know the probabilities at $A$. Then any code describing $x$ via $A$ has to contain some information about these probabilities. A "uniform approach" works as follows: for a given sample $x$, choose the optimal probabilistic node $A_x$; then $u(x) = p^{A_x}(x)$ is not a probability on $k^n$, as it does not sum up to $1$ (it does not account for the cost of choosing the appropriate $A_x$); however, $p(x)$ is, where $$ p(x) = \frac{u(x)}{\sum_{y \in k^n} u(y)} = \frac{\prod_{i = 1}^k s_i^{s_i}}{\sum_{t_1 + \cdots + t_k = n} \prod_{i = 1}^k t_i^{t_i}}. $$

One may take another approach, based on a Bayesian interpretation. Let us fix a meta-distribution $q$ on all probabilistic nodes $A$ having the same output arrows. This distribution chooses probabilities $p_1, \cdots, p_k$, that is, non-negative real numbers such that $p_1 + \cdots + p_k = 1$; then, for a given sample $x$, it chooses a node $A_{p_1, \cdots, p_k}$ with probability $q(A_{p_1, \cdots, p_k})$ and describes $x$ according to that node: $$ b(x) = \int_{p_1 + \cdots + p_k = 1, p_i \geq 0} p^{A_{p_1, \cdots, p_k}}(x) q(A_{p_1, \cdots, p_k}). $$ If $q$ is a uniform distribution (and I have not made any mistake during the calculations), then $$ b(x) = \frac{\int_{p_1 + \cdots + p_k = 1, p_i \geq 0} \prod_{i = 1}^k p_i^{s_i}}{\mathit{Vol}(\Delta_k)} = \frac{\Gamma(k) \prod_{i = 1}^k\Gamma(s_i + 1)}{\Gamma(\sum_{i = 1}^k (s_i + 1))}. $$ So $$p(x) = p \prod_{i = 1}^k s_i^{s_i}$$ is proportional to the probability corresponding to the Kolmogorov complexity of $x$ seen from the perspective of a "stateless" random source, and $$q(x) = q \prod_{i = 1}^k s_i^{\underline{s_i}}$$ is inversely proportional to the number of sequences containing exactly $s_i$ occurrences of $a_i$ (if there are a lot of such sequences, then from the perspective of a "stateless" random source they are "more" random, so less interesting). Here $p, q$ are some constants. In fact, these distributions are really close: using Stirling's formula $n^{\underline{n}} = n! \approx \sqrt{2\pi n}(\frac{n}{e})^n$, we have $$q(x) \approx q' \prod_{i=1}^k s_i^{s_i+\frac{1}{2}},$$ where $q' = q\,e^{-n} (2\pi)^{k/2}$ is just another constant.
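The closed form for $b(x)$ is the standard Dirichlet integral over the simplex, with $\sum_{i=1}^k (s_i + 1) = n + k$. One quick numerical check is that $b$ really sums to $1$ over all of $k^n$ (summing the defining integral over all sequences gives the integral of $(p_1 + \cdots + p_k)^n = 1$, normalized by the simplex volume). A Python sketch using `math.gamma`:

```python
from itertools import product
from collections import Counter
from math import gamma, prod

def b(x, k):
    # Closed form b(x) = Gamma(k) * prod_i Gamma(s_i + 1) / Gamma(n + k),
    # where s_i is the number of occurrences of symbol i in the sequence x.
    n = len(x)
    counts = Counter(x)
    return (gamma(k)
            * prod(gamma(counts.get(i, 0) + 1) for i in range(k))
            / gamma(n + k))

# Sanity check: b should be a probability distribution on k^n.
n, k = 4, 3
total = sum(b(x, k) for x in product(range(k), repeat=n))
assert abs(total - 1.0) < 1e-9
```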

I would like to compare $p(x)$ with $b(x)$.

  • 1
    I've slightly modified your formulas to make them more easily readable. I interpreted $\mathit{Nat}$ as the natural numbers, so I wrote $\mathbb{N}$ instead. Also, you should enclose LaTeX in `$...$`, not in `\$...\$`.2011-05-01
  • 1
    is $0$ in $\mathbb{N}$? What is $0^0$?2011-05-01
  • 0
    Thank you for the corrections. BTW, I have written a wrong constraint on $x_i$. The right one is $x_1 + x_2 + \cdots + x_k = n$2011-05-01
  • 1
    Chris: let us assume that $0^0 = 1$2011-05-01
  • 6
    Here's an idea. Each term describes a partition of $[n] = \{1, 2, \ldots, n\}$ and counts the number of functions $[n] \to [n]$ which preserve this partition (sending elements of a block to the same block). So instead of summing over partitions one can sum over functions $[n] \to [n]$ and ask how many partitions they preserve. This still seems rather complicated, but it might be more tractable to get bounds this way.2011-05-01
  • 1
    The [hyperfactorial](http://mathworld.wolfram.com/Hyperfactorial.html) will probably be needed here.2011-05-01
  • 1
    This would be much nicer with a multinomial coefficient in the sum.2011-05-01
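Following the function-counting idea from the comments: for a fixed weak composition $(x_1, \cdots, x_k)$ of $n$, the term $x_1^{x_1} \cdots x_k^{x_k}$ counts the functions $[n] \to [n]$ that map each block of the corresponding interval partition into itself (empty blocks impose no constraint, matching $0^0 = 1$). A brute-force Python check of this reinterpretation for small $n, k$, with hypothetical helper names:

```python
from itertools import product
from math import prod

def p_compositions(n, k):
    # Direct definition: sum of prod x_i^x_i over weak compositions of n.
    return sum(
        prod(x_i ** x_i for x_i in x)
        for x in product(range(n + 1), repeat=k)
        if sum(x) == n
    )

def p_functions(n, k):
    # Count pairs (composition, function f: [n] -> [n]) such that f maps
    # each block of the interval partition with block sizes x_1, ..., x_k
    # into itself.
    total = 0
    for x in product(range(n + 1), repeat=k):
        if sum(x) != n:
            continue
        block = []  # block[j] = index of the block containing position j
        for i, x_i in enumerate(x):
            block += [i] * x_i
        total += sum(
            all(block[f_j] == block[j] for j, f_j in enumerate(f))
            for f in product(range(n), repeat=n)
        )
    return total
```

This only verifies the term-by-term bijection on small cases; turning the function-side count into usable bounds is the open part of the suggestion.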

1 Answer