5
$\begingroup$

Assuming that you start out with a root node, and decide with 50% probability whether or not to add two children nodes. If they do, repeat this process for them. How can you find the average length of this random binary tree?

I'm thinking along the lines of

$\displaystyle\lim_{n\to\infty}\sum\limits_{i=1}^n (\frac{n}{2})^2$.

because 1/2 * n represents 50% probability, and I'm squaring it because the tree gets exponentially larger. However, I feel like I've done something terribly wrong (probably because I have). Can anyone give me some help?

  • 0
    @Eric and @Myself are correct that the expected number of nodes is 1 in each generation, but that need not always imply an infinite expected height. To take a different example, suppose that each entire generation either doubled or was wiped out, then the expected number of nodes in each generation would be 1 *ad infinitum* but the expected number of generations would be 1.2011-03-03

3 Answers 3

2

Hint: These problems can be done by what is known in probability as the first step analysis. Assume that the expected length is $L$. Take one step and try to come up with a relation involving $L$ and solve for $L$.

If you find first step analysis difficult, think along the lines of what the probability that the length of the tree is $L$ and then use this to find expectation. For instance, the probability the length is $0$ is $\frac{1}{2}$, the probability the length is $1$ is $\frac{1}{2} \times \frac{1}{4}$ and so on. Try to find a pattern and sum it up.

  • 2
    Doesn't it get much more complicated. For example the probability the tree is of height 2 is $\frac{1}{4} \times \frac{1}{4} + \frac{1}{16} \times \frac{1}{8}$, and that the tree is of height 3 is something like $\frac{1}{4} \times \frac{5}{32} + \frac{1}{16} \times \frac{7}{64} + \frac{1}{64} \times \frac{1}{32} + \frac{1}{128} \times \frac{1}{256}$. I would not say the pattern was obvious.2011-03-03
2

Here are some empirical assertions:

If the probability that the tree is of height $n$ is $P_n$ then $P_0 = 1/2$ and $P_{n+1} = P_n \left( 1 - \sqrt{2 P_n} + \frac{P_n}{2} \right).$ [This is related to OEIS A076628 where $P_n = 2 b(n+1)^2$]

Since for $n>10$, $\dfrac{1}{n^2} < P_n < \dfrac{2}{n^2}$ [asymptotically $P_n$ is $2/n^2$], the expectation of the height of the tree is infinite.

If this is homework, there is probably an easier method.

  • 0
    It comes from pattern recognition, i.e. working out the early probabilities as 0.5, 0.125, 0.0703125, 0.046417236328125 and then about 0.0333517645485699, 0.0252941656239401, 0.0199249372625329, 0.0161459365039410. The first few are $1/2^1$, $1/2^3$, $9/2^7$, $1521/2^{15}$ where the denominators are $2^{2^{n+1}-1}$ so the next could be $71622369/2^{31}$. The numerators are squares with square roots 1, 1, 3, 39, 8463. Looking at OEIS gives a suggestion which is consistent with the next numbers, and a recurrence $b_{n+1}=b_n-b_n^2$ where $P_n=2b_{n+1}^2$. So I replaced $b$ by $P$. "Empirical"2011-03-04
1

With probability 1/2, a random binary tree consists only of the root node. Otherwise, it consists of two branches of height 1 with independent random binary subtrees hanging from each of them.

Let $H$ denote the height of the random binary tree and set $h(n):=\mathbb{P}(H>n)$, the probability that the tree's height exceeds $n$. The tree's height exceeds $n+1$ if it has more than the root node, and at least one of the subtrees has height exceeding $n$. It follows that $h(0)={1\over2},\qquad h(n+1)={1\over 2}-{1\over 2}[1-h(n)]^2,\quad n\geq 0.\tag1$

Now, $h(n)$ is decreasing and by equation (1) converges to some root of $s={1\over 2}-{1\over 2}(1-s)^2$. In other words, $h(n)\to 0$ so the tree has finite length: $\mathbb{P}(H<\infty)=1$.

To investigate $\mathbb{E}(H)$, let's rewrite equation (1) as ${1\over h(n+1)}- {1\over h(n)}={1\over 2}+{h(n)\over2(2-h(n))}.\tag2$ Adding these increments and dividing by $N$ gives ${1\over Nh(N)}- {1\over Nh(0)}={1\over 2}+{1\over N}\sum_{n=0}^{N-1} {h(n)\over2(2-h(n))}.\tag3$ Letting $N\to\infty$ in (3) shows that $Nh(N)\to 2$. In particular, $h(N)>1/N$ for large $N$ and hence
$\mathbb{E}(H)=\sum_{N=0}^\infty \mathbb{P}(H>N)=\infty.$

  • 0
    @Henry $P_n=h(n-1)-h(n)$ actually, but yeah, the two approaches are pretty similar.2012-09-12