Suppose we have an optimal prefix-free code on a set $C = \{0, 1, \dots , n − 1\}$ of characters and we wish to transmit this code using as few bits as possible. How to represent any optimal prefix-free code on $C$ using only $2n − 1 + n\lceil\log n\rceil$ bits.
My approach:
Begin with $n$ trees, each consists of a single node corresponding to one message word, with the weight of $p(i)$
Repeat until there is only one tree. pick two subtrees with smallest weights combine them by adding a new node as root, and make the two trees its children. The weight of the new tree is the sum of the weight of two subtrees With a heap, each step of combining tree takes $O(\log n)$ time, and the total time is $O(n \log n)$.
Is there any solid proof?