Rochester Institute of Technology

Computer Science Department

HU-TUCKER ALGORITHM FOR BUILDING OPTIMAL ALPHABETIC BINARY SEARCH TREES

by

Sashka T. Davis

A thesis, submitted to

The Faculty of the Department of Computer Science,

in partial fulfillment of the requirements for the degree of

Master of Science in Computer Science.

Approved by:
Computer Science Department

HU-TUCKER ALGORITHM FOR BUILDING OPTIMAL ALPHABETIC BINARY SEARCH TREES

by

Sashka T. Davis

A thesis, submitted to

The Faculty of the Department of Computer Science,

in partial fulfillment of the requirements for the degree of

Master of Science in Computer Science.

Professor S. Radziszowski

Professor P. Anderson

Professor A. Kitchen

Professor E. Hemaspaandra

December 17, 1998

The purpose of this thesis is to study the behavior of
the Hu-Tucker algorithm for building Optimal Alphabetic Binary
Search Trees (OABST), to design an efficient implementation,
and to evaluate the performance of the algorithm, and the implementation.

The three phases of the algorithm are described and their time complexities evaluated. Two separate implementations for the most expensive phase, Combination, are presented achieving*O*(*n*^{2}) and
time and *O*(*n*) space complexity.
The break even point between them is experimentally established
and the complexities of the implementations are compared against their
theoretical time complexities.

The electronic version of ``The Complete Works of William Shakespeare'' is compressed using the Hu-Tucker algorithm and other popular compression algorithms to compare the performance of the different techniques.

The experiments justified the price that has to be paid to implement the Hu-Tucker algorithm. It is shown that an efficient implementation can process extremely large data sets relatively fast and can achieve optimality close to the Optimal Binary Tree, built using the Huffman algorithm, however the OABST can be used in both encoding and decoding processes, unlike the OBT where an additional mapping mechanism is needed for the decoding phase.

The three phases of the algorithm are described and their time complexities evaluated. Two separate implementations for the most expensive phase, Combination, are presented achieving

The electronic version of ``The Complete Works of William Shakespeare'' is compressed using the Hu-Tucker algorithm and other popular compression algorithms to compare the performance of the different techniques.

The experiments justified the price that has to be paid to implement the Hu-Tucker algorithm. It is shown that an efficient implementation can process extremely large data sets relatively fast and can achieve optimality close to the Optimal Binary Tree, built using the Huffman algorithm, however the OABST can be used in both encoding and decoding processes, unlike the OBT where an additional mapping mechanism is needed for the decoding phase.

- Contents
- Optimal Topological Bifurcating Arborescences. Definitions, Algorithms, and History
- Hu-Tucker Algorithm
- Implementation Models and Complexity Evaluation
- The Fast Implementation Builds OABST for 2,000,000 nodes!
- Break Even Point
- LMCP Implementation vs. Fast Implementation Compared for Large Data Sets.
- Time Complexity of All Phases of the Hu-Tucker Algorithm
- Experimental Time Complexity vs. Theoretical Time Complexity
- Example Book Encoding
- Experimental Comparison of the Cost of the Huffman and the Hu-Tucker Trees
- Conclusions

- Hu-Tucker Algorithm, Phase I, Combination
- Building OABST using the Stack Algorithm
- Leftist Heap Operations
- Bibliography
- About this document ...