next up previous contents
Next: Conclusions Up: The Fast Implementation Builds Previous: Comparison of Different Compression

   
Experimental Comparison of the Cost of the Huffman and the Hu-Tucker Trees

This section will compare the compression ratios of the two algorithms. Random number input sequences are generated and fed to the engine of the two algorithms. The cost of the OABST, generated by the Hu-Tucker algorithm and the OBT, generated by the Huffman is averaged by the sum of the weights in the input sequence.

For every size of input sequences 5 different random inputs are examined and the compression ratio is averaged, for the data presented in Table 4.15 and Figure 4.14. Data sets in the range of 1000 to 99000 nodes are examined with increment of 1000 nodes.

The experimental data shown in Table 4.16 and Figure 4.17 has only one data set per input size examined due to the length of the experiment.

As can be seen from the tables in Figure 4.15 and Figure 4.15, the difference in the compression ratio, expressed as bits per word is really negligible. For the range of input sets between 1000 and 99000 nodes the minimum and maximum differences are 0.0731 and 0.0947respectively. For input data sets between 100000 and 1000000 it is between 0.0901 and 0.0943. The difference in the average length of the encoding word stabilizes around 0.09 bits per word. It can be seen from the graphs that it does not exhibit any tendency to grow or shrink. This gives us enough confidence to expect that for any larger input data sets the difference observed will be close to 0.09 bits per word.

The Huffman algorithm has very simple implementation in comparison with the Hu-Tucker algorithm. However the tree built by the Huffman algorithm can only be used for encoding and additional hashing mechanism is needed for the decoding process, especially for alphabets containing millions of nodes.


  
Figure 4.14: Comparing compression ratios of the Hu-Tucker and the Huffman algorithms.

\includegraphics{Huff-HT_bits.eps}


  
Figure 4.15: Compression ratio measured in bits per word for random sequences in the range of 1000 to 99000 nodes.

\begin{figure}
\scriptsize
\begin{tabular}{\vert\vert l\vert l\vert l\vert l\ver...
....4328 & 15.3507 & 99,000 & 16.4288 & 16.3364 \\ \hline
\end{tabular}\end{figure}


  
Figure 4.16: Comparison of compression ratios of Hu-Tucker and Huffman algorithms for very large data sets.

\begin{figure}
\begin{tabular}{\vert\vert l\vert l\vert l\vert\vert} \hline \hli...
....5171 \\ \hline
1000000 & 19.766 & 19.6727 \\ \hline
\end{tabular}
\end{figure}


  
Figure 4.17: Comparison of the cost of the OABST and OBT for very large data sets.

\includegraphics{Huff-HT_big.eps}


next up previous contents
Next: Conclusions Up: The Fast Implementation Builds Previous: Comparison of Different Compression
Sashka Davis;961;icsg6;
1999-01-14