Next: Fast Implementation Time Complexity
Up: Phase I
Previous: Phase I
Fast Implementation Model
As explained in the previous chapter, Phase I of the Hu-Tucker
algorithm combines compatible nodes from the initial sequence of
weights until only one node is left. Two or three Huffman
sequences could be merged as a result of the combination. The
implementation model that I will build was suggested by Knuth
in [Knu73b], page 444.
Let the initial sequence contain N terminal nodes.
Two abstract data structures will be needed to describe the
engine of the algorithm in
Phase I: an array and a priority queue.
The engine components are: two arrays T[N] and A[N];
one Master Priority Queue (MPQ);
and N-1 Huffman Sequence Priority Queues, also referred
as Huffman Priority Queues (HPQ).
T[N] is an array holding the initial sequence of terminal nodes.
Each element of the array T[N] must at least keep:
- The letter of the alphabet.
- The type of the element: terminal or internal node.
- The weight3.2 of the letter.
- Each terminal node should also know the two priority queues in which
it participates, i.e., each terminal node should be able to find in O(1)
time those elements of the MPQ,
which refer to priority queues holding the two Huffman sequences
which are delimited by this terminal node. See Figure 3.1.
The second array, A[N], represents the current working sequence.
Each element of A[N] holds the address of the corresponding
node of the working sequence.
Initially all the nodes in the working sequence are terminal nodes.
Thus the elements of A[N] hold the corresponding addresses of the
terminal nodes.
As the Combination phase is in progress A[N] will point to
both terminal and internal nodes. The number of nodes in
the working sequence steadily decreases. Thus some of the
elements of A[N] will become null.
Initially the relationship between A[N] and T[N] is as follows:
;
For example, assume that the weight distribution of the
nodes in the initial sequence is such that the
first combination that ought to be performed
is the pair T[1] and T[2].
After the combination
the working sequence has changed: the terminal nodes T[1] and T[2]no longer participate in, thus the array A[N] has changed its state:
,
I[1] - new node;
A[2] = null.
The rest of the elements of the array remain unchanged.
See Figure 3.2.
Figure 3.1:
The engine of the Hu-Tucker algorithm for Phase I. Initial State.
 |
The HPQs hold the elements of the Huffman
sequences at any stage of the Phase I of the algorithm.
Each individual priority queue represents one Huffman sequence.
Initially there are N-1 HPQs, and each of them
holds two adjacent terminal nodes from the initial sequence.
Let n denotes the number of nodes in the queue.
The priority queue data structure
representing the HPQ must be able to perform the following operations:
- Report in
time the two least elements of the priority queue and
their positions in the working sequence.
- It must be able to find in
time the two terminal nodes,
which define the boundary of the Huffman sequence it points to.
- Delete the minimum and arbitrary element of the priority
queue in
time.
- Merge two HPQs in
time.
The MPQ has as many elements as the current number of Huffman sequences,
i.e., the number of the HPQs.
The MPQ is responsible to choose a compatible pair from the current
working sequence according to the rules stated in Section 2.2.1.
Each element of the MPQ must at least hold the sum of the
two least elements from each individual HPQ, and their
positions3.3, designated as
MinSum, i and j,
on Figures 3.1
and 3.2.
Each element of the MPQ has a reference to the
root of the HPQ it represents.
The elements of the MPQ should not change their physical
locations, since each terminal node from the initial sequence
has a reference to those elements of the MPQ, that
are representing the Huffman sequences to which the terminal
node participate.
If a terminal node is to be combined with another node, the two Huffman
sequences it belongs to will be merged, thus those elements of the
MPQ must be deleted, and the corresponding priority queues merged.
The merge will generate a new Huffman sequence.
The node representing it will be inserted to the MPQ.
All the priority queue operations must be performed in at most O(1) or
time to achieve a reasonable performance.
Figure 3.2:
The state of the engine after combining nodes T[1] and T[2].
 |
To summarize the MPQ must provide the following operations:
- 1.
- Insert an element to the queue in
time.
- 2.
- Delete the minimum element of the queue in O(1) time.
- 3.
- Delete an arbitrary element of the queue in
time.
- 4.
- The physical locations of the priority queue elements should not change.
Figures 3.1 and 3.2 present the engine of
the algorithm in Phase I. For simplicity the author has assumed
that the first three weights of the initial sequence are the smallest.
Thus the HPQ representing the Huffman sequence
(w1,w2) has
the smallest combined sum and as a result the MPQ's element
representing it is on the top of the MPQ.
Figure 3.2 illustrates the state after the
first combination is performed. Note the change of the state of all
components of the engine after the combination.
Figure 3.3 represents an example of Phase I of the
Hu-Tucker algorithm applied on the following sequence of weights:
10, 2, 2, 1, 1, 4, 15, 17, 25, after performing three combinations.
Following is a detailed description of the first three iterations of the
algorithm. The MPQ has selected the pair of nodes T[4] and T[5] to be
combined first. The internal node I[1] is produced as a result of the
combination. Also, three priority queues are merged into one and
the duplicate terminal nodes participating are deleted.
In detail:
(T[3], T[4]), (T[4],T[5]), and
(T[5], T[6])
priority queues have to be merged.
The elements T[4] and T[5] are deleted from all three priority
queues prior to the merge. After the clean up followed by the merge
the new priority queue produced has the elements
(T[3], I[2], T[6]), where
I[2]=combine(T[4], T[5]).
The MPQ is updated: three elements are deleted and one is inserted.
The state of the array A[N] has changed as well: A[4] points to
I[1] and A[5] is null. At the second iteration of the algorithm the
terminal nodes T[2] and T[3] are selected.
The combination produces internal node I[2].
A[2] is redirected to point to I[2], and A[3] is null.
Again, two terminal nodes are combined and as a result three priority queues
have to be merged into one. Clean up of the duplicate nodes is
performed. Three elements of the MPQ are deleted and one is inserted.
At the third step the internal node I[1] is combined with
the terminal node T[6]. Two priority queues are merged into one:
(T[1], I[2], I[3], T[6]) and
(T[6], T[7]).
Terminal node T[6] is deleted from both of the priority
queues prior to the merge. The new internal node I[3]
is inserted to the new priority queue.
The MPQ and the array A[N] are updated.
In total N-1=9-1=8 combination steps are performed,
and only one element is left in the working sequence
which is the root of the tree. Thus only the first element
of the array A[N] is not null.
The tree is built and the Combination phase has terminated.
Figure 3.4 illustrates the engine of the algorithm
in its final state.
Figure 3.3:
The engine of the Hu-Tucker algorithm in progress.
 |
Figure 3.4:
Completed Combination phase.
 |
Hu-Tucker Algorithm Phase I, Combination
Initialization
Populate the array T[N] and initialize A[N]
Create and initialize (N-1) HPQs and the MPQ
End:Initialization
Main Loop
for (
++)
Extract_Min(MPQ): get the two nodes of the working sequence, which
are on the top of MPQ and will have to be combined.
Delete those two nodes from all HPQs where they participate
Merge all those HPQs into a new
,
if necessary
Update the MPQ: delete the merged HPQs from MPQ
Combine the two nodes into a new node
Update the array A[N]
Insert the new node into the
Insert the
to MPQ
End: Main Loop
End: Phase I Completed
Footnotes
- ... weight3.2
- The weight is a representation of the frequency of
usage of the letter in the alphabet.
- ...
positions3.3
- This information is needed to resolve ties.
See Section 2.2.1.
Next: Fast Implementation Time Complexity
Up: Phase I
Previous: Phase I
Sashka Davis;961;icsg6;
1999-01-14