Copyright RIT 2009
$Id: writeup.xml,v 1.4 2009/04/27 21:16:26 vcss233 Exp $

The Heapsort
Source: Heap sort animation (http://de.wikipedia.org/wiki/Image:Sorting_heapsort_anim.gif)
In this lab you will learn how an array can be used to store a binary tree of data for relatively fast sorting.
Review lecture notes by clicking here (http://www.cs.rit.edu/~vcss23x/Lecture/Week 25 - Sorting/notes.html)
Read Goodrich and Tamassia, Chapter 11.
Check out this heapsort applet that tests your understanding of the heapify process here (http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/heap/heapen.htm)
Answer the questions found throughout this section by editing the file act-questions.html. Submit it as part of one of the activities (see the submission instructions).
Heapsort is an in-place, comparison based sorting algorithm. The data to be sorted can be viewed as a binary tree with the following properties:

The Heapform
The heap structure you will build is called a binary max heap. This means a node can have at most 2 children, and the data in the parent node must be greater than or equal to all of its descendant nodes. There is no relationship between the data values of the siblings of any parent (i.e. the right child does not have to be greater/less than the left child). The following diagram illustrates a few examples of invalid max heap structures:

Invalid Heaps
Since the data is initially unordered and most likely violates the rules for a valid max heap, it must be heapified. The heapify process starts at the last parent node and it continually sifts down the smallest descendant at each level. If a child has a larger value than the parent, the largest child value is swapped with the parent. That node is then sifted down again until no sift occurs, or there are no more children. Each parent node is sifted in the same manner from the last until the root, at which point the heap structure is valid:

Heapifying the data
The algorithm for sifting down the data requires three things, the array of data, and the start and ending indices in the array to sift between. It works as follows:
|
The heapify routine uses siftDown
to generate a valid max heap where
the largest value will be in the root of the tree, and every parent
will have a value greater than or equal to its descendants:
|
Once the heap structure has been built. The largest element in the collection resides in the root node. It is swapped with the last leaf node so that it resides in the last position in the array. The heap is then resifted from the root, excluding the last position. This process is repeated until all of the nodes except the root have been excluded. At this point, the data is completely sorted:

Heapsort (part1)

Heapsort (part2)
The sort routine first calls heapify
to generate a valid max heap structure. It then
uses siftDown to
perform the in-place sort. The trick is that once the largest element
is stored in its final poisition, the end marker is decremented
so that it will remain untouched for the remainder of the sorting:
|
For the first activity you will be answering questions about the heap sort algorithm. You will be required to make diagrams, on paper, that you will hand in to your instructor before continuing on with the lab.
Assume you have a heap array of 8 integers that have just been
heapified:
[15, 11, 8, 3, 5, 4, 2, 1].
The root element has been moved to its final position,
and it is time to reheapify the array:
[1, 11, 8, 3, 5, 4, 2, 15].
What does the resulting array look like after the new root value
has been sifted down in the array?
Assume you have the following array of three integers: [4, 5, 2].
Show the heap structure that results from heapifying the data. To
receive full credit you should show all your work, not just the final
form.
Using the heapified array from the previous question, show the heap array at each step in the sorting process before the largest element is swapped into its final position and the new root is sifted down. The final step should have the fully sorted data.
Assume you have the following array of 7 integers:
[11, 5, 16, 23, 19, 1, 25].
Show the heap structure that results from heapifying the data. To
receive full credit you should show all your work, not just the final
form.
Using the heapified array from the previous question, show the heap array at each step in the sorting process before the largest element is swapped into its final position and the new root is sifted down. The final step should have the fully sorted data.
Show your work to your instructor and make sure
that everything checks out ok.
Put your text based answers into the
act-questions.txt file and submit.
It is ok that you haven't completed all the questions in this
file, you will be resubmitting at the end of the last activity.
|
In this activity you will implement the heapsort routines, and you will write a main test routine that demonstrates it sorts correctly. Take a moment to familiarize yourself with the HeapSort class.
It goes without saying that you are not allowed to use any of the
pre-supplied sorting routines, (i.e. Collections.sort()).
However, since you will be using an ArrayList
to perform the in-place sorting, there is a very useful routine that
you are allowed to use
that swaps the elements at two locations:
Collections.swap(java.util.List, int, int).
To begin with,
you should write a main routine that creates an ArrayList
of unsorted integers that calls the private siftDown routine.
Use your answer from question 1.1 to verify that routine sifts nodes
down correctly before continuing.
Next, implement the private heapify routine and verify that you get the same
results as your answers for questions 1.2 and 1.4. You should have a valid max
heap where each parent node has a value greater than its descendants.
Finally, implement the sort routine and verify that it
heapifies and sorts the
data. For debugging purposes you can insert a statement at the beginning
of the while loop in sort that prints out the array each time
before the root node is swapped and sifted. This should yield a result
that is similar to your answers for questions 1.3 and 1.5. Don't forget
to comment out this debug statement for your final submission, as we will
be testing your sorting routines with our own test programs.
When you are sure that your heap sort is working properly and that your test program sufficiently tests all features of your implementation, submit your work by executing the following command:
|
Now that you are comfortable with the heapsort algorithm and its implementation, answer the following questions regarding its complexity.
Once a root element has been put in its final position, how much time does it take
to re-heapify the structure so that the next removal can take place?
In other words, what is the time complexity of a single invocation
of siftDown assuming a heap structure of size N?
In your estimation, what is the time complexity of
the heapify routine that uses the
siftDown method
(using big-O notation)?
Assume the method is being called with the entire
range of an array of size N. Explain
why you think this is so. Think carefully
about this question before answering. For example,
is it true that N is constant across all sifts
or does it change?
In general, do you think there are any data sets
of equal size for which heapsort performs better
or worse? For example, quicksort was shown to
be O(N2) if the pivot was poorly chosen
each time. Explain your answer.
What do you think the overall time complexity for heapsort is? Why do you feel this way?
When you are ready to submit your answers, using the following command:
|
Grade Breakdown: