4.7 Implementation with a search tree

Arrays, open hash tables and lists always require linear searches.

Java 2 declares the interface Comparable with the method compareTo() which should be antisymmetrical and compatible with equals() and reveal if the receiver is considered less than, equal, or greater than the argument.

adt.tree.Set stores a Comparable value inserted with add() in an Element which recursively references smaller and larger values which are also stored in Element objects:


adt/tree/Set.java
// adt/tree/Set.java
package adt.tree;
import adt.Container;
import adt.Visitable;
import adt.Visitor;
/** binary tree based container for <tt>Comparable</tt> with unique insertion.
 */
public class Set implements Container, Visitable {
 protected Element[] sub = { null };
 /** element structure, <b>not</b> limited to <tt>Comparable</tt>.
   */
 protected static class Element extends Set {
   public final Object info;
   /** create.
     */
   public Element (Object info) { this(info, null, null); }
   /** create and link.
     */
   public Element (Object info, Element left, Element right) {
     this.info = info; sub = new Element[] { left, right };
   }
   /** let this disappear.
       @return combined subtrees as a tree.
     */
   public Element unroot () {
     if (sub[0] == null) return sub[1];        // nothing at left... right
     if (sub[1] == null) return sub[0];        // nothing at right... left
     Element e = sub[0];                       // else start at left...
     while (e.sub[1] != null) e = e.sub[1];    // ...move to it's bottom right
     e.sub[1] = sub[1];                        // ...and attach my right there
     return sub[0];                            // ...all is below left
   }
   /** adjust count by current element.
     */
   public int count () { return super.count()+1; }
   /** permit symbolic dump.
     */
   public String toString () { return info+""; }
   /** receive a visitor.
     */
   public boolean visit (Visitor v) {
     return visit(0, v) && v.visit(info) && visit(1, v);
   }
 }
 


Again Element is hidden as a static nested class. It extends Set and therefore inherits the instance variable sub which is an array to reference one or more Element objects.

A Set references one Element with sub , namely the complete search tree. An Element references two Element objects with sub , namely the search tree with the smaller and the tree with the larger values.

Comparable is not the only means of comparison; therefore, an Element references an Object and not just a Comparable as a value — a hook for extensions.

Just as before, an Element can be constructed as a leaf without or as a root with two subtrees.

unroot() is the only tricky method: an Element unhooks itself as a root and returns it's subtrees connected in such a way that the compareTo() relation is preserved. The right subtree is attached at the bottom right of the left subtree, i.e., following it's maximum.

count() employs super and is mostly implemented in Set but it has to count the Element itself because Set must tally all subtrees:


adt/tree/Set.java
  /** inefficient, unless the container is empty.
     @return number of distinct objects in the container;
             zero, if the container is empty.
   */
 public int count () {
   int result = 0;
   for (int n = 0; n < sub.length; ++ n)
     if (sub[n] != null) result += sub[n].count();
   return result;
 }
 /** insert an object into the container.
     @throws ClassCastException if x is not <tt>Comparable</tt>.
   */
 public Object add (Object x) throws ClassCastException {
   return locate(x).add(x);
 }
 /** locate an object in the container.
     @throws RuntimeException if not found.
     @throws ClassCastException if x is not <tt>Comparable</tt>.
   */
 public Object find (Object x) throws ClassCastException {
   return locate(x).find();
 }
 /** remove an object from the container.
     @throws RuntimeException if not found.
     @throws ClassCastException if x is not <tt>Comparable</tt>.
   */
 public Object sub (Object x) throws ClassCastException {
   return locate(x).sub();
 }
 


The other Container methods again employ locate() to find an Element or room for a new value in the tree.

locate() returns an object that can be asked to insert, find, or remove:


adt/tree/Set.java
  /** operations on comparison state of locate().
   */
 protected interface Result {
   Object add (Object x);
   Object find ();
   Object sub ();
 }
 /** locate an object in the container.
     @param x object to be found.
     @return comparison state for further processing.
   */
 protected Result locate (Object x) throws ClassCastException {
   Comparable info = (Comparable)x;        // can still be null
   Set s = this; int at = 0;
   for (;;) {
     if (s.sub[at] == null) return s.new NotFound(at);
                // cannot involve null in compareTo...
     int c;		// ...make null less than anything					
     if (s.sub[at].info == null) c = info == null ? 0 : 1;
     else c = info == null ? -1 : info.compareTo(s.sub[at].info);
      if (c == 0) return s.new Found(at);
     s = s.sub[at]; at = c < 0 ? 0 : 1;
   }
 }
 


locate() starts with s at the Set and with the index at set to 0 . The argument has to be Comparable .

If sub[at] is itself null the required value does not exist, but it would belong in this place.

Otherwise, if the desired value is not null it has to be compared to the information at sub[at] . If it is null , the information could be null , too, in which case the value was found. If not, null is considered smaller than all values.

How the search continues depends on the result of the comparison. Either the value matches and the position for the value has thus been found, or the search continues in the lower Element in the left or righht subtree.

locate() collects interesting informations. A value is found at s.sub[at] or it should be stored there.

The context of the call to locate() decides what to do with this information. It is easy to add() if an empty place is found where the value should be. find() is happy, if the value is actually there, and sub() has to react differently in both cases.

This can all be untangled using a nest of if statements, but it is a lot simpler if locate() returns a Result object that can be asked to complete any of the three operations:


adt/tree/Set.java
  /** this.sub[at] is null and should be Element(x).
   */
 protected class NotFound implements Result {
   protected final int at;
   public NotFound (int at) { this.at = at; }
   public Object add (Object x) { sub[at] = new Element(x); return x; }
   public Object find () { throw new RuntimeException("not found"); }
   public Object sub () { throw new RuntimeException("not found"); }
 }
 /** this.sub[at].info is x.
   */
 protected class Found implements Result {
   protected final int at;
   public Found (int at) { this.at = at; }
   public Object add (Object x) { return sub[at].info; }
   public Object find () { return sub[at].info; }
   public Object sub () {
     Object result = sub[at].info; sub[at] = sub[at].unroot(); return result;
   }
 }
 


These are called member classes ; just like instance variables they are not marked with static .

As such they have access to the instance variables of the object — here it is s — for which they were created with new . An object of a member class implicitly references it's creator. If nothing is specified preceding new , the creator is this — this is more frequent than the use demonstrated here.

Due to it's construction a NotFound or Found object knows the Element or Set s , that was discovered in the search and thus has access to it's sub and only has to memorize the appropriate index at .

Without further analysis the continuation of each operation can now be implemented, one at a time. The only difficult problem is to remove a value, but this is quite simple by using unroot() in Element .

Top-level nested classes like Element are less expensive as member classes like NotFound or Found because they do not reference the creator. One should use member classes exactly if this implicit reference — which cannot be destroyed accidentally — is used. A tree or list Element should not be made a member class.

A construction such as the anonymous objects in adt.Test or the NotFound and Found objects in adt.tree.Set is sometimes called a closure : The construction freezes a particular situation which is later elaborated upon by a message to such an object.