Package step7

Step 7: Traversing Trees.

See:
          Description

Interface Summary
expr.yyActions interface between grammar and actions.
expr.yyInput must be implemented by a scanner object to supply input to the parser.
 

Class Summary
eval evaluate expr tree with integer arithmetic.
expr representing an arithmetic expression as a very simple tree.
expr.yyLex  
expr.yyTree tree factory.
expr.yyTree.Add  
expr.yyTree.Div  
expr.yyTree.Minus  
expr.yyTree.Mod  
expr.yyTree.Mul  
expr.yyTree.Sub  
postfix output expr tree in postfix notation.
stack evaluate expr tree with integer arithmetic.
traverse output expr tree in postfix notation.
 

Exception Summary
expr.yyException thrown for irrecoverable syntax errors and stack overflow.
 

Package step7 Description

Step 7: Traversing Trees.

Postfix Notation

Using pj it is easy to implement a frontend which creates a tree from a source language program. Further processing of such a program amounts to manipulating the tree. Usually this will require one or more tree traversals.

As an example consider expr.pj, a frontend which represents the usual arithmetic expressions using classes such as expr.yyTree.Add based on Node for branch nodes and Integer for leaves. Processing with pj results in a frontend that can display an expression as a nested tree:

$ mkdir step7
$ java -jar pj.jar -jay -scanner:static - -tree:static - expr.pj > step7/expr.java
$ javac step6/Node.java step7/expr.java
$ java step7.expr
- 10 + 20 * 30 / (40 % 7)
Add
  Minus 10
  Div
    Mul 20 30
    Mod 40 7

Displaying an expression in postfix notation, i.e., with operators following their operands, requires a postorder traversal of the tree. Consider traverse.java:

$ javac step7/traverse.java
$ java step7.traverse
- 10 + 20 * 30 / (40 % 7)
 10 Minus 20 30 Mul 40 7 Mod Div Add

The code is relatively ugly because the function implementing the recursive traversal has to accept any kind of node and react differently depending on the class:

public void visit (Object node) {
  if (node instanceof Number)
    System.out.print(" "+node);
  else if (node instanceof List) {
    for (int n = 0; n < ((List)node).size(); ++ n)
      visit(((List)node).get(n));
    System.out.print(" "+classname(node));
  }
}

jag

jag is a preprocessor which simplifies coding tree traversals. Input to jag such as postfix.jag consists of a table of patterns and actions surrounded by a prolog and an epilog of Java code:

package step7;
import java.util.List;
import step7.expr.yyLex;

public class postfix implements jag.Visitor {
%%

List: Object Object { {$1}; {$2}; `" "+classname($0)`; }
  | Object          { {$1}; `" "+classname($0)`; };

Number:             { `" "+$0`; };

%%
  public static void main (String[] args) throws Exception {
    new postfix().visit(new expr().yyparse(new yyLex(System.in)));
  }
}
A pattern contains the classname of a root node and a list of descendant classnames; patterns and actions for the same root class are collated into a list.

Each pattern is usually followed by an action which can refer to the root node as $0 and to the descendants as $1, etc., by position. There are two shorthands: {$1} represents the result of a recursive call to the visit function for the first descendant, etc. A string in backquotes is output. The code above traverses the subtrees and then outputs the classname of the root node preceded by a blank.

Patterns are matched in such a way that inheritance acts as a wildcard mechanism: When the visit function is called for a node, it searches for the class of this node among all patterns. If the exact class cannot be found, interfaces and superclasses are searched for, etc.

Once the node class has been matched, the list of patterns for the match is searched sequentially for a list of descendant classnames which match the descendants of the given node. There is no 'best' match; the first list with the proper number of names wins as long as the node's descendants can be derived from the names.

In the example above, List and Number are wildcards to match root node classes and Object is a wildcard which will match any kind of single descendant.

postfix.jag has the same effect as the previous example:

$ java -jar jag.jar < step7/postfix.jag > step7/postfix.java
$ javac step7/postfix.java
$ java step7.postfix
- 10 + 20 * 30 / (40 % 7)
 10 Minus 20 30 Mul 40 7 Mod Div Add

Evaluation

The visit function generated by jag returns a result object which can be set by assigning to $$ in an action. eval.jag shows how this can be used to evaluate arithmetic expressions:

$ java -jar jag.jar < step7/eval.jag > step7/eval.java
$ javac step7/eval.java
$ java step7.eval
- 10 + 20 * 30 / (40 % 7)
        110

Of course, the visit function now must perform a different computation for each root node class:

package step7;
import step7.expr.yyLex;
import step7.expr.yyTree.*;

public class eval implements jag.Visitor {
%%

Add: Object Object {
    $$ = new Integer(((Number){$1}).intValue() + ((Number){$2}).intValue());
  };
       //...

Minus: Object {
    $$ = new Integer(- ((Number){$1}).intValue());
  };

Number: { $$ = ($0); };

%%
  public static void main (String[] args) throws Exception {
    System.out.println("\t"+
      new eval().visit(new expr().yyparse(new yyLex(System.in))));
  }
}

The shorthands which jag expands in actions are quite subtle:

{$1}
$1;
both express recursive calls on the visit functions. The first notation is preferred because it provides access to the result of the call. This result is an Object and will usually have to be cast to a more useful type.
$1
$0
only express access to the descendant or the root node itself. jag automatically casts these values to the types expressed in the pattern which selected the action.
$$ provides access to the result of the visit function. jag will not apply a cast.

Using State

stack.jag shows that a visitor can have a state which the recursive calls to the visit function share:

public class stack implements jag.Visitor {
  protected int result; // result of current visit()
%%

Add: Object Object {{ $2; int right = result; $1; result += right; }};
  // ...
Minus: Object { $1; result = - result; };

Number: { result = $0.intValue(); };

%%
  public static void main (String[] args) throws Exception {
    stack s = new stack();
    s.visit(new expr().yyparse(new yyLex(System.in)));
    System.out.println("\t"+s.result);
  }
}
Here, following a call to the visit function, result contains the value computed for the subtree; this avoids boxing and unboxing these values. It turns out to be more efficient to evaluate the right subtrees first -- this is easily accomplished in each action.

An action is expanded as a case in a switch statement (the line number in the source file is the case number). An extra set of braces is required if the action should contain a scope for local names.

$ java -jar jag.jar < step7/stack.jag > step7/stack.java
$ javac step7/stack.java
$ java step7.stack
- 10 + 20 * 30 / (40 % 7)
        110