Package step10

Step 10: Type Checking.

See:
          Description

Interface Summary
mini.yyActions interface between grammar and actions.
mini.yyInput must be implemented by a scanner object to supply input to the parser.
 

Class Summary
mini a little lexically scoped language to study the typing of variables.
mini.yyLex  
mini.yyTree tree factory.
mini.yyTree.Add  
mini.yyTree.And  
mini.yyTree.AndIf  
mini.yyTree.Begin  
mini.yyTree.Call  
mini.yyTree.Deref  
mini.yyTree.Div  
mini.yyTree.Eq  
mini.yyTree.Ge  
mini.yyTree.Gt  
mini.yyTree.If  
mini.yyTree.Le  
mini.yyTree.Let  
mini.yyTree.Letrec  
mini.yyTree.Lit  
mini.yyTree.Lsh  
mini.yyTree.Lt  
mini.yyTree.Minus  
mini.yyTree.Mod  
mini.yyTree.Mul  
mini.yyTree.Ne  
mini.yyTree.Neg  
mini.yyTree.Not  
mini.yyTree.Or  
mini.yyTree.OrIf  
mini.yyTree.Param  
mini.yyTree.Proc  
mini.yyTree.Rsh  
mini.yyTree.Sub  
mini.yyTree.Type  
mini.yyTree.TypeList  
mini.yyTree.Var  
mini.yyTree.Xor  
type strongly type-checks and rewrites mini.
type.BAnd and of two type.BooleanT trees.
type.BAndIf and-if of two type.BooleanT trees.
type.BBegin block ending in type.BooleanT tree.
type.BCall call with Boolean result.
type.BDeref deref String to Boolean.
type.BEq equal of two type.BooleanT trees.
type.BIf if with type.BooleanT alternatives.
type.BLet let with type.BooleanT body.
type.BLetrec letrec with type.BooleanT body.
type.BNe not-equal of two type.BooleanT trees.
type.BNot negate a type.BooleanT tree.
type.BooleanT base class for boolean operations, itself used to type an Boolean leaf.
type.BOr or of two type.BooleanT trees.
type.BOrIf or-if of two type.BooleanT trees.
type.BVar var with type.BooleanT initializer.
type.BXor xor of two type.BooleanT trees.
type.DAdd add two type.DoubleT trees.
type.DBegin block ending in type.DoubleT tree.
type.DCall call with Double result.
type.DDeref deref String to Double.
type.DDiv divide two type.DoubleT trees.
type.DEq equal of two type.DoubleT trees.
type.DGe greater-equal of two type.DoubleT trees.
type.DGt greater-then of two type.DoubleT trees.
type.DIf if with type.DoubleT alternatives.
type.DLe less-equal of two type.DoubleT trees.
type.DLet let with type.DoubleT body.
type.DLetrec letrec with type.DoubleT body.
type.DLt less-then of two type.DoubleT trees.
type.DMinus sign-change an type.DoubleT tree.
type.DMul multiply two type.DoubleT trees.
type.DNe not-equal of two type.DoubleT trees.
type.DoubleT base class for double operations, itself used to type a Double leaf.
type.DSub subtract two type.DoubleT trees.
type.DVar var with type.DoubleT initializer.
type.Env environment: block-structured map of keys to values.
type.I2D convert type.IntegerT to type.DoubleT.
type.IAdd add two type.IntegerT trees.
type.IAnd bit-and two type.IntegerT trees.
type.IBegin block ending in type.IntegerT tree.
type.ICall call with Integer result.
type.IDeref deref String to Integer.
type.IDiv divide two type.IntegerT trees.
type.IEq equal of two type.IntegerT trees.
type.IGe greater-equal of two type.IntegerT trees.
type.IGt greater-then of two type.IntegerT trees.
type.IIf if with type.IntegerT alternatives.
type.ILe less-equal of two type.IntegerT trees.
type.ILet let with type.IntegerT body.
type.ILetrec letrec with type.IntegerT body.
type.ILsh left shift two type.IntegerT trees.
type.ILt less-then of two type.IntegerT trees.
type.IMinus sign-change an type.IntegerT tree.
type.IMod remainder of two type.IntegerT trees.
type.IMul multiply two type.IntegerT trees.
type.INe not-equal of two type.IntegerT trees.
type.INeg bit-complement an type.IntegerT tree.
type.IntegerT base class for integer operations, itself used to type an Integer leaf.
type.IOr bit-or two type.IntegerT trees.
type.IRsh right shift two type.IntegerT trees.
type.ISub subtract two type.IntegerT trees.
type.IVar var with type.IntegerT initializer.
type.IXor bit-xor two type.IntegerT trees.
type.PBegin block ending in type.ProcT tree.
type.PCall call with type.ProcT result.
type.PDeref deref String to type.ProcT.
type.PIf if with type.ProcT alternatives.
type.PLet let with type.ProcT body.
type.PLetrec letrec with type.ProcT body.
type.ProcT base class for procedures, itself used to represent a mini.yyTree.Proc.
type.PVar var with type.ProcT initializer.
type.SAdd concatenate two type.StringT trees.
type.SBegin block ending in type.StringT tree.
type.SCall call with String result.
type.SDeref deref String to String.
type.SEq equal of two type.StringT trees.
type.SGe greater-equal of two type.StringT trees.
type.SGt greater-then of two type.StringT trees.
type.SIf if with type.StringT alternatives.
type.SLe less-equal of two type.StringT trees.
type.SLet let with type.StringT body.
type.SLetrec letrec with type.StringT body.
type.SLt less-then of two type.StringT trees.
type.SNe not-equal of two type.StringT trees.
type.StringT base class for string operations, itself used to type a String leaf.
type.SVar var with type.StringT initializer.
type.T base class for typed branch nodes.
 

Exception Summary
mini.yyException thrown for irrecoverable syntax errors and stack overflow.
 

Package step10 Description

Step 10: Type Checking.

Setup

The objective of type checking is to ensure (prior to execution) that a program will not apply an operation to data for which the operation is not intended. This step combines algorithms from Friedman et al., Essentials of Programming Languages with the technique of typing by rewriting from step 9.

mini.pj is a frontend which recognizes a very small Scheme-inspired language with boolean, floating point, integer, and string literals, recursive procedure values, infix expressions, a selection control structure, and lexical scoping. euclid.txt and even.txt are typical programs.

$ mkdir step10
$ java -jar pj.jar -jay -scanner:static - -tree:static - mini.pj > step10/mini.java
$ javac step10/mini.java
$ java step10.mini
let x=2 in if true then 1 else x
[[[x, [2]]], [[true], [1], [x]]]

type.jag is a visitor which performs strong type checking and rewrites the tree with typed branch nodes, e.g., mini.yyTree.If is replaced by type.IIf. The visitor is a tree factory and just as in step 9 a simple awk program expands a table into the factory methods and classes for the typed tree.

$ java -jar jag.jar < step10/type.jag > step10/type.java
$ javac step10/type.java
$ java step10.type
let x=2 in if true then 1 else x
generic    typed

Let
  ArrayList
    Var
      String x
      Lit
        Integer 2
  If
    Lit
      Boolean true
    Lit
      Integer 1
    Deref
      String x
ILet
  ArrayList
    IVar
      String x
      IntegerT
        Integer 2
  IIf
    BooleanT
      Boolean true
    IntegerT
      Integer 1
    IDeref
      String x

Constant Expressions

type.jag rewrites constant expressions and inserts conversion nodes if required, just as it was done in step 9.

$ java step10.type
"high" < "low" || 1. + 2 == 3
generic    typed

OrIf
  Lt
    Lit
      String "high"
    Lit
      String "low"
  Eq
    Add
      Lit
        Double 1.0

      Lit
        Integer 2

    Lit
      Integer 3
BOrIf
  SLt
    StringT
      String "high"
    StringT
      String "low"
  DEq
    DAdd
      DoubleT
        Double 1.0
      I2D
        IntegerT
          Integer 2
    I2D
      IntegerT
        Integer 3

type.T is the base class for typed branch nodes. A literal such as 3 is represented as an Integer attached to an type.IntegerT. The latter is a branch node marking the literal as an integer value. An operation such as addition is represented by a typed branch node such as type.DAdd which marks the operation as an addition of floating point values with a floating point result.

type.DAdd is a subclass of type.DoubleT. Therefore, the result of an operation is a floating point value exactly if the branch node, e.g., a type.DAdd object, is an instance of type.DoubleT.

There are no mixed mode operations. If necessary a typed branch node such as type.I2D is inserted to represent a conversion from an integer to a double value.

Sequence

mini.pj supports a block consisting of a sequence of expressions and selective execution of one of two expressions depending on a boolean value. Additionally, there are procedure calls as discussed below.

A block is just another expression, i.e., it cannot be empty because it's value and type will be taken from the last expression in the sequence. type.jag rewrites a block accordingly; therefore, a class such as type.IBegin will represent a block where the last expression produces an integer value. An unnecessary ArrayList is silently removed.

$ java step10.type
begin true; 1.0; "two"; 3 end
generic    typed

Begin
  ArrayList
    Lit
      Boolean true
    Lit
      Double 1.0
    Lit
      String "two"
    Lit
      Integer 3
IBegin

  BooleanT
    Boolean true
  DoubleT
    Double 1.0
  StringT
    String "two"
  IntegerT
    Integer 3

Selection

mini.pj supports selective execution of one of two expressions depending on a boolean value. A selection is just another expression; again, it cannot be empty because it's value and type will be taken from whatever expression is selected. type.jag rewrites a selection accordingly; therefore, a class such as type.DBegin will represent a selection where the selected expression produces a floating point value.

$ java step10.type
if true then 1.0 else 2.0
generic    typed

If
  Lit
    Boolean true
  Lit
    Double 1.0
  Lit
    Double 2.0
DIf
  BooleanT
    Boolean true
  DoubleT
    Double 1.0
  DoubleT
    Double 2.0

Strong typing requires that the result type of a selection be known, regardless of whether the then or else part is selected. Therefore, neither part can be empty and they must produce the same type. Of course, the condition must produce a bool result.

Type comparison is best implemented by delegation. type.T.type() returns an object that is unique for a type. A class representing literals such as type.IntegerT arranges for type.T.type() to return it's class object, and all derived classes such as type.IAdd inherit this behaviour, i.e., they will also return the class object type.IntegerT. Class objects are unique and implement Object.equals(java.lang.Object) as an identity comparison; therefore, comparing the objects returned by type.T.type() using Object.equals(java.lang.Object) is a flexible solution.

Variables

mini.pj supports let in the style of Scheme: a set of values is bound to a set of names. The names are lexically scoped and can be used in the dependent expression of the let. Duplicate names are silently permitted; subsequent names supersede prior names. The value and type of the let result from the dependent expression, the value and type of each name result from the expression bound to it. type.jag rewrites let accordingly; therefore, a class such as type.IVar will represent a variable bound to an integer value and type.SLet will represent a let where the dependent expression produces a string value.

$ java step10.type
let x="hello, "; y=3 in x + "world!"
generic    typed

Let
  ArrayList
    Var
      String x
      Lit
        String "hello, "
    Var
      String y
      Lit
        Integer 3
  Add
    Deref
      String x
    Lit
      String "world!"
SLet
  ArrayList
    SVar
      String x
      StringT
        String "hello, "
    IVar
      String y
      IntegerT
        Integer 3
  SAdd
    SDeref
      String x
    StringT
      String "world!"

An environment, i.e., a stack of hashtables, is used to model lexical scoping and map names to their types:

env, the environment upon entry to let, contains all names defined in a more global scope. This environment is used to check all initializer expressions and determine their types. Therefore, the initializers cannot use the variables that are defined in this let.

A new environment, created by pushing a hashtable with the names defined in the current let on top of env, is used to check the dependent expression of the let and propagate the type as the result type of the let.

The top hashtable maps the (unique) names to the types determined for the initializers. If a name is found in the dependent expression, the top hashtable is searched first, i.e., the names from the let will be found first. Once the dependent expression has been checked, the new environment, i.e., the top hashtable, is discarded, i.e., the names in the let have a scope which is restricted exactly to the dependent expression of the let.

Names can only be used if they are in scope, i.e., by inspecting the environment mini.yyTree.Deref can be rewritten as type.IDeref to indicate that a variable is bound to an integer value.

Procedures

mini.pj supports letrec reminiscent of Scheme: a set of procedure values is bound to a set of names. All names are lexically scoped. All procedure names can be used in the dependent expression of the letrec. All procedure names and the applicable parameter names can be used in the body expression of a procedure. Duplicate names are silently permitted; subsequent names supersede prior names. The value and type of the letrec result from the dependent expression, the type of each procedure name results from the types specified in the procedure header, the type of a parameter name is specified in the procedure header. type.jag rewrites letrec accordingly; therefore, a class such as type.BLetrec will represent a letrec where the dependent expression produces a boolean value.

$ java step10.type
letrec bool eq(int a, double b) = a == b in eq(1, 1.0)
generic    typed

Letrec
  ArrayList
    Proc
      String bool
      String eq
      ArrayList
        Param
          Type
            String int
            ArrayList
          String a
        Param
          Type
            String double
            ArrayList
          String b
      ArrayList
      Eq

        Deref
          String a
        Deref
          String b
  Call
    String eq
    ArrayList
      ArrayList
        Lit
          Integer 1
        Lit
          Double 1.0
BLetrec
  ArrayList
    ProcT bool(int,double)

      String eq












      DEq
        I2D
          IDeref
            String a
        DDeref
          String b
  BCall
    String eq
    ArrayList
      ArrayList
        IntegerT
          Integer 1
        DoubleT
          Double 1.0

Type checking mostly amounts to careful building of environments:

The dependent expression of the letrec has access to an environment extended with all procedure names and their types taken from the procedure headers. A procedure type consists of the result type and the list of parameter types (if any). The type of the dependent expression is propagated as type of the letrec.

The body of a procedure has access to the same environment, further extended with the procedure's own parameter names bound to their types as declared in the header. The type returned by the procedure body must be the type which was computed as the result type of the procedure from the procedure header.

A procedure call is checked as follows: The initial name must be bound to a procedure type. The result type of this procedure determines the result of the call. The list of parameter types must match the list of argument types.

Procedure values can be used just like any other value. compose.txt demonstrates function composition and illustrates how procedure types are represented as types for parameters and how a procedure is returned as a result.

$ java step10.type < compose.txt
generic    typed

Letrec
  ArrayList
    Proc
      String double
      String compose
      ArrayList
        Param
          Type
            String double
            ArrayList
              TypeList
                ArrayList
                  Type
                    String int
                    ArrayList
          String f
        Param
          Type
            String int
            ArrayList
              TypeList
                ArrayList
                  Type
                    String string
                    ArrayList
          String g
        Param
          Type
            String string
            ArrayList
              TypeList
                null
          String h
      ArrayList
        TypeList
          ArrayList
            Type
              String string
              ArrayList
      Letrec
        ArrayList
          Proc
            String double
            String result
            ArrayList
              Param
                Type
                  String string
                  ArrayList
                String s
            ArrayList
            Call
              String f
              ArrayList
                ArrayList
                  Call
                    String g
                    ArrayList
                      ArrayList
                        Add
                          Call
                            String h
                            ArrayList
                              null
                          Deref
                            String s
        Deref
          String result
  Deref
    String compose
PLetrec double(double(int),int(string),string())(string)
  ArrayList
    ProcT double(double(int),int(string),string())(string)

      String compose


































      PLetrec double(string)
        ArrayList
          ProcT double(string)

            String result







            DCall
              String f
              ArrayList
                ArrayList
                  ICall
                    String g
                    ArrayList
                      ArrayList
                        SAdd
                          SCall
                            String h
                            ArrayList
                              null
                          SDeref
                            String s
        PDeref double(string)
          String result
  PDeref double(double(int),int(string),string())(string)
    String compose

curry.txt shows a deeper nest of procedures returning procedures and how such a nest can be called. The rewritten tree demonstrates how the procedure types are successively unraveled.

$ java step10.type < curry.txt
generic    typed

Let
  ArrayList
    Var
      String curry
      Letrec
        ArrayList
          Proc
            String string
            String u
            ArrayList
              Param
                Type
                  String int
                  ArrayList
                String i
            ArrayList
              TypeList
                ArrayList
                  Type
                    String double
                    ArrayList
              TypeList
                ArrayList
                  Type
                    String bool
                    ArrayList
              TypeList
                null
            Letrec
              ArrayList
                Proc
                  String string
                  String v
                  ArrayList
                    Param
                      Type
                        String double
                        ArrayList
                      String d
                  ArrayList
                    TypeList
                      ArrayList
                        Type
                          String bool
                          ArrayList
                    TypeList
                      null
                  Letrec
                    ArrayList
                      Proc
                        String string
                        String w
                        ArrayList
                          Param
                            Type
                              String bool
                              ArrayList
                            String b
                        ArrayList
                          TypeList
                            null
                        Letrec
                          ArrayList
                            Proc
                              String string
                              String x
                              null
                              ArrayList
                              If
                                AndIf
                                  Eq

                                    Deref
                                      String i
                                    Deref
                                      String d
                                  Deref
                                    String b
                                Lit
                                  String "true"
                                Lit
                                  String "false"
                          Deref
                            String x
                    Deref
                      String w
              Deref
                String v
        Deref
          String u
  Call
    String curry
    ArrayList
      ArrayList
        Lit
          Integer 1
      ArrayList
        Lit
          Double 2.0
      ArrayList
        Lit
          Boolean true
      null
SLet
  ArrayList
    PVar string(int)(double)(bool)()
      String curry
      PLetrec string(int)(double)(bool)()
        ArrayList
          ProcT string(int)(double)(bool)()

            String u



















            PLetrec string(double)(bool)()
              ArrayList
                ProcT string(double)(bool)()

                  String v














                  PLetrec string(bool)()
                    ArrayList
                      ProcT string(bool)()

                        String w









                        PLetrec string()
                          ArrayList
                            ProcT string()

                              String x


                              SIf
                                BAndIf
                                  DEq
                                    I2D
                                      IDeref
                                        String i
                                    DDeref
                                      String d
                                  BDeref
                                    String b
                                StringT
                                  String "true"
                                StringT
                                  String "false"
                          PDeref string()
                            String x
                    PDeref string(bool)()
                      String w
              PDeref string(double)(bool)()
                String v
        PDeref string(int)(double)(bool)()
          String u
  SCall
    String curry
    ArrayList
      ArrayList
        IntegerT
          Integer 1
      ArrayList
        DoubleT
          Double 2.0
      ArrayList
        BooleanT
          Boolean true
      null

Custom Types

Just as type.IntegerT is the base class for all operations with an integer result, type.ProcT is the base class for all operations with a procedure value as a result, e.g., a type.PCall or type.PLetrec returning a procedure.

However, there is a big difference: integers are a built-in type and the class object of type.IntegerT uniquely represents that type when type equality is checked by delegation (see Selection). Procedure values, on the other hand, belong to user-defined types and objects have to be created to represent each such type. Procedure types are equal if they have equal return and parameter types.

type.ProcT is also used to represent a procedure type for type equality checking. When a type.ProcT is created from a procedure definition to rewrite mini.yyTree.Proc it delegates to itself. When a branch node such as type.PLet (which returns a procedure value) is created using create() the delegate is propagated.

If let binds a procedure value to a name, mini.yyTree.Var is rewritten as type.PVar, the delegate is propagated, and the delegate (which will delegate to itself) is bound as the type of the name in the current environment.

Summary

Type checking ensures that operations will only be applied to values for which they are intended.

Type checking can be accomplished by rewriting the program tree with branch nodes that indicate the types of the values which they will produce during execution; i.e., type checking amounts to a symbolic execution of the program with types playing the role of values while recording all results. Apart from processing declarations, rewriting proceeds in postorder and descends all subtrees, even in selections.

Each built-in type can be represented by the class object of a base class for all branch nodes returning that type.

User-defined types require a new object for each new type. Type comparison in general is delegated to these objects. The delegates have to be available in all branch nodes. The classes of these delegates can be used to establish categories of user-defined types, e.g., procedures can be distinguished from arrays.