|PREV PACKAGE NEXT PACKAGE||FRAMES NO FRAMES|
|mini.yyActions||interface between grammar and actions.|
|mini.yyInput||must be implemented by a scanner object to supply input to the parser.|
|mini||a little lexically scoped language to study the typing of variables.|
|type||strongly type-checks and rewrites
|type.BAnd||and of two
|type.BAndIf||and-if of two
|type.BBegin||block ending in
|type.BEq||equal of two
|type.BNe||not-equal of two
|type.BooleanT||base class for boolean operations,
itself used to type an
|type.BOr||or of two
|type.BOrIf||or-if of two
|type.BXor||xor of two
|type.DBegin||block ending in
|type.DEq||equal of two
|type.DGe||greater-equal of two
|type.DGt||greater-then of two
|type.DLe||less-equal of two
|type.DLt||less-then of two
|type.DNe||not-equal of two
|type.DoubleT||base class for double operations,
itself used to type a
|type.Env||environment: block-structured map of keys to values.|
|type.IBegin||block ending in
|type.IEq||equal of two
|type.IGe||greater-equal of two
|type.IGt||greater-then of two
|type.ILe||less-equal of two
|type.ILsh||left shift two
|type.ILt||less-then of two
|type.IMod||remainder of two
|type.INe||not-equal of two
|type.IntegerT||base class for integer operations,
itself used to type an
|type.IRsh||right shift two
|type.PBegin||block ending in
|type.ProcT||base class for procedures,
itself used to represent a
|type.SBegin||block ending in
|type.SEq||equal of two
|type.SGe||greater-equal of two
|type.SGt||greater-then of two
|type.SLe||less-equal of two
|type.SLt||less-then of two
|type.SNe||not-equal of two
|type.StringT||base class for string operations,
itself used to type a
|type.T||base class for typed branch nodes.|
|mini.yyException||thrown for irrecoverable syntax errors and stack overflow.|
Step 10: Type Checking.
The objective of type checking is to ensure (prior to execution)
that a program will not apply an operation to data for which the
operation is not intended. This step combines algorithms from Friedman
et al., Essentials of Programming Languages with the technique
of typing by rewriting from
mini.pj is a frontend which recognizes a very small Scheme-inspired language with boolean, floating point, integer, and string literals, recursive procedure values, infix expressions, a selection control structure, and lexical scoping. euclid.txt and even.txt are typical programs.
$ mkdir step10 $ java -jar pj.jar -jay -scanner:static - -tree:static - mini.pj > step10/mini.java $ javac step10/mini.java $ java step10.mini let x=2 in if true then 1 else x [[[x, ]], [[true], , [x]]]
a visitor which performs strong type checking and rewrites the
tree with typed branch nodes, e.g.,
is replaced by
type.IIf. The visitor is a tree
factory and just as in
step 9 a simple awk program expands a table
into the factory methods and classes for the typed tree.
$ java -jar jag.jar < step10/type.jag > step10/type.java $ javac step10/type.java $ java step10.type let x=2 in if true then 1 else x
Let ArrayList Var String x Lit Integer 2 If Lit Boolean true Lit Integer 1 Deref String x
ILet ArrayList IVar String x IntegerT Integer 2 IIf BooleanT Boolean true IntegerT Integer 1 IDeref String x
rewrites constant expressions and inserts conversion nodes if
required, just as it was done in
$ java step10.type "high" < "low" || 1. + 2 == 3
OrIf Lt Lit String "high" Lit String "low" Eq Add Lit Double 1.0 Lit Integer 2 Lit Integer 3
BOrIf SLt StringT String "high" StringT String "low" DEq DAdd DoubleT Double 1.0 I2D IntegerT Integer 2 I2D IntegerT Integer 3
type.T is the base class for typed branch nodes. A
literal such as 3 is represented as an
Integer attached to an
latter is a branch node marking the literal as an integer value.
An operation such as addition is represented by a typed branch
node such as
type.DAdd which marks the operation as an
addition of floating point values with a floating point result.
type.DAdd is a subclass of
Therefore, the result of an operation is a floating point value
exactly if the branch node, e.g., a
object, is an instance of
There are no mixed mode operations. If necessary a typed branch
node such as
type.I2D is inserted to represent a conversion
from an integer to a double value.
mini.pj supports a block consisting of a sequence of expressions and selective execution of one of two expressions depending on a boolean value. Additionally, there are procedure calls as discussed below.
A block is just another expression, i.e., it cannot be empty
because it's value and type will be taken from the last expression
in the sequence. type.jag rewrites a block accordingly;
therefore, a class such as
type.IBegin will represent
a block where the last expression produces an integer value. An
ArrayList is silently removed.
$ java step10.type begin true; 1.0; "two"; 3 end
Begin ArrayList Lit Boolean true Lit Double 1.0 Lit String "two" Lit Integer 3
IBegin BooleanT Boolean true DoubleT Double 1.0 StringT String "two" IntegerT Integer 3
supports selective execution of one of two expressions depending
on a boolean value. A selection is just another expression; again,
it cannot be empty because it's value and type will be taken from
whatever expression is selected. type.jag rewrites a selection
accordingly; therefore, a class such as
will represent a selection where the selected expression produces
a floating point value.
$ java step10.type if true then 1.0 else 2.0
If Lit Boolean true Lit Double 1.0 Lit Double 2.0
DIf BooleanT Boolean true DoubleT Double 1.0 DoubleT Double 2.0
Strong typing requires that the result type of a selection be known, regardless of whether the then or else part is selected. Therefore, neither part can be empty and they must produce the same type. Of course, the condition must produce a bool result.
Type comparison is best implemented by delegation.
type.T.type() returns an object that is unique for a type.
A class representing literals such as
type.T.type() to return it's class
object, and all derived classes such as
inherit this behaviour, i.e., they will also return the class
type.IntegerT. Class objects are unique and
Object.equals(java.lang.Object) as an identity
comparison; therefore, comparing the objects returned by
a flexible solution.
supports let in the style of Scheme: a set of values is
bound to a set of names. The names are lexically scoped and can
be used in the dependent expression of the let. Duplicate
names are silently permitted; subsequent names supersede prior
names. The value and type of the let result from the
dependent expression, the value and type of each name result from
the expression bound to it. type.jag rewrites let
accordingly; therefore, a class such as
will represent a variable bound to an integer value and
type.SLet will represent a let where the dependent
expression produces a string value.
$ java step10.type let x="hello, "; y=3 in x + "world!"
Let ArrayList Var String x Lit String "hello, " Var String y Lit Integer 3 Add Deref String x Lit String "world!"
SLet ArrayList SVar String x StringT String "hello, " IVar String y IntegerT Integer 3 SAdd SDeref String x StringT String "world!"
An environment, i.e., a stack of hashtables, is used to model lexical scoping and map names to their types:
env, the environment upon entry to let, contains all names defined in a more global scope. This environment is used to check all initializer expressions and determine their types. Therefore, the initializers cannot use the variables that are defined in this let.
A new environment, created by pushing a hashtable with the names defined in the current let on top of env, is used to check the dependent expression of the let and propagate the type as the result type of the let.
The top hashtable maps the (unique) names to the types determined for the initializers. If a name is found in the dependent expression, the top hashtable is searched first, i.e., the names from the let will be found first. Once the dependent expression has been checked, the new environment, i.e., the top hashtable, is discarded, i.e., the names in the let have a scope which is restricted exactly to the dependent expression of the let.
Names can only be used if they are in scope, i.e., by inspecting
mini.yyTree.Deref can be rewritten
type.IDeref to indicate that a variable is bound
to an integer value.
supports letrec reminiscent of Scheme: a set of procedure
values is bound to a set of names. All names are lexically scoped.
All procedure names can be used in the dependent expression of
the letrec. All procedure names and the applicable
parameter names can be used in the body expression of a procedure.
Duplicate names are silently permitted; subsequent names supersede
prior names. The value and type of the letrec result
from the dependent expression, the type of each procedure name
results from the types specified in the procedure header, the
type of a parameter name is specified in the procedure header.
letrec accordingly; therefore, a class such as
type.BLetrec will represent a letrec where the
dependent expression produces a boolean value.
$ java step10.type letrec bool eq(int a, double b) = a == b in eq(1, 1.0)
Letrec ArrayList Proc String bool String eq ArrayList Param Type String int ArrayList String a Param Type String double ArrayList String b ArrayList Eq Deref String a Deref String b Call String eq ArrayList ArrayList Lit Integer 1 Lit Double 1.0
BLetrec ArrayList ProcT bool(int,double) String eq DEq I2D IDeref String a DDeref String b BCall String eq ArrayList ArrayList IntegerT Integer 1 DoubleT Double 1.0
Type checking mostly amounts to careful building of environments:
The dependent expression of the letrec has access to an environment extended with all procedure names and their types taken from the procedure headers. A procedure type consists of the result type and the list of parameter types (if any). The type of the dependent expression is propagated as type of the letrec.
The body of a procedure has access to the same environment, further extended with the procedure's own parameter names bound to their types as declared in the header. The type returned by the procedure body must be the type which was computed as the result type of the procedure from the procedure header.
A procedure call is checked as follows: The initial name must
be bound to a procedure type. The result type of this procedure
determines the result of the call. The list of parameter types
must match the list of argument types.
Procedure values can be used just like any other value. compose.txt demonstrates function composition and illustrates how procedure types are represented as types for parameters and how a procedure is returned as a result.
$ java step10.type < compose.txt
Letrec ArrayList Proc String double String compose ArrayList Param Type String double ArrayList TypeList ArrayList Type String int ArrayList String f Param Type String int ArrayList TypeList ArrayList Type String string ArrayList String g Param Type String string ArrayList TypeList null String h ArrayList TypeList ArrayList Type String string ArrayList Letrec ArrayList Proc String double String result ArrayList Param Type String string ArrayList String s ArrayList Call String f ArrayList ArrayList Call String g ArrayList ArrayList Add Call String h ArrayList null Deref String s Deref String result Deref String compose
PLetrec double(double(int),int(string),string())(string) ArrayList ProcT double(double(int),int(string),string())(string) String compose PLetrec double(string) ArrayList ProcT double(string) String result DCall String f ArrayList ArrayList ICall String g ArrayList ArrayList SAdd SCall String h ArrayList null SDeref String s PDeref double(string) String result PDeref double(double(int),int(string),string())(string) String compose
curry.txt shows a deeper nest of procedures returning procedures and how such a nest can be called. The rewritten tree demonstrates how the procedure types are successively unraveled.
$ java step10.type < curry.txt
Let ArrayList Var String curry Letrec ArrayList Proc String string String u ArrayList Param Type String int ArrayList String i ArrayList TypeList ArrayList Type String double ArrayList TypeList ArrayList Type String bool ArrayList TypeList null Letrec ArrayList Proc String string String v ArrayList Param Type String double ArrayList String d ArrayList TypeList ArrayList Type String bool ArrayList TypeList null Letrec ArrayList Proc String string String w ArrayList Param Type String bool ArrayList String b ArrayList TypeList null Letrec ArrayList Proc String string String x null ArrayList If AndIf Eq Deref String i Deref String d Deref String b Lit String "true" Lit String "false" Deref String x Deref String w Deref String v Deref String u Call String curry ArrayList ArrayList Lit Integer 1 ArrayList Lit Double 2.0 ArrayList Lit Boolean true null
SLet ArrayList PVar string(int)(double)(bool)() String curry PLetrec string(int)(double)(bool)() ArrayList ProcT string(int)(double)(bool)() String u PLetrec string(double)(bool)() ArrayList ProcT string(double)(bool)() String v PLetrec string(bool)() ArrayList ProcT string(bool)() String w PLetrec string() ArrayList ProcT string() String x SIf BAndIf DEq I2D IDeref String i DDeref String d BDeref String b StringT String "true" StringT String "false" PDeref string() String x PDeref string(bool)() String w PDeref string(double)(bool)() String v PDeref string(int)(double)(bool)() String u SCall String curry ArrayList ArrayList IntegerT Integer 1 ArrayList DoubleT Double 2.0 ArrayList BooleanT Boolean true null
type.IntegerT is the base class for all
operations with an integer result,
the base class for all operations with a procedure value as a
result, e.g., a
returning a procedure.
However, there is a big difference: integers are a built-in
type and the class object of
represents that type when type equality is checked by delegation
(see Selection). Procedure values, on
the other hand, belong to user-defined types and objects have to
be created to represent each such type. Procedure types are
equal if they have equal return and
type.ProcT is also used to represent a procedure
type for type equality checking. When a
is created from a procedure definition to rewrite
mini.yyTree.Proc it delegates to itself. When a branch
node such as
type.PLet (which returns a procedure
value) is created using
create() the delegate is propagated.
If let binds a procedure value to a name,
mini.yyTree.Var is rewritten as
the delegate is propagated, and the delegate (which will delegate
to itself) is bound as the type of the name in the current
Type checking ensures that operations will only be applied to values for which they are intended.
Type checking can be accomplished by rewriting the program tree with branch nodes that indicate the types of the values which they will produce during execution; i.e., type checking amounts to a symbolic execution of the program with types playing the role of values while recording all results. Apart from processing declarations, rewriting proceeds in postorder and descends all subtrees, even in selections.
Each built-in type can be represented by the class object of a base class for all branch nodes returning that type.
User-defined types require a new object for each new type. Type comparison in general is delegated to these objects. The delegates have to be available in all branch nodes. The classes of these delegates can be used to establish categories of user-defined types, e.g., procedures can be distinguished from arrays.
|PREV PACKAGE NEXT PACKAGE||FRAMES NO FRAMES|