|
Language Processing v2.0 |
|||||||||
| PREV NEXT | FRAMES NO FRAMES | |||||||||
See:
Description
| Tutorial | |
|---|---|
| step01 | 1: How to structure a compiler. |
| step02 | 2: How to define languages. |
| step03 | Step 2: How to check grammars. |
| step04 | Step 3: How to recognize symbols. |
| step05 | Step 4: How to recognize phrases. |
| step06 | Step 5: How to build trees. |
| step07 | Step 6: How to interpret programs. |
| step08 | Step 7: How to check types. |
| Tools | |
|---|---|
| pj2 | Compiler frontend generator. |
| vig | Visitor generator. |
| wcs | Web compiler service. |
| xv | XML Visitor generator. |
Language Processing is an interactive tutorial about compiler construction and describes tools and techniques which facilitate the object-oriented implementation of language processors.
This introduction briefly discusses the structure of a compiler and what role programming tools play in
the construction of a compiler. Step 1 describes the implementation of an adding machine as
a rudimentary example for hand-coding a compiler. The remaining steps look at the tools and programming
techniques in more detail. The goal of the tutorial is to enable the reader to use professional compiler
tools to at least write interpreters for small languages.
The examples in this tutorial can be compiled using the web compiler service and executed
as applets in a Java-enabled browser. Alternatively, they can be downloaded from here. For local compilation the tools need to be downloaded from their
individual homepages and installed as siblings of the tutorial steps.
Here is an overview of what goes on inside a compiler:
In several steps input characters are partitioned into a sequence of tokens and a symbol table and then into a tree representing the program expressed by the input characters. The tree can then be converted into something that is more or less executable and will produce the effect expected of the program.
The steps of the compilation process run more or less in parallel, i.e., the intermediate data may or may not be available explicitly. How the compilation progresses depends on the implemented language, e.g., whether declare before use is required (as in C) or not (as in Java, for the most part).
The role of a frontend generator
A compiler consists of a frontend and a backend. The frontend is responsible for analysis, i.e., it must decide if the input characters express a legitimate program. If so, the backend is responsible for synthesis, i.e., it must represent the program in some executable or interpretable format.
Frontend generators such as pj2 or oops3 help in implementing input partitioning
(lexical analysis), checking program structure (syntax analysis), and
representing the program as a tree, all just based on an annotated grammar of the programming language:
If requested, pj2 or oops3 creates input for and executes JLex, a Java program which generates a
scanner (class for lexical analysis) from a table of regular expressions and actions written
in Java. The actions discover tokens within input characters and maintain information in a Map-based datastructure.
pj2 creates input for and executes jay, Berkeley yacc retargeted to Java, a
program which generates a parser (class for syntax analysis) from a table of grammar rules
and actions written in Java. pj2 creates an interface describing the actions. The actions are
executed once the tokens from the scanner match rules of the grammar.
The actions often represent a user program as a parse tree for processing by the
compiler backend. If requested, pj2 implements the action interface with a factory and List-based classes to represent the parse tree and provide a structured dump.
oops3 represents grammar rules as a tree and provides visitors for the tree which implement the
same functionality as pj2. Either tool can implement test scaffolding for the generated scanner,
parser, and tree builder. Many other frontend generators exist; wcs can be used to experiment with
some of them in a web browser.
The role of a visitor generator
A compiler consists of a frontend and a backend. Usually, the frontend analyzes the source program and creates an intermediate representation. The backend processes the intermediate representation, e.g., to perform further analysis, determine runtime memory allocation, generate machine code, or interpret the program.
Frontend generators such as pj2 and oops3 facilitate writing frontends which represent
a source program as a tree using List-based classes. A backend then has to traverse such
a tree and make decisions based on the branch node classes and leaf values. The backend if often
implemented using the visitor pattern; however, the visitor pattern tends to be cumbersome to extend once
more classes are added to the tree representation.
A visitor generator such as vig facilitates implementing a backend. vig is a
preprocessor with a pattern/action paradigm; actions are programmed in Java. In the actions vig
expands shorthand notations for printing and for recursive calls on the pattern selection mechanism.
A pattern investigates the classes of a branch node and its descendants; inheritance is used as a wildcard mechanism that encourages a programming style where patterns for more specific classes override patterns for more general classes. A tree traversal can first be implemented with a few general patterns and later improved by adding patterns for special cases.
|
(c) 2008 Axel T. Schreiner |
|||||||||
| PREV NEXT | FRAMES NO FRAMES | |||||||||