Compiler Design
Language Processors
Introduction
Terms
- black box
- A system characterized only by its external interface behavior.
Translators and Compilers
Terms
- translator
- A program that accepts text expressed in one language and generates semantically equivalent text expressed in another language.
- source language
- The input language of a translator.
- target language
- The output language of a translator.
- assembler
- A translator from an assembly language to the corresponding machine language.
- compiler
- A translator from a high-level language to a low-level language.
- high-level translator
- A translator from one high-level language to another.
- disassembler
- A translator from machine language to assembly language.
- decompiler
- A translator from a low-level language to a high-level language.
- source program
- The input text of an assembler or compiler.
- object program
- The output text of an assembler or compiler.
- implementation language
- The language in which a program is expressed.
- tombstone diagram
- A graphical representation of the overall function of a system.
- cross compiler
- A compiler which generates code for a machine different from the machine on which it is run.
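The terms above can be tied together with a small model of what a tombstone diagram records. This is a minimal sketch, not part of the original notes: the class names `Program` and `Translator`, the method names, and the language names "C", "x86" and "ARM" are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    name: str
    impl: str  # implementation language: the language the program is expressed in


@dataclass(frozen=True)
class Translator:
    source: str  # source language (input)
    target: str  # target language (output)
    impl: str    # implementation language of the translator itself

    def translate(self, program: Program, machine: str) -> Program:
        """Run this translator on `machine` to translate `program`."""
        if self.impl != machine:
            raise ValueError(f"translator is expressed in {self.impl}, not {machine} code")
        if program.impl != self.source:
            raise ValueError(f"translator accepts {self.source}, not {program.impl}")
        # Same program, now expressed in the target language.
        return Program(program.name, self.target)

    def is_cross_on(self, machine: str) -> bool:
        """True if it runs on `machine` but generates code for a different one."""
        return self.impl == machine and self.target != machine


# Hypothetical example: a native C compiler and a C cross compiler, both on x86.
native = Translator(source="C", target="x86", impl="x86")
cross = Translator(source="C", target="ARM", impl="x86")
object_program = native.translate(Program("sort", "C"), machine="x86")
```

Here `object_program` is the same program expressed in "x86", and only `cross` satisfies `is_cross_on("x86")`.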
Procedure
Interpreters
Terms
- interpreter
- A program expressed in one language which executes programs expressed in another language.
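As an illustration of the definition above, the following is a minimal sketch of an interpreter expressed in Python that executes programs expressed in a tiny stack language. The instruction names `PUSH`, `ADD` and `MUL` are invented for this example and do not come from the notes.

```python
def interpret(program):
    """Execute a list of stack-machine instructions and return the final result."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack.pop()


# (2 + 3) * 4 written in the hypothetical stack language.
result = interpret([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)])
```

The interpreter fetches, analyzes and executes one instruction at a time; no object program is ever produced.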
Real and Abstract Machines
Terms
Interpretive Compilers
Terms
- interpretive compiler
- A program which combines a compiler that produces object code in an intermediate language with an interpreter for that intermediate language.
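The combination described above can be sketched as a compiler into an intermediate language followed by an interpreter for that language. This is an illustrative sketch under stated assumptions: the source language is taken to be arithmetic expression trees written as nested tuples, and the intermediate language is a made-up stack code with `PUSH`, `ADD` and `MUL`.

```python
OPCODES = {"+": "ADD", "*": "MUL"}  # hypothetical intermediate-language opcodes


def compile_expr(expr):
    """Compile an expression tree into intermediate (stack) code."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]
    op, left, right = expr
    # Postfix order: code for both operands, then the operator.
    return compile_expr(left) + compile_expr(right) + [(OPCODES[op],)]


def interpret(code):
    """Interpret the intermediate code produced by compile_expr."""
    stack = []
    for op, *args in code:
        if op == "PUSH":
            stack.append(args[0])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
    return stack.pop()


def run(expr):
    """The interpretive compiler: compile first, then interpret the result."""
    return interpret(compile_expr(expr))


code = compile_expr(("+", 2, ("*", 3, 4)))
value = run(("+", 2, ("*", 3, 4)))
```

Compilation to the intermediate language is fast because the intermediate code is close to the source structure, while the interpreter stays simple because the intermediate code is already linear.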
Bootstrapping
Remarks
- Problem: Write and compile a compiler for a new language in the new language itself.
- Resources: A machine M and an existing compiler
for an existing language C that runs on machine M.
- Solution: Use the full bootstrap strategy.
- Step 1: Write a compiler for a subset of the new language in an
existing language, for which a compiler is already available.
- Step 2: Compile the subset compiler, using the existing compiler.
- Step 3: Hand translate the subset compiler into the subset language.
- Step 4: Compile the subset language compiler (written in the subset language),
using the compiler from step 2.
- Step 5: Extend the subset language compiler from step 3 into a compiler for the
full language, still using only the subset language.
- Step 6: Compile the full language compiler (written in the subset language),
using the compiler from step 4.
- The result is a compiler for the new language which is
written in a subset of the language itself.
- Problem: Port an existing compiler from one machine to another.
- Resources: A machine M1 and an existing compiler
for an existing language X that runs on machine M1.
- Solution: Use the half bootstrap strategy.
- Step 1: Rewrite the code generation part of the existing compiler
to produce code for the new machine M2.
- Step 2: Compile the modified compiler on the first machine M1.
- The result is a cross compiler, running on machine
M1 and producing code for machine
M2.
- The cross compiler can be tested by transferring compiled
programs to the new machine.
- The cross compiler can then be used to cross compile
itself, producing a compiler which can be transferred
to the new machine.
- Problem: Improve an existing compiler.
- Resources: Source and executable code for the existing compiler.
- Goal: Source and executable code for an improved compiler.
- Step 1: Rewrite the existing compiler
to produce improved code.
- Step 2: Compile the modified compiler with the existing compiler.
- Step 3: Test the improved compiler with existing programs.
- Step 4: Compile the improved compiler with itself.
- The goal has now been met.
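The full and half bootstrap strategies above can be traced with a small model in which each compiler is described by its source, target and implementation languages, as in a tombstone diagram. This is a sketch under assumptions, not the notes' own method: the names `Compiler` and `compile_with`, the language labels "Sub" and "New", and the assumption that the ported compiler's source is written in its own language X are all introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compiler:
    source: str  # language the compiler accepts
    target: str  # language the compiler generates
    impl: str    # language the compiler itself is expressed in


def compile_with(subject: Compiler, tool: Compiler, machine: str) -> Compiler:
    """Run `tool` on `machine` to translate the text of `subject`."""
    if tool.impl != machine:
        raise ValueError(f"tool is expressed in {tool.impl}, not {machine} code")
    if subject.impl != tool.source:
        raise ValueError(f"tool accepts {tool.source}, not {subject.impl}")
    # Only the implementation language of the subject changes.
    return Compiler(subject.source, subject.target, tool.target)


def full_bootstrap() -> Compiler:
    existing = Compiler("C", "M", "M")                       # resource
    subset_in_c = Compiler("Sub", "M", "C")                  # step 1
    subset_v1 = compile_with(subset_in_c, existing, "M")     # step 2
    subset_in_sub = Compiler("Sub", "M", "Sub")              # step 3 (by hand)
    subset_v2 = compile_with(subset_in_sub, subset_v1, "M")  # step 4
    full_in_sub = Compiler("New", "M", "Sub")                # step 5
    return compile_with(full_in_sub, subset_v2, "M")         # step 6


def half_bootstrap():
    existing = Compiler("X", "M1", "M1")                # resource
    cross_source = Compiler("X", "M2", "X")             # step 1 (assumed written in X)
    cross = compile_with(cross_source, existing, "M1")  # step 2
    native = compile_with(cross_source, cross, "M1")    # cross compiler compiles itself
    return cross, native


full = full_bootstrap()
cross, native = half_bootstrap()
```

In this model `full` is a compiler for "New" that runs on M, `cross` runs on M1 and targets M2, and `native` both runs on and targets M2. The improvement strategy is not modeled, since the sketch has no notion of code quality.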
Case Study: The Triangle Language Processor
Remarks
- The Triangle system includes a compiler (Compiler),
a virtual or abstract machine (TAM),
and a disassembler (Disassembler).
- Each of the three parts of the Triangle system is
compiled by the Java compiler (javac).
- Triangle programs are compiled by Compiler.class,
which runs on a Java virtual machine (JVM).
- Triangle programs are run (interpreted) on a Triangle abstract machine
(TAM), which runs (is interpreted) on a Java virtual machine (JVM).
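The layering in the last remark, with TAM code interpreted by TAM and TAM itself interpreted by a JVM, can be sketched as a chain of interpreters that must eventually reach a real machine. This is an illustrative model only: the names `Interpreter` and `execution_chain`, and the real machine label "M", are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpreter:
    language: str  # language the interpreter executes
    impl: str      # language the interpreter itself is expressed in


def execution_chain(language, interpreters, hardware):
    """Return the languages from `language` down to `hardware`, or None if stuck."""
    chain = [language]
    while chain[-1] != hardware:
        layer = next((i for i in interpreters if i.language == chain[-1]), None)
        if layer is None or layer.impl in chain:
            return None  # no interpreter available, or a circular dependency
        chain.append(layer.impl)
    return chain


# TAM code is interpreted by TAM (JVM code); JVM code is interpreted on machine M.
layers = [Interpreter("TAM", "JVM"), Interpreter("JVM", "M")]
tam_chain = execution_chain("TAM", layers, "M")
triangle_chain = execution_chain("Triangle", layers, "M")
```

In this model TAM code reaches the hardware through two layers of interpretation, while Triangle source text has no interpreter of its own and must first be compiled to TAM code.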