The purpose of this programming project is to implement an interpreter, a
typechecker, and a compiler (down to a simple abstract machine) for a small
functional programming language equipped with (possibly parameterized)
algebraic data types. The following parts of the program are
provided: a lexer
and parser, a constraint solver for first-order unification constraints, an
abstract machine, and its execution engine.
The project can be implemented in any language of your choice, but we strongly
recommend using Caml, as the sources we provide are written in Caml.
Any version ≥ 3.0 should do, but in doubt
install version 3.09.3 from http://caml.inria.fr/ocaml/release.en.html
or from the packages available in your Linux distribution.
The Menhir parser generator
Available at
http://gallium.inria.fr/~fpottier/menhir/. This tool
is required in order to produce parser.mli and parser.ml
out of parser.mly. (For those who don't want to install Menhir,
we do provide parser.mli and parser.ml, but you will
need to modify the Makefile in order to let make know
that these files are not generated and should not be destroyed.)
Linux, FreeBSD, MacOSX, or some other Unix-like system
The Makefile that we distribute has not been tested under Microsoft Windows. You
are on your own if you insist on using Windows.
In the src/ directory, you will find the following files:
abstractSyntax.mli
Defines the abstract syntax for the
language.
type.{ml, mli}
A small number of utility functions over the abstract syntax
of types.
parser.mly, lexer.mll, error.{ml, mli}
Parsing and error reporting. Together, the lexer and parser
define the concrete syntax for the language.
stringMap.{ml, mli}
Maps whose keys are strings. Useful for implementing various
kinds of environments.
option.{ml, mli}
Various utility functions for values of type 'a option.
print.{ml, mli}
Various utility functions for pretty-printing.
wf.{ml, mli}
Check that a program is well-formed (no unbound variables, etc.).
unionFind.{ml, mli}
Implements Tarjan's data structure for the union-find problem.
This module underlies our implementation of first-order unification.
unification.{ml, mli}
Implements first-order unification. This module defines the syntax
of unification problems, which the constraint generator must produce.
generator.{ml, mli}
Specifies the constraint generator. The implementation is missing;
to be completed in task 3.
interpreter.{ml, mli}
The skeleton of the interpreter. To be completed in task 1.
machine.{ml, mli}
Definition of the abstract machine: instruction set and execution
engine for abstract machine code.
compiler.{ml, mli}
The skeleton of the compiler. To be completed in task 2.
settings.{ml, mli}
Parses the command line.
front.{ml, mli}
The top-level file of the program. Calls and combines the parser,
the type-checker, the interpreter, the compiler and the execution
engine of the abstract machine.
Implement an interpreter for the source
language. The file to modify is interpreter.ml.
Task 2
Implement a compiler from the source language to
the abstract machine. The file to modify is compiler.ml.
Task 3
Study the specification that the constraint generator must meet,
which is found in generator.mli, and implement the generator. The
file to modify is generator.ml.
At the moment, the generator is incomplete and always produces an empty
unification problem, which means that the inferred type is always
“∀α.α”. It is up to you to construct a unification problem
that is necessary and sufficient for the code to be well-typed.
Note that, for simplicity, we only implement simple (that is,
monomorphic) type inference: no generalization will be performed at
let constructs. Only the data constructors can have (closed)
polymorphic type schemes, which are given by the dcenv parameter.
In order to better understand the entire type inference process, it is
recommended, although not strictly necessary, to have a look at the modules
UnionFind and Unification, which perform constraint solving.
For extra credit
Extend the program in any direction
you're interested in: additional language features, polymorphic type
inference, understandable type error messages, compiler optimizations, ...
Additional test files written in the small programming language,
if you wrote any.
If you implemented “extra credit” features, a README
file (written in French or English) describing these additional
features, how you implemented them, and where we should look in the
source code to see how they are implemented.