WARNING: incomplete project ahead
This implementation was inspired by and expands on Jon Sterling's video lecture "How to code your own type theory".
The focus of this program is to elaborate a surface language, emitted directly by the parser, down to a type-correct representation called syntax. Along the way we also use a semantic domain - kept β-reduced as an invariant - in the style of normalization-by-evaluation.
- Full spectrum dependent type theory featuring Π and Σ type formers
- Normalization by Evaluation w/ DeBruijn levels & indices
- Type directed conversion checking
- Glued evaluation
- Bidirectional type checking
- Parsing and pretty printing
- Boolean literals
- Natural numbers
- Named tuples
- REPL & File input
Surface expressions are the direct output of the parser. They use explicit, user-friendly names and distinguish top-level definitions from local let bindings.
Syntactic terms use DeBruijn indices (counting the environment from the right) for easy α-equivalence checking.
Semantic values use DeBruijn levels (counting the environment from the left), which allows us to freely move expressions deeper into a nested context (aka weakening). This is essential for efficiently β-reducing lambda abstractions, substituting let bindings, etc. during evaluation.
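To make the split concrete, here is a rough Haskell sketch of the three layers; the type and constructor names are illustrative placeholders, not necessarily the ones used in this project.

```haskell
newtype Ix  = Ix Int    -- de Bruijn index: distance to the binder, counted from the right
newtype Lvl = Lvl Int   -- de Bruijn level: absolute position, counted from the left

-- Surface syntax: user-written names, straight from the parser.
data Expr
  = EVar String
  | ELam String Expr
  | EApp Expr Expr
  | ELet String Expr Expr      -- local let: bound name, definition, body

-- Top-level definitions are a separate declaration form, not an expression.
data Decl = Def String Expr

-- Core syntax: names replaced by indices, so α-equivalence is plain structural equality.
data Term
  = Var Ix
  | Lam Term
  | App Term Term
  | Let Term Term

-- Semantic values: variables carry levels, so going under extra binders never
-- requires shifting (weakening is free).  Stuck terms are sketched below.
data Value
  = VVar Lvl
  | VLam Env Term              -- a closure: the lambda body plus its defining environment

type Env = [Value]

-- Quoting converts a level back to an index relative to the current context size.
lvl2ix :: Lvl -> Lvl -> Ix
lvl2ix (Lvl ctxSize) (Lvl l) = Ix (ctxSize - l - 1)
```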
Values have a distinct subdomain for neutrals, which are those values that can't be β-reduced further due to being "stuck" on a variable. These neutrals are represented in spine form for easy access to the head - the variable it's stuck on.
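A small sketch of what spine form could look like; the names are again placeholders, and the closure is simplified to a host-language function.

```haskell
newtype Lvl = Lvl Int

data Head
  = HVar Lvl                  -- the variable the neutral is stuck on

data Elim
  = EApp Value                -- stuck application to an argument
  | EFst                      -- stuck first projection out of a Σ pair
  | ESnd                      -- stuck second projection

data Value
  = VNeutral Head [Elim]      -- spine form: the head plus the eliminations piled on it
  | VLam (Value -> Value)     -- other canonical forms elided; closure as a Haskell function
```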
During conversion checking we don't necessarily want to unfold all definitions, because that might lead to a significant blow-up in value size. Likewise, when printing values to the user we want them to be as small as possible, for better readability. On the other hand, we still want to be able to unfold all definitions on demand when the need arises. For this we keep a "glued" lazily-evaluated version of the value in which certain definitions can be unfolded.
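One possible way to realize this, sketched below under the assumption that top-level names get their own kind of neutral head: the head keeps the definition's name for compact printing and pairs it with its unfolding, which the host language's laziness leaves uncomputed until it is actually forced.

```haskell
newtype Lvl = Lvl Int
type Name = String

data Head
  = HLocal Lvl                 -- a local variable: can never be unfolded
  | HTop Name Value            -- a top-level definition: name kept for small printing,
                               -- paired with its unfolding, computed lazily on demand

data Value
  = VNeutral Head [Value]      -- small, definition-preserving form (head plus spine)
  | VLam (Value -> Value)

-- Push an argument into a value: β-reduce a lambda, otherwise grow the spine.
vApp :: Value -> Value -> Value
vApp (VLam f)        a = f a
vApp (VNeutral h sp) a = VNeutral h (sp ++ [a])

-- Replace a glued top-level head by its definition, but only when asked to.
unfold :: Value -> Value
unfold (VNeutral (HTop _ def) spine) = foldl vApp def spine
unfold v                             = v
```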
During type checking we often want to check whether two terms are αβη-convertible to each other. α-equivalence is already covered by indices. To handle β-reductions we choose a representative for each equivalence class, which is fully β-reduced. This representative is a semantic value, and the process of finding the representative of a term is called evaluation. We also have a reverse operation called quoting to go back from the semantic domain to terms.
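A compact sketch of both directions for a tiny λ-calculus fragment; the spine representation and closure shape are simplifications rather than this project's exact code.

```haskell
newtype Ix  = Ix Int
newtype Lvl = Lvl Int

data Term
  = Var Ix
  | Lam Term
  | App Term Term

data Value
  = VVar Lvl [Value]           -- neutral: head variable plus its spine of arguments
  | VLam Env Term              -- closure: the body together with its defining environment

type Env = [Value]             -- one value per free variable, innermost first

-- Evaluation: pick the β-reduced representative of a term.
eval :: Env -> Term -> Value
eval env (Var (Ix i)) = env !! i
eval env (Lam body)   = VLam env body
eval env (App f a)    = vApp (eval env f) (eval env a)

-- β-reduce when the function evaluated to a lambda, otherwise extend the spine.
vApp :: Value -> Value -> Value
vApp (VLam env body) a = eval (a : env) body
vApp (VVar h spine)  a = VVar h (spine ++ [a])

-- Quoting: read a value back into a term, turning levels into indices.
quote :: Lvl -> Value -> Term
quote n (VVar l spine) =
  foldl (\t v -> App t (quote n v)) (Var (lvl2ix n l)) spine
quote n@(Lvl size) (VLam env body) =
  -- go under the binder with a fresh variable at the current level
  Lam (quote (Lvl (size + 1)) (eval (VVar n [] : env) body))

lvl2ix :: Lvl -> Lvl -> Ix
lvl2ix (Lvl size) (Lvl l) = Ix (size - l - 1)
```

In this sketch `quote (Lvl 0) (eval [] t)` computes the β-normal form of a closed term `t`.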
The last missing piece is η-equivalence. Values are already in η-short form, meaning all possible η-reductions have been applied, but two values might still be convertible via an η-expansion on a neutral value. Conversion checking takes care of this. Performing η-expansion relies on the type of the values being compared, so we require type-directed conversion checking. This also means we need to carry around some type info within neutrals.
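A sketch of how the Π case of type-directed conversion could look; closures are host functions here and the types of spine arguments are not tracked, which a full checker would need (hence the type info stored in neutrals mentioned above).

```haskell
newtype Lvl = Lvl Int deriving (Eq)

data Value
  = VNeutral Lvl [Value]          -- neutral: head variable plus spine of arguments
  | VLam (Value -> Value)         -- lambda, closure as a host function
  | VPi Value (Value -> Value)    -- Π type: domain and dependent codomain
  | VU                            -- universe

-- Apply a value: β-reduce a lambda, otherwise extend the spine.
vApp :: Value -> Value -> Value
vApp (VLam f)        a = f a
vApp (VNeutral h sp) a = VNeutral h (sp ++ [a])
vApp _               _ = error "vApp: ill-typed application"

-- Type-directed conversion: the expected type tells us when to η-expand.
conv :: Lvl -> Value -> Value -> Value -> Bool
conv n@(Lvl size) ty t u = case ty of
  -- At a Π type, apply both sides to a fresh variable and compare the results.
  -- This makes a neutral `f` convertible with its η-expansion `λ x. f x`.
  VPi _dom cod ->
    let x = VNeutral n []
    in  conv (Lvl (size + 1)) (cod x) (vApp t x) (vApp u x)
  -- Otherwise compare structurally.  (A real implementation would fetch the
  -- spine argument types from the type info stored in the neutral.)
  _ -> case (t, u) of
    (VNeutral h sp, VNeutral h' sp') ->
      h == h' && length sp == length sp'
              && and (zipWith (conv n VU) sp sp')
    (VU, VU)             -> True
    (VPi a b, VPi a' b') ->
      let x = VNeutral n []
      in  conv n VU a a' && conv (Lvl (size + 1)) VU (b x) (b' x)
    _                    -> False
```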
A typical bidirectional type checker revolves around two functions:
infer : ctx → expr → Maybe typ
check : ctx → expr → typ → Maybe ()
infer tries to deduce the type of an expression under a given context, while check just attempts to confirm that the expression has a given type.
The checker here is actually an elaborator, so it spits out a new "type-checked" version of the input expression:
infer : ctx → expr → Maybe (term, typ)
check : ctx → expr → typ → Maybe term
In our specific case the elaborated output won't differ much from the input, but in more advanced implementations it is essential to be able to modify the input terms for subsequent parts of the compiler.
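As an illustration of the mode discipline, here is a simply-typed skeleton of such an elaborator, with made-up constructor names and plain data types standing in for semantic type values; a dependently-typed version would evaluate types and call the conversion checker where this sketch uses ==.

```haskell
-- Surface syntax: user-written names, as produced by the parser.
data Expr
  = EVar String
  | ELam String Expr
  | EApp Expr Expr
  | EAnn Expr Ty               -- "e : T": the usual switch from checking to inferring

-- Elaborated core syntax: de Bruijn indices instead of names.
data Term
  = Var Int
  | Lam Term
  | App Term Term
  deriving (Show)

-- Simple types stand in for semantic type values to keep the skeleton short.
data Ty = TBool | TArr Ty Ty deriving (Eq, Show)

type Ctx = [(String, Ty)]      -- innermost binding first

-- infer: deduce the type of an expression and return its elaborated form as well.
infer :: Ctx -> Expr -> Maybe (Term, Ty)
infer ctx (EVar x)   = do
  i <- lookup x (zip (map fst ctx) [0 ..])
  pure (Var i, snd (ctx !! i))
infer ctx (EAnn e t) = do
  e' <- check ctx e t
  pure (e', t)
infer ctx (EApp f a) = do
  (f', tf) <- infer ctx f
  case tf of
    TArr dom cod -> do a' <- check ctx a dom
                       pure (App f' a', cod)
    _            -> Nothing
infer _   (ELam _ _) = Nothing -- an unannotated lambda can only be checked, not inferred

-- check: confirm that an expression has the given type, elaborating along the way.
check :: Ctx -> Expr -> Ty -> Maybe Term
check ctx (ELam x body) (TArr dom cod) =
  Lam <$> check ((x, dom) : ctx) body cod
check ctx e t = do
  (e', t') <- infer ctx e
  if t == t' then pure e' else Nothing  -- a real checker would call conversion here
```

Note that checking a lambda against an arrow (or Π) type is the only place where a new binder enters the context, which is why unannotated lambdas are checked rather than inferred.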