Compilation Pipeline Improvements (edited) #25

gnumonik · 2024-03-22T18:11:56Z

Context / Motivation

NOTE: I'm editing an old issue instead of making a new one.

In my rush to get the full pipeline working ASAP, I ended up doing a few things that, in hindsight, should have been done differently:

Bound is great but we convert to a Bound AST at the wrong place in the pipeline. As things are, we convert to a Bound representation after inlining and monomorphization, thereby losing the benefits Bound provides (i.e. perfectly safe capture-avoiding substitution). Since these benefits would be particularly helpful during monomorphization and inlining, we should convert to Bound as soon as humanly possible.
Apps in our CoreFn do not need to be explicitly typed, and keeping the types correct in them is a huge headache with no real benefits. We should just remove those annotations and calculate the type from the "function part" and the "argument part".
Case expression desugaring is a nightmare ATM because we go from CoreFn style case analysis to PIR style case analysis at the same time we do the final conversion to PIR. It probably isn't obvious what I mean, so let me explain. This is (more or less) what the CoreFn representation of a case alternative is:
- (case (constr 1 x y) (alt x t1) (alt x y t2))
- This is the PIR representation: (case (constr 1 a b) (\x -> t1) (\x y -> t2))
- The main issue here is that we have to know the types of the variables bound by each case alternative branch before we can construct the branch in PIR, else we can't make the types line up. Determining that type ahead of time would greatly simplify the final conversion process, and we should do it.

The Ask

Need to explain how things currently work real quick. The rough pipeline of transformations is:

1. PureScript Parser 
2. CST 
3. AST 
4. PS TypeChecker 
5. PS Desugarer 
6. Convert to Typed CoreFn (w/ type deduction) 
7. Monomorphizer/Inliner
8. Convert to Bound AST / Desugar Objects
9. Convert to PIR
10. Compile to UPLC

I'm proposing that after step 6, we convert to another Bound AST representation which is (effectively) isomorphic to Typed CoreFn. That way, we can do safe and performant substitutions in the monomorphizer/inliner.

We'll still have the Bound AST in step 8 (which gives us a "Parse-don't-validate" approach to ensuring that expressions and types that cannot be converted to PLC are desugared and removed).

What I'm proposing isn't exactly trivial. There is a lot of boilerplate required for Bound to work. For a given Expr ADT, each component type which is not ~ Expr requires several somewhat complicated type class instances that cannot be derived (well, maybe I can fix the TH in deriving-compat to work for them, but cannot be derived easily at least).

But it will give us immense benefits both in safety (since it becomes impossible to screw up capture avoiding subsitution) and performance.

Acceptance Criteria

How can we decide if the task is complete?

The new AST type is implemented and existing utilities that operate on CoreFn are modified to operate on BoundCoreFn.

The text was updated successfully, but these errors were encountered:

gnumonik changed the title ~~Convert to 'Bound' representation earlier in the compilation pipeline~~ Compilation Pipeline Improvements (edited) Mar 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compilation Pipeline Improvements (edited) #25

Compilation Pipeline Improvements (edited) #25

gnumonik commented Mar 22, 2024 •

edited

Loading

Compilation Pipeline Improvements (edited) #25

Compilation Pipeline Improvements (edited) #25

Comments

gnumonik commented Mar 22, 2024 • edited Loading

Context / Motivation

The Ask

Acceptance Criteria

gnumonik commented Mar 22, 2024 •

edited

Loading