You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
NOTE: I'm editing an old issue instead of making a new one.
In my rush to get the full pipeline working ASAP, I ended up doing a few things that, in hindsight, should have been done differently:
Bound is great but we convert to a Bound AST at the wrong place in the pipeline. As things are, we convert to a Bound representation after inlining and monomorphization, thereby losing the benefits Bound provides (i.e. perfectly safe capture-avoiding substitution). Since these benefits would be particularly helpful during monomorphization and inlining, we should convert to Bound as soon as humanly possible.
Apps in our CoreFn do not need to be explicitly typed, and keeping the types correct in them is a huge headache with no real benefits. We should just remove those annotations and calculate the type from the "function part" and the "argument part".
Case expression desugaring is a nightmare ATM because we go from CoreFn style case analysis to PIR style case analysis at the same time we do the final conversion to PIR. It probably isn't obvious what I mean, so let me explain. This is (more or less) what the CoreFn representation of a case alternative is:
(case (constr 1 x y) (alt x t1) (alt x y t2))
This is the PIR representation: (case (constr 1 a b) (\x -> t1) (\x y -> t2))
The main issue here is that we have to know the types of the variables bound by each case alternative branch before we can construct the branch in PIR, else we can't make the types line up. Determining that type ahead of time would greatly simplify the final conversion process, and we should do it.
The Ask
Need to explain how things currently work real quick. The rough pipeline of transformations is:
1. PureScript Parser
2. CST
3. AST
4. PS TypeChecker
5. PS Desugarer
6. Convert to Typed CoreFn (w/ type deduction)
7. Monomorphizer/Inliner
8. Convert to Bound AST / Desugar Objects
9. Convert to PIR
10. Compile to UPLC
I'm proposing that after step 6, we convert to another Bound AST representation which is (effectively) isomorphic to Typed CoreFn. That way, we can do safe and performant substitutions in the monomorphizer/inliner.
We'll still have the Bound AST in step 8 (which gives us a "Parse-don't-validate" approach to ensuring that expressions and types that cannot be converted to PLC are desugared and removed).
What I'm proposing isn't exactly trivial. There is a lot of boilerplate required for Bound to work. For a given Expr ADT, each component type which is not ~ Expr requires several somewhat complicated type class instances that cannot be derived (well, maybe I can fix the TH in deriving-compat to work for them, but cannot be derived easily at least).
But it will give us immense benefits both in safety (since it becomes impossible to screw up capture avoiding subsitution) and performance.
Acceptance Criteria
How can we decide if the task is complete?
The new AST type is implemented and existing utilities that operate on CoreFn are modified to operate on BoundCoreFn.
The text was updated successfully, but these errors were encountered:
gnumonik
changed the title
Convert to 'Bound' representation earlier in the compilation pipeline
Compilation Pipeline Improvements (edited)
Mar 26, 2024
Context / Motivation
NOTE: I'm editing an old issue instead of making a new one.
In my rush to get the full pipeline working ASAP, I ended up doing a few things that, in hindsight, should have been done differently:
Bound
is great but we convert to a Bound AST at the wrong place in the pipeline. As things are, we convert to aBound
representation after inlining and monomorphization, thereby losing the benefitsBound
provides (i.e. perfectly safe capture-avoiding substitution). Since these benefits would be particularly helpful during monomorphization and inlining, we should convert toBound
as soon as humanly possible.App
s in ourCoreFn
do not need to be explicitly typed, and keeping the types correct in them is a huge headache with no real benefits. We should just remove those annotations and calculate the type from the "function part" and the "argument part".CoreFn
style case analysis toPIR
style case analysis at the same time we do the final conversion to PIR. It probably isn't obvious what I mean, so let me explain. This is (more or less) what the CoreFn representation of a case alternative is:(case (constr 1 x y) (alt x t1) (alt x y t2))
(case (constr 1 a b) (\x -> t1) (\x y -> t2))
The Ask
Need to explain how things currently work real quick. The rough pipeline of transformations is:
I'm proposing that after step 6, we convert to another
Bound
AST representation which is (effectively) isomorphic to Typed CoreFn. That way, we can do safe and performant substitutions in the monomorphizer/inliner.We'll still have the
Bound
AST in step 8 (which gives us a "Parse-don't-validate" approach to ensuring that expressions and types that cannot be converted to PLC are desugared and removed).What I'm proposing isn't exactly trivial. There is a lot of boilerplate required for
Bound
to work. For a givenExpr
ADT, each component type which is not~ Expr
requires several somewhat complicated type class instances that cannot be derived (well, maybe I can fix the TH inderiving-compat
to work for them, but cannot be derived easily at least).But it will give us immense benefits both in safety (since it becomes impossible to screw up capture avoiding subsitution) and performance.
Acceptance Criteria
How can we decide if the task is complete?
The new AST type is implemented and existing utilities that operate on CoreFn are modified to operate on BoundCoreFn.
The text was updated successfully, but these errors were encountered: