Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement inline assembly #21

Merged
merged 15 commits into from
Sep 14, 2023
Merged

feat: implement inline assembly #21

merged 15 commits into from
Sep 14, 2023

Conversation

bitwalker
Copy link
Contributor

This PR expands upon the drafted inline assembly functionality, and in the process adds some useful pieces of functionality that are used during code generation. I've inlined the main commit message below which goes into more detail.

There are two motivations for this currently:

  1. The code generator work I'm introducing builds on some of this functionality, e.g. the representation of local variables, the MASM instruction set, and the operand stack data structure.
  2. Some higher-level instructions in the IR require more complex sequences of Miden Assembly - to facilitate that, we will define a baked-in intrinsics module that exposes procedures written using inline assembly that provide the functionality needed for those instructions. This keeps the resulting MASM clean, while still allowing us to keep those instructions as efficient as possible.

@jjcnn You may notice the changes to the textual output of the IR with regard to inline assembly. You can feel free to omit support for inline assembly entirely from the parser for the time being, as I don't think it is essential at all to support it there for now, however I did provide a short example of what you can expect the syntax to look like in practice, below. Adding support for inline assembly to the parser will be more effort than its worth for now though IMO.


This commit fleshes out the implementation of inline assembly that we
had essentially stubbed out prior to this. The following are provided as
part of this implementation:

  • A representation of the Miden Assembly instruction set, in crate,
    with the ops we intend to support in the near term. This is used to
    represent the contents of inline assembly, and is also used in the
    upcoming codegen backend as the target IR for Miden Assembly, as it
    is a flatter representation similar to Miden IR. The intent is to also
    implement a tiny evaluator for this subset of Miden Assembly for use
    in tests without having to emit MASM files, and run the Miden VM which
    is much more expensive. It also allows us to track the subset of Miden
    Assembly which we support in the compiler, as well as represent
    experimental changes that have not yet been implemented upstream.

  • MasmBuilder, which is constructed from an InstBuilder via inline_asm,
    and plays a similar role to FunctionBuilder+InstBuilder, but in
    terms of Miden Assembly, in the context of constructing a single inline
    assembly block.

  • Improved pretty printing of inline assembly. Now, an inline assembly
    block is printed like so (this block adds the two inputs, and squares
    the result, which is returned as the result of the inline assembly
    block):

    asm (v1, v2) {
        add
        mul.2
    } : felt
    
  • A generic data structure for use in representing the operand stack,
    and the common operations which are performed on it. This is used to
    validate inline assembly, as well as keep track of values during code
    generation. It will also likely be used for whatever evaluator we
    throw together for use in testing.

This commit introduces a new data type to represent local variable
declarations. These are not currently attached to anything, but are used
downstream during code generation. The local identifiers are also used
in the in-crate representation of Miden Assembly, coming in a follow-on
commit.
This commit fleshes out the implementation of inline assembly that we
had essentially stubbed out prior to this. The following are provided as
part of this implementation:

* A representation of the Miden Assembly instruction set, in crate,
  with the ops we intend to support in the near term. This is used to
  represent the contents of inline assembly, and is also used in the
  upcoming codegen backend as the target IR for Miden Assembly, as it
  is a flatter representation similar to Miden IR. The intent is to also
  implement a tiny evaluator for this subset of Miden Assembly for use
  in tests without having to emit MASM files, and run the Miden VM which
  is much more expensive. It also allows us to track the subset of Miden
  Assembly which we support in the compiler, as well as represent
  experimental changes that have not yet been implemented upstream.
* `MasmBuilder`, which is constructed from an `InstBuilder` via `inline_asm`,
  and plays a similar role to `FunctionBuilder`+`InstBuilder`, but in
  terms of Miden Assembly, in the context of constructing a single inline
  assembly block.
* Improved pretty printing of inline assembly. Now, an inline assembly
  block is printed like so (this block adds the two inputs, and squares
  the result, which is returned as the result of the inline assembly
  block):

  ```hir
  asm (v1, v2) {
      add
      mul.2
  } : felt
  ```
* A generic data structure for use in representing the operand stack,
  and the common operations which are performed on it. This is used to
  validate inline assembly, as well as keep track of values during code
  generation. It will also likely be used for whatever evaluator we
  throw together for use in testing.
@bitwalker bitwalker added this to the Compiler milestone Sep 7, 2023
@bitwalker bitwalker requested a review from jjcnn September 7, 2023 21:59
@bitwalker bitwalker self-assigned this Sep 7, 2023
Copy link
Contributor

@jjcnn jjcnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR ought to be split into at several smaller PRs. At least the following spring to mind:

  • One that introduces locals.
  • One that introduces Masm operations and their builders.
  • One that introduces the stack simulator/Masm evaluator.
  • Possibly a separate one for the actual inline assembly blocks.


use super::Type;

/// A strongly typed identifier for referencing locals associated with a function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code generator work I'm introducing builds on some of this functionality, e.g. the representation of local variables, ...

What's the intended relationship between Local/LocalId and Value? For instance, how do you obtain the Value that a Local represents when a function is called?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locals are implicitly created by use of the alloca instruction, which returns the address of a Local as a Value, which can then be used with the load and store instructions to read/write it. There is no way to directly create a Local, aside from using alloca. We don't actually reify a Local until code generation though, since it isn't necessary to represent in HIR, however they have a more important role in MASM.

hir/src/asm.rs Outdated
///
/// The inline assembly will be validated so that all other values on
/// the operand stack below the given arguments will remain on the stack
/// when the inline assembly finishes executing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description, as well as the proposed formatting

asm (v1, v2) {
    add
    mul.2
} : felt

suggests that an inline assembly block consumes a fixed number of stack elements (2 in the example) and always produces a single stack element.

I'm fine with the fixed number of inputs, but can we assume that there will always be a single result element? I would have assumed that not producing a result would be a valid behavior, and producing multiple result elements also seems ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that the behavior I describe is indeed allowed, but it's only described in the comments for the builder functions.

I think it would be a good idea to flesh out these kinds of assumptions either in the comments of InlineAsm, or in separate documentation. People will be looking for those descriptions there, not in the builder function comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline assembly blocks can support zero or more results, much like function calls. The output of an inline assembly block is given by the result type, which in the text format starts after the trailing :. For now though, that's specified using a single Type via the builders, so representing zero, one, or more results requires using an appropriate Type, e.g. Type::Unit, Type::Felt, or Type::Array(Box::new(Type::Felt), n) for 0, 1, or n-ary results.

I think it would be a good idea to flesh out these kinds of assumptions either in the comments of InlineAsm, or in separate documentation. People will be looking for those descriptions there, not in the builder function comments.

Could you clarify what assumptions you are referring to here? InlineAsm doesn't impose any restrictions on the number or type of results on its own, and what restrictions exist are imposed by the builders (either the inline_asm of InstBuilder, or by MasmBuilder), and their docs reflect those restrictions (unless I missed one somewhere). In general, it is expected that users of the IR (i.e. compiler frontends) are going to work through the builders, and not need to be aware of InlineAsm unless they are doing something low-level with the IR, I would be very surprised if the first thing someone looked at was that struct, vs the docs associated with the builders.

I'm hesitant to go too far down the path of separate high-level documentation at this stage, there is still a lot of instability in the IR that remains to be ironed out, and high-level docs are particularly prone to falling out of date, and can be difficult to maintain under a stream of constant change. I'm as keen to get those kind of docs in place as you are, but I want to focus on stabilizing the foundation first, then start building stuff on it.

That said, if you find that there are aspects of the implementation that are undocumented, I would like to make sure that is addressed. I will expand on the InlineAsm docs a bit to provide at least a little more overview on it's structure and usage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not talking about fully fledged documentation at this stage. What I'm talking about at this stage is much more low-level, and could be as simple as an unsorted list of design decisions that have been made along the way, and which could eventually be collated into actual user-facing documentation.

The alternative is to remember everything in our heads, and in my experience that always leads to problems later on.

hir/src/asm.rs Outdated Show resolved Hide resolved
hir/src/asm.rs Outdated Show resolved Hide resolved
hir/src/asm.rs Outdated
///
/// Traps if the conditional is not 0 or 1.
pub fn cswap(self) {
self.stack.drop();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that this keeps track of the stack height, but if the stack elements are Type objects, then how do you keep track of the elements when the swap is conditional?

AFAICT there is no way to keep track of that, so it should be made clear exactly what information we are keeping track of (is it just the stack height? If so, why are the stack elements of type Type?).

The same question applies to all the conditional ops.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is documented here (and the default implementation) is what happens to the operand stack in the VM when that operation is performed. When using a type stack, we're tracking that behavior in the abstract, and we are only primarily concerned with the stack height. I should note that we are only ever pushing Type::Felt in this case, so it doesn't actually matter what the conditional or the operands are, as it doesn't change the semantics in the slightest.

If we were to expand the use of type stacks to something more general purpose, then we'd likely override the implementation of certain functions in the trait, to tailor them to the semantics of a type checker (i.e. validating that the operands of an operation like cswap are the same). The default implementation simply performs the necessary stack maintenance.

hir/src/asm.rs Outdated
self.stack.dropn(8);
self.asm.push(self.ip, MasmOp::AssertEq);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There doesn't seem to be a builder function for the LocAddr operation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's intentional, I don't intend to support accessing locals from inline assembly at this time. If someone wants this, they can use alloca to allocate a local in the IR, and then pass the address as an argument to the inline assembly block.

hir/src/asm.rs Outdated Show resolved Hide resolved
hir/src/asm.rs Outdated Show resolved Hide resolved
hir/src/asm.rs Outdated Show resolved Hide resolved
hir/src/asm.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@jjcnn jjcnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I accidentally pressed 'Submit review' before I had finished my review of the stack simulator, so there's one more review comment.

hir/src/asm.rs Outdated Show resolved Hide resolved
@bitwalker
Copy link
Contributor Author

Thanks for the quick turnaround! I'm going through your notes now, and should have this turned back around shortly.

The [TypeRepr] enum is intended to represent how a specific [Type] will
be encoded in terms of field elements/words on the operand stack in
Miden. See the doc comments for details.

This is meant to provide a single source for information regarding type
layout in Miden. For types in linear memory, the encoding is slightly
different, as we must preserve the semantics of byte-addressability in
the IR. However, when placing values on the operand stack, the encoding
is modified to provide a more natural alignment between individual terms
and elements on the stack.

The specific representations in this commit are just an initial draft
based on details I'm aware of at this time, but we may make adjustments
as necessary.
The previous implementation did not provide us with an opportunity to
validate the state of the operand stack after the control flow
instruction is constructed. It also failed to account for changes to the
operand stack within the control flow instruction body.

This commit introduces a new implementation, one for if.true, and one
for the loop instructions, which provides a builder API for those
instructions. These builders ensure that upon construction, the operand
stack is consistent across all control flow edges leaving the
instruction, as well as checking other important invariants, such as
while.true requiring a boolean on top of the stack when entering/exiting.
@bitwalker
Copy link
Contributor Author

@jjcnn For Monday, I've updated the PR with a number of changes in response to your feedback. Aside from splitting up the asm module as requested, the most significant is related to the handling of the exec, syscall, if.true, while.true, and repeat builders. Most of the rest are small tweaks, doc additions/improvements. I've also added a test which demonstrates constructing a function with inline assembly that includes a loop; both to validate the API, and to test that it does what is expected. Each change is split into its own commit to make viewing them atomically easier.

I think it's worth putting a little additional time into polishing the handling of types in the MasmOpBuilder so that certain operations preserve type information in some edge cases (i.e. if you use add_u32 on a Type::Ptr value, it turns it into a Type::U32, that could potentially cause the operand stack consistency check to fail in some situations). I'd also like to ensure that cswap and cdrop operations require their operand types to be the same for a similar reason. I'll probably do that Monday, but that won't really change much in this PR.

Let me know if you have any further questions/feedback!

/// 1. There is a value of boolean type on top of the operand stack
/// 2. The abstract state of the operand stack, assuming the boolean just mentioned
/// has been popped, must be consistent with the state of the operand stack when the
/// loop was entered, as well as if the loop was skipped due to the conditional being
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit suggestion: "loop was entered, ..." -> "loop body is entered, ...".

/// Begins construction of a `repeat` loop, with an iteration count of `n`.
///
/// A `repeat` instruction requires no operands on the stack, and will execute the loop body `n` times.
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested addition: "The abstract state of the operand stack must be consistent when the loop body is entered, as well as when the loop is completed."

///
/// Upon finalization of the loop instruction, this state is used to validate
/// the effect of the loop body on the operand stack. For `repeat`, which is
/// unconditionally entered, no special validation is performed. However, for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't repeat loops be validated as well? The stack could grow with each iteration, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repeat.n can't produce dynamic (i.e. not statically analyzable) changes to the operand stack, it is statically expanded to n copies of the repeat body. In other words, if you make a modification to the stack, we know exactly how many times that modification will occur, so we can model the state of the stack across every iteration, and afterwards.

That said, I agree that some additional validation is needed here:

  • We want to ensure that, given S elements on the operand stack upon entry to the loop, that the top S elements on the stack along the loopback edge match those that were on the stack on entry, for all n iterations.
  • We want to ensure that if each iteration removes elements from the stack, that all n iterations can be performed without performing an invalid stack operation.

I'll do that before we merge this.

Copy link
Contributor

@jjcnn jjcnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments which probably won't require code changes, a few nits, and an issue regarding validation of repeat loops which looks like it needs fixing. Other than that it looks fine.

Copy link
Contributor

@jjcnn jjcnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving under the assumption that the repeat issue is fixed.

This implements a function on `Type` called `is_compatible_operand`,
whose purpose is to determine if a type is compatible with another
type in the context of a binary operator. Type compatibility is
defined here in terms of what types can be implicitly coerced to
the controlling type of the operator safely.

For example, one can safely apply `u32.add` to `u32` and `u8`,
where `u32` is the operand whose type "controls" the expected
type produced by the operation. Similarly, applying `u32.add`
to `u8` and `u8` is safe, albeit with no guarantees about catching
overflow of the `u8` range, both operands are `u32` compatible. On the
other hand, applying `u32.add` to `u32` and `i32` is not compatible,
because `u32.add` does not handle signed integers, not to mention there
is no way to handle negative results.

This commit also has a couple small tweaks to how a few of the predicate
functions are expressed to suppress some clippy warnings.
This commit extracts out the handle of stack affects for the inline
assembly builders, and addresses a number of places in which the types
were either not validated, or were incorrectly validated/applied.
@bitwalker bitwalker merged commit ae9736e into main Sep 14, 2023
2 checks passed
@bitwalker bitwalker deleted the bitwalker/inline-asm branch September 14, 2023 04:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants