Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested interfaces #372

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

macovedj
Copy link

Here is an outline for the various ways in which one could describe nested interfaces in wit.

For additional context re: motivation, see this issue, though generally what this enables is the ability to target nested instances which are expressible in wat and binary formats, and not in wit.

An initial implementation that still needs some work can be found here, though at the moment, it would only add support for the case of nesting interfaces from foreign packages.

Perhaps this PR can also be a place to hash out a bit of what the implications would be from a guest lang bindgen perspective as well.

Copy link
Collaborator

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Could this also update the pseudo-ebnf below too?

design/mvp/WIT.md Outdated Show resolved Hide resolved
Comment on lines +678 to +680
baz: interface {
...
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Syntactically I might bikeshed this as nest baz { ... } perhaps? (to keep the word "nest" around it and not have too too many keywords)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah funny, that's what I had originally, and changed it to this after discussing a bit with @lukewagner. Personally, I'm not so attached to either, but his thoughts I believe were that the syntax as I have it currently would match how anonymous interfaces in worlds are expressed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also be ok with nest a: interface { ... } but I do think that we'll want nest somewhere in here to align with the other nest syntaxes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see both arguments, but while writing nest a: interface { ... }; is more symmetric with nest foo:bar/baz; writing a: interface { ... }; is more symmetric with a: func() -> blah; (in interfaces) and import a: interface { ... }; (in worlds). Since the reason for having nest in the first place is because <interfacename>; is lexically ambiguous (due to the : inside <interfacename>) and also looks odd, and since with a: interface { ... }; there's no lexical ambiguity and it looks roughly like what it means in worlds, my vague preference is to go with the latter symmetry.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left as is for now, though I noticed Luke was using semicolons for anonymous interfaces, while we weren't using them before... not sure if people feel one way or another about that, but I don't think it's ambiguous to leave them out, similarly to how records are handled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I hadn't noticed the precedent for anonymous interfaces in worlds was to leave off the semicolon, so I'd go with that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coincidentally, working through the proposed additions in #308 made me realize that a semicolon after anonymous interfaces (in both interfaces and worlds) is probably a good idea for long-term future compatibility. See e.g. the code examples in this comment where I think you end up wanting the ; always.

@rossberg
Copy link
Member

rossberg commented Jun 26, 2024

It’s great seeing this expressivity gap in WIT addressed. However, I don’t think the proposed extension is the right solution for this. It is very special-cased and fails to address closely related or more general use cases. For example, it's quite natural to want to express an interface of the form

    interface {
       a1 : A
       a2 : A 
    }

where both sub elements have the same interface A, but without also defining a standalone A on the outer level. Similarly, a common case are components like

    world {
        import i : A;
        export e : A;
    }

where the same signature occurs for a generic import and an export. Or perhaps even only two imports:

    world {
        import a1 : A;
        import a2 : A;
    }

And endless other variations.

The actual root of the problem is that WIT — unlike the raw component system — conflates two entirely different forms of declaration, namely, the named definition of an interface type (an actual interface per se) and declaring the presence of an instance of an interface (the implementation of an interface).

For perspective, compare this to a regular programming language, where we usually distinguish

    type T = struct { a : A; b : B }

from

    export T : struct { a : A; b : B }

and these are entirely different things, declaring very different categories of T's. I doubt anybody would consider merging both into a single form of declaration a helpful thing to do.

An addition like the one proposed in this PR is not fixing this underlying category error, but arguably makes the conflation worse, very likely requiring further ad-hoc work-around features in the future.

The more adequate and scalable solution would be to properly separate these notions in WIT as well. That is, introduce a new form of declaration for naming an interface type, that then can be used in all places where an interface description would occur, but does not by itself declare an instance.

The example from the PR would become something along the lines of

    interface type Foo {
        ...
    }

    interface foo : Foo

    interface top {
        foo : Foo;
        bar : foreign:pkg/Bar;
        baz : interface {
            ...
        }
    }

but importantly, you don’t have to declare the existence of a global foo if all you actually want is to name the type Foo, because it's used in multiple places.

@macovedj
Copy link
Author

Hmm, can we talk about this in terms of how it would be encoded?
Given that today the interface syntax is encoded as follows:

interface foo {
  ...
}
(component
  (type (;0;)(instance 
  ...
  ))
  (export "ns:id/foo" (type 0))
)

We can see that interface is currently encoded as a type that is exported.

Note in the proposal, the nest keyword is exporting an instance rather than a type, which is a difference between nested interfaces and top level interfaces.

I may be misinterpreting, but it feels like the interface keyword currently behaves how I would intuitively expect the proposed interface type syntax to behave, and it feels like nest is implicitly operating in a way that feels similar to foo: Foo.

Given the current usage of interface, if we were to address the larger concerns here, would it maybe make more sense to go the opposite direction?

Thinking of interface as operating currently how interface type is proposed to behave, instead we could do something like

interface foo-type {
  ...
} 

foo: instance foo-type;

interface top {
  foo-inst: foo,
  top-foo-type: foo-type
}

Or using @rossberg's initial suggestion, not even introduce the instance keyword and just have

foo: foo-type

but allow it to act on an interface rather than an interface type

Then top has the liberty of exporting either or both top-foo-type as a type like what we currently see in top level interfaces, or a foo-inst as an instance, as is originally proposed in this PR. If I'm reading correctly, this feels like it accomplishes disambiguating without changing the current semantics of interface.

@lukewagner
Copy link
Member

@rossberg In your example WIT, you're giving all 3 nested interfaces a plainname, i.e., they turn into the following instance type:

(type $top (instance
  (export "foo" (instance ...))
  (export "bar" (instance ...))
  (export "baz" (instance ...))
))

That's fine, and we can quibble about the right syntax for addressing all the use cases you're getting at, but that is all quite independent from the use case we are trying to address in this PR which is that we want an export of an interfacename (that has been separately defined and named). e.g., we want:

(type $top (instance
  (export "wasi:http/incoming-handler" ...)
))

which is this PR is proposing looks like:

interface top {
  nest wasi:http/incoming-handler;
}

Notice that we are quite intentionally not assigning a locally made-up plainname; we want to export the nested instance by specifying only its interfacename. I don't think any of the syntax you're proposing addresses this use case?

Copy link
Member

@lukewagner lukewagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! Looks generally good to me; here's a few initial comments, I'll need to review a bit more carefully later.

design/mvp/WIT.md Outdated Show resolved Hide resolved
design/mvp/WIT.md Outdated Show resolved Hide resolved
Comment on lines +678 to +680
baz: interface {
...
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see both arguments, but while writing nest a: interface { ... }; is more symmetric with nest foo:bar/baz; writing a: interface { ... }; is more symmetric with a: func() -> blah; (in interfaces) and import a: interface { ... }; (in worlds). Since the reason for having nest in the first place is because <interfacename>; is lexically ambiguous (due to the : inside <interfacename>) and also looks odd, and since with a: interface { ... }; there's no lexical ambiguity and it looks roughly like what it means in worlds, my vague preference is to go with the latter symmetry.

@macovedj
Copy link
Author

macovedj commented Jul 1, 2024

Some of the other concerns raised in the PR that implements nesting foreign packages included if we've considered what this should look like from the perspective of guest bindings. Spitballing with rust, currently if we have the following wit

package foo:bar@1.0.0;

interface things {
  record my-record {
    foo: string
  }
}
world my-world {
  export things;
}

We end up with the following

#[allow(dead_code)]
pub mod exports {
  #[allow(dead_code)]
  pub mod foo {
    #[allow(dead_code)]
    pub mod bar {
      #[allow(dead_code, clippy::all)]
      pub mod things {
        ...
     }
   }
  }
}

Does it make sense to just continue to nest mods?
So using the examples above:

nest foo;
nest foreign:pkg/bar;
nest baz;
...

then things above could look as follows:

pub mod things {
  ...
  pub mod foo {
    ...
  }
  pub mod foreign {
    pub mod pkg {
      pub mod bar {
        ...
      }
    }  
  }
  pub mod baz {
    ...
  }
}

I'm guessing I may be missing something, though I'd guess that other languages namespace things based on packages/interfaces similarly...

@macovedj
Copy link
Author

macovedj commented Jul 7, 2024

I guess the other thoughts worth proposing for bindgen are more detailed mechanics of generating interfaces/traits.

I'd guess that if we have a situations where say if interface A has a generated trait definition GuestA and also nests interface B with a generated trait definition GuestB, then GuestA would probably have GuestB as an associated type, where the associated type is namespaced as outlined above.

@lukewagner
Copy link
Member

Good points! Because nested interfaces have a totally distinct identity (of contained types and functions) from any other nesting or top-level import/export of the same interfacename, I think you're right in your above Rust code to nest foreign:pkg/bar under things so that it's distinct from, e.g., an import foreign:pkg/bar in the same world. To your second question about traits, though: I think they would also need to be kept distinct so that there is a separate trait (which can have separate concrete resource types) for each nested occurrence of an interfacename.

Maybe not interesting, but one corner case worth considering is what if we have:

interface things {
  foreign: interface { ... };
  nest foreign:pkg/bar;
}

This is allowed b/c the first nested interface is given the plainname foreign and the second nested interface is given the interfacename foreign:pkg/bar (which are distinct and thus allowed) but with the above scheme, both will try to stick a pub mod foreign into the things module. I assume the right solution here is to handle this rare collision with some avoidance scheme (maybe add a suffix character to the foreign namespace). I believe this sort of collision also arises at world-level (replacing nest with import), so it'd be good to know what we do there and probably copy that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants