Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed length arrays #181

Open
LLFourn opened this issue Apr 11, 2023 · 24 comments · May be fixed by #384
Open

Fixed length arrays #181

LLFourn opened this issue Apr 11, 2023 · 24 comments · May be fixed by #384
Labels
enhancement New feature or request

Comments

@LLFourn
Copy link

LLFourn commented Apr 11, 2023

Rust has fixed length arrays whose type is written as [u8;32] for a 32-byte array. You can achieve this in the current wit specification by doing tuple<u8,u8 ... u8> (with 32 u8s). Could list<u8,32> or something be a first class type to avoid abusing tuples in this way even if it desugars to something similar?

@lukewagner
Copy link
Member

That does seem like a potentially good idea, and could be defined as another "specialization" (just like how tuple itself is a specialization of record). I feel like this was brought up in a previous issue or comment somewhere, but I can't seem to find it at the moment.

@sunfishcode
Copy link
Member

A similar issue that came up earlier was IPv6 addresses, which would want to use [u16; 8].

@lukewagner
Copy link
Member

Oh right, that was it, thanks. I suppose mostly this is just a question of prioritization and whether we think there are enough use cases to motivate adding this in the MVP time frame.

@cameron1024
Copy link

Just stumbled across this - I'd be very interested in these types for cryptographic hashes, in case there's a need for more motivating examples

@f5yacobucci
Copy link

Seconding @cameron1024 use case. I've been trying to understand wit, components, and wit-bindgen using the md5 algorithm. Being able to create md5.buffer: list<u8, 64> would be desireable.

@LLFourn
Copy link
Author

LLFourn commented Apr 27, 2023

This actually happened to be my use-case too. I have ids for certain resources that are referenced by host and guest that are the output of cryptographic hashes so always have a fixed length. It's a bit of a PITA having to check the length and copy the bytes into a fixed length array on both sides.

@lukewagner
Copy link
Member

It does seem like there are some good use cases for this and it's not a big addition if this is just a specialization of tuple, so it does seem like a good idea to add. That being said, I think maybe we should hold off on adding this before the Preview 2 milestone since there is already a bunch of other loose ends to wrap up in the milestone. But after that, I'd be happy to propose adding it.

@lukewagner lukewagner added the enhancement New feature or request label Apr 28, 2023
@badeend
Copy link
Contributor

badeend commented May 21, 2023

For reference: WebAssembly/interface-types#146. I think that issue was raised before the existence of tuples, though.

@ydnar
Copy link

ydnar commented Jan 16, 2024

I’m working on the Go bindings for WASI Preview 2 and the Component Model. The ipv4-address and ipv6-address types seem to be reasonable justification for fixed-with array types.

    type ipv4-address = tuple<u8, u8, u8, u8>;
    type ipv6-address = tuple<u16, u16, u16, u16, u16, u16, u16, u16>;

Could be:

    type ipv4-address = [8]u8;
    type ipv6-address = [8]u8;

(Go-style syntax)

It seems like a fixed-width array could just be syntactic sugar for tuple<t[, t...]> and despecialize into a record for the ABI.

@arjunr2
Copy link

arjunr2 commented Feb 14, 2024

Is there any update on supporting this feature? I have several similar use-cases to those pointed out above as well.

@badeend
Copy link
Contributor

badeend commented Feb 14, 2024

Do you have a specific use-case that tuples can't handle?

@arjunr2
Copy link

arjunr2 commented Feb 14, 2024

Tuples handle these use-cases but are rather messy to specify especially for larger fixed sizes. Something similar to @ydnar's solution would be useful.

@ydnar
Copy link

ydnar commented Feb 14, 2024

Monotypic tuples implemented as Go arrays here:

ydnar/wasm-tools-go@be654cd

image

@lukewagner
Copy link
Member

Now that Preview 2 has shipped, I think it makes sense to add this as an emoji-gated feature (to be ungated once widely implemented in Preview 2 runtimes and included in a subgroup-approved 0.2.* release). I can write a PR for this in a bit if noone does it first.

Syntactically in WIT and the C-M, I'm leaning toward list<T, N> (where N is an optional second operand to the list type constructor), since it extends the existing syntax for a list instead of creating a whole new syntactic construct, analogous to how other languages extend their existing list/array syntax with the optional length.

@ydnar
Copy link

ydnar commented Feb 15, 2024

I think the form of list<T, N> might be confusing, implying that a fixed-length array would despecialize into a list<T>, when in fact it has a completely different ABI representation (pass-by-value, no header with ptr and len). Given that the 2nd arg isn’t a type parameter, but a length, perhaps Go-style [n]t ([8]u16) or Rust style [u16;8]?

Other possible benefits of implementing fixed-length arrays:

  • Monotypic tuple<T[,T...]> would despecialize into fixed-length arrays of type T.
  • flags with > 64 flag values would despecialize into fixed-length arrays of u32.

@lukewagner
Copy link
Member

We already have the Canonical ABI varying significantly for specializations (e.g. flags is a specialization of record) and also for a single type constructor based on the immediate arguments (e.g., the byte-width of variants varies by the concrete number of cases), so I think we're not doing anything new, ABI-wise, by having list<T, N> have a totally different representation from list<T>. At the lifted semantics level, I think it does make sense to think of a fixed-length list as just a list: depending on your language, you may be able to ignore the N and think of the list<T,N> as just a list<T> (e.g., in JS, a list<T,N> would naturally produce a JS array, just like a list<T>).

@rossberg
Copy link
Member

  • Monotypic tuple<T[,T...]> would despecialize into fixed-length arrays of type T.

That's not what you want. Tuples are accessed with a static index and no runtime overhead, easily verifiable, whereas lists/arrays are accessed with a dynamic index and (usually) a runtime bounds check, at least in typed languages that distinguish the two. Different trade-offs that API designers supposedly pick consciously and an ABI should not mix up.

@ydnar
Copy link

ydnar commented Feb 16, 2024

  • Monotypic tuple<T[,T...]> would despecialize into fixed-length arrays of type T.

That's not what you want. Tuples are accessed with a static index and no runtime overhead, easily verifiable, whereas lists/arrays are accessed with a dynamic index and (usually) a runtime bounds check, at least in typed languages that distinguish the two. Different trade-offs that API designers supposedly pick consciously and an ABI should not mix up.

I think the gist of the OP (which I agree with) is that a fixed-length array is ABI identical to a tuple with a single type. Which isn’t the same as a list—as you correctly pointed out has a header, dynamic index, overhead, etc.

@rossberg
Copy link
Member

No, a list/array with fixed length is still indexed with a dynamic index, like any other list/array, which requires a check, like for any other list/array. The only difference is that you statically know one side of that check. A tuple is a different concept with no dynamic indexing nor checking.

@ydnar
Copy link

ydnar commented Feb 16, 2024

This thread is about Rust and Go-style fixed-length arrays, which are value types, not a list<t> with a fixed length, which would be represented as a slice or vector in Rust or Go.

Therefore the memory representation of a fixed-length array is identical to a tuple with a single type.

@lukewagner
Copy link
Member

Ah, I think I understand the disconnect: @ydnar is primarily concerned with the Canonical ABI representation here, specifically that, whatever we call it, our fixed-length-array is given the same "inline" representation as the N-ary tuples that it is replacing (and not a (ptr, length) pair pointing to separately-allocated memory). I didn't mention it above, but I agree this is the representation we want and this is totally doable even if we call it list<T, N> (the Canonical ABI is allowed to have totally different representations based on the particular operands of the type constructor).

lukewagner added a commit that referenced this issue Feb 19, 2024
lukewagner added a commit that referenced this issue Feb 19, 2024
lukewagner added a commit that referenced this issue Feb 19, 2024
lukewagner added a commit that referenced this issue Feb 19, 2024
@lukewagner
Copy link
Member

PR up in #304, PTAL

@dearfl
Copy link

dearfl commented Apr 9, 2024

I would approciate fix length list (like mac address, tuple<u8, u8, u8, u8, u8, u8> is just messy), however I believe fixed length array and dynamic length array would be very different from the ABI perspective?

like std::array<T, N> and std::vector<T> in C++?

std::array is N consecutive Ts lives on stack, while std::vector is a pointer point to a heap allocated buffer(along with a size & capacity member).

@lukewagner
Copy link
Member

Yes, the ABIs (as defined in #304) are totally different.

cpetig pushed a commit to cpetig/component-model that referenced this issue Aug 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants