From 46c58843eb8513602094d4476712e956ea764e85 Mon Sep 17 00:00:00 2001 From: Andreas Rossberg Date: Wed, 3 Nov 2021 11:46:33 +0100 Subject: [PATCH 1/7] Add type dynamic --- spec/Candid.md | 55 +++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 50 insertions(+), 5 deletions(-) diff --git a/spec/Candid.md b/spec/Candid.md index 576ac1c6..df284801 100644 --- a/spec/Candid.md +++ b/spec/Candid.md @@ -96,6 +96,7 @@ This is a summary of the grammar proposed: | vec | record { ;* } | variant { ;* } + | dynamic ::= | func @@ -434,6 +435,27 @@ type tree = variant { } ``` +#### Dynamic + +The type `dynamic` represents a value of *dynamic* type. That is, the actual type of such a value is not fixed statically and can be anything at runtime. This can be used, for example, to express generic interfaces. + +``` + ::= ... | dynamic | ... +``` + +##### Example + +The following interface to a key/value store would allow storing any Candid value. +``` +type key = text; +type value = dynamic; + +service store : { + put : (key, value) -> (); + get : (key) -> (?value); +}; +``` + ### References @@ -853,6 +875,16 @@ variant { : ; ;* } <: variant { : ; *Note:* By virtue of the rules around `opt` above, it is possible to evolve and extend variant types that also occur in outbound position (i.e., are used both as function results and function parameters) by *adding* tags to variants, provided the variant itself is optional (e.g. `opt variant { 0 : nat; 1 : bool } <: opt variant { 1 : bool }`). Any party not aware of the extension will treat the new case as `null`. +#### Dynamic + +The dynamic type does not exhibit any interesting subtyping rules. +``` + +------------------ +dynamic <: dynamic +``` + + #### Functions For a specialised function, any parameter type can be generalised and any result type specialised. Moreover, arguments can be dropped while results can be added. That is, the rules mirror those of tuple-like records, i.e., they are ordered and can only be extended at the end. @@ -883,7 +915,7 @@ service { : ; ;* } <: service { : ; ### Coercion -This subtyping is implemented during the deserialisation of Candid at an expected type. As described in [Section Deserialisation](#deserialisation), the binary value is conceptually first _decoded_ into the actual type and a value of that type, and then that value is _coerced_ into the expected type. +The defined subtyping is implemented during the deserialisation of Candid at an expected type. As described in [Section Deserialisation](#deserialisation), the binary value is conceptually first _decoded_ into the actual type and a value of that type, and then that value is _coerced_ into the expected type. To model this, we define, for every `t1, t2` with `t1 <: t2`, a function `C[t1<:t2] : t1 -> t2`. This function maps values of type `t1` to values of type `t2`, and is indeed total. @@ -891,7 +923,7 @@ to describe these values, we re-use the syntax of the textual representation, an #### Primitive Types -On primitve types, coercion is the identity: +On primitive types, coercion is the identity: ``` C[ <: ](x) = x for every , bool, text, null ``` @@ -964,6 +996,13 @@ C[variant { = ; _;* } <: variant { = ; _;* }](variant { = variant { = C[ <: ]() } ``` +#### Dynamic + +On dynamic types, coercion is again the identity: +``` +C[dynamic <: dynamic](x) = x +``` + #### References @@ -1058,12 +1097,12 @@ Serialisation is defined by three functions `T`, `M`, and `R` given below. Most Candid values are self-explanatory, except for references. There are two forms of Candid values for service references and principal references: -* `ref(r)` indicates an opaque reference, understood only by the underlying system. +* `ref(r)`, indicates an opaque reference, understood only by the underlying system. * `id(b)`, indicates a transparent reference to a service addressed by the blob `b`. Likewise, there are two forms of Candid values for function references: -* `ref(r)` indicates an opaque reference, understood only by the underlying system. +* `ref(r)`, indicates an opaque reference, understood only by the underlying system. * `pub(s,n)`, indicates the public method name `n` of the service referenced by `s`. #### Notation @@ -1111,6 +1150,7 @@ T(opt ) = sleb128(-18) I() // 0x6e T(vec ) = sleb128(-19) I() // 0x6d T(record {^N}) = sleb128(-20) T*(^N) // 0x6c T(variant {^N}) = sleb128(-21) T*(^N) // 0x6b +T(dynamic) = sleb128(-25) i8(0) // 0x67 T : -> i8* T(:) = leb128() I() @@ -1160,6 +1200,7 @@ Note: `M` maps an Candid value to a byte sequence representing that value. The definition is indexed by type. We assume that the fields in a record value are sorted by increasing id. +For values serialised at type `dynamic`, we assume that their concrete type `t` can either be determined from the value or is known from context. ``` M : -> -> i8* @@ -1180,6 +1221,7 @@ M(?v : opt ) = i8(1) M(v : ) M(v* : vec ) = leb128(N) M(v : )* M(kv* : record {*}) = M(kv : )* M(kv : variant {*}) = leb128(i) M(kv : *[i]) +M(v:t : dynamic) = T(t) M(v : t) M : (, ) -> -> i8* M((k,v) : k:) = M(v : ) @@ -1211,6 +1253,7 @@ R(?v : opt ) = R(v : ) R(v* : vec ) = R(v : )* R(kv* : record {*}) = R(kv : )* R(kv : variant {*}) = R(kv : *[i]) +R(v:t : dynamic) = R(v : t) R : (, ) -> -> * R((k,v) : k:) = R(v : ) @@ -1265,12 +1308,14 @@ Deserialisation at an expected type sequence `(,*)` proceeds by Deserialisation uses the following mechanism for robustness towards future extensions: -* A serialised type may be headed by an opcode other than the ones defined above (i.e., less than -24). Any such opcode is followed by an LEB128-encoded count, and then a number of bytes corresponding to this count. A type represented that way is called a *future type*. +* A serialised type may be headed by an other than -1 to -24 . Any such opcode is followed by an LEB128-encoded count, and then a number of bytes corresponding to this count. A type represented that way is called a *future type*. * A value corresponding to a future type is called a *future value*. It is represented by two LEB128-encoded counts, *m* and *n*, followed by a *m* bytes in the memory representation M and accompanied by *n* corresponding references in R. These measures allow the serialisation format to be extended with new types in the future, as long as their representation and the representation of the corresponding values include a length prefix matching the above scheme, and thereby allowing an older deserialiser not understanding them to skip over them. The subtyping rules ensure that upgradability is maintained in this situation, i.e., an old deserialiser has no need to understand the encoded data. +The type `dynamic` is the only future type so far. + ## Open Questions From b305e502bf229cbe19ea02bfa747a51c1938ad67 Mon Sep 17 00:00:00 2001 From: Andreas Rossberg Date: Wed, 3 Nov 2021 11:55:14 +0100 Subject: [PATCH 2/7] Change encoding --- spec/Candid.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/spec/Candid.md b/spec/Candid.md index df284801..7bc5021a 100644 --- a/spec/Candid.md +++ b/spec/Candid.md @@ -1200,7 +1200,6 @@ Note: `M` maps an Candid value to a byte sequence representing that value. The definition is indexed by type. We assume that the fields in a record value are sorted by increasing id. -For values serialised at type `dynamic`, we assume that their concrete type `t` can either be determined from the value or is known from context. ``` M : -> -> i8* @@ -1221,7 +1220,7 @@ M(?v : opt ) = i8(1) M(v : ) M(v* : vec ) = leb128(N) M(v : )* M(kv* : record {*}) = M(kv : )* M(kv : variant {*}) = leb128(i) M(kv : *[i]) -M(v:t : dynamic) = T(t) M(v : t) +M(v:t : dynamic) = B((0,v) : t) M : (, ) -> -> i8* M((k,v) : k:) = M(v : ) @@ -1237,6 +1236,9 @@ M(ref(r) : principal) = i8(0) M(id(v*) : principal) = i8(1) M(v* : vec nat8) ``` +Note: For values serialised at type `dynamic`, we assume that their concrete type `t` can either be determined from the value or is known from context. +Such a value is serialised as a nested, self-contained Candid blob, as defined by the meta-function `B` below (#parameters-and-results). + #### References From 24fa1c190761f9e0481782a80726e2f93eb433c0 Mon Sep 17 00:00:00 2001 From: Andreas Rossberg Date: Wed, 3 Nov 2021 12:08:35 +0100 Subject: [PATCH 3/7] Fix future type rep --- spec/Candid.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec/Candid.md b/spec/Candid.md index 7bc5021a..883272d2 100644 --- a/spec/Candid.md +++ b/spec/Candid.md @@ -1220,7 +1220,7 @@ M(?v : opt ) = i8(1) M(v : ) M(v* : vec ) = leb128(N) M(v : )* M(kv* : record {*}) = M(kv : )* M(kv : variant {*}) = leb128(i) M(kv : *[i]) -M(v:t : dynamic) = B((0,v) : t) +M(v:t : dynamic) = leb128(|B((0,v) : t)|) leb128(|R(v : t)|) B((0,v) : t) M : (, ) -> -> i8* M((k,v) : k:) = M(v : ) From 569e1f2d68ddeb40c3e738d43fccba66515e14fe Mon Sep 17 00:00:00 2001 From: Andreas Rossberg Date: Thu, 11 Nov 2021 08:07:16 +0100 Subject: [PATCH 4/7] Typo --- spec/Candid.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec/Candid.md b/spec/Candid.md index 883272d2..d0acb836 100644 --- a/spec/Candid.md +++ b/spec/Candid.md @@ -1310,7 +1310,7 @@ Deserialisation at an expected type sequence `(,*)` proceeds by Deserialisation uses the following mechanism for robustness towards future extensions: -* A serialised type may be headed by an other than -1 to -24 . Any such opcode is followed by an LEB128-encoded count, and then a number of bytes corresponding to this count. A type represented that way is called a *future type*. +* A serialised type may be headed by an opcode other than -1 to -24 . Any such opcode is followed by an LEB128-encoded count, and then a number of bytes corresponding to this count. A type represented that way is called a *future type*. * A value corresponding to a future type is called a *future value*. It is represented by two LEB128-encoded counts, *m* and *n*, followed by a *m* bytes in the memory representation M and accompanied by *n* corresponding references in R. From 751bdf6b1138b9cc6b249b66afa48ab64718a83b Mon Sep 17 00:00:00 2001 From: Andreas Rossberg Date: Fri, 3 Dec 2021 14:00:57 +0100 Subject: [PATCH 5/7] Gradual typing --- spec/Candid.md | 35 +++++++++++++++++++++++++++++------ 1 file changed, 29 insertions(+), 6 deletions(-) diff --git a/spec/Candid.md b/spec/Candid.md index d0acb836..c3173fe2 100644 --- a/spec/Candid.md +++ b/spec/Candid.md @@ -187,6 +187,7 @@ service : { **Note:** In a synchronous interpretation of functions, invocation of a oneway function would return immediately, without waiting for completion of the service-side invocation of the function. In an asynchronous interpretation of functions, the invocation of a `oneway` function does not accept a callback (to invoke on completion). + #### Structure A function type describes the list of parameters and results and their respective types. It can optionally be annotated to be *query*, which indicates that it does not modify any state and can potentially be executed more efficiently (e.g., on cached state). (Other annotations may be added in the future.) @@ -443,6 +444,9 @@ The type `dynamic` represents a value of *dynamic* type. That is, the actual typ ::= ... | dynamic | ... ``` +The type `dynamic` is convertable to and from any other type, in the spirit of *gradual* typing. In particular, this allows specifying generic [functions](#function-references) as parameters, such that any concrete function can be supplied. + + ##### Example The following interface to a key/value store would allow storing any Candid value. @@ -453,8 +457,10 @@ type value = dynamic; service store : { put : (key, value) -> (); get : (key) -> (?value); + foreach : (f : func (key, value) -> ()) -> (); }; ``` +Note: Any unary function can be passed to `map`. For example, a client might use this service to store values of type `nat`, and invoke `map` with a function of type `(text, nat) -> ()`. ### References @@ -589,6 +595,7 @@ The types of these values are assumed to be known from context, so the syntax do | vec { ;* } | record { ;* } | variant { } + | dynamic : ::= = @@ -877,11 +884,19 @@ variant { : ; ;* } <: variant { : ; #### Dynamic -The dynamic type does not exhibit any interesting subtyping rules. +The dynamic type is reflexive, but also interchangeable with any other data type in both directions, which may involve a runtime check. This amounts to gradual typing. ``` ------------------ dynamic <: dynamic + + +--------------------- +dynamic <: + + +--------------------- + <: dynamic ``` @@ -998,10 +1013,19 @@ C[variant { = ; _;* } <: variant { = ; _;* }](variant { #### Dynamic -On dynamic types, coercion is again the identity: +On the dynamic type, coercion is the identity: ``` C[dynamic <: dynamic](x) = x ``` +Any data type can be coerced to `dynamic`: +``` +C[ <: dynamic]() = dynamic : if =/= dynamic +``` +The inverse direction is only possible if the encapsulated value matches the target type, such that the corresponding coercions is defined: +``` +C[dynamic <: ](dynamic : ) = C[ <: ]() if =/= dynamic +``` +Note: Type `` is not known statically, so it cannot be decided statically whether `C[ <: ]` is defined. Hence this amouts to a runtime type check. #### References @@ -1157,7 +1181,7 @@ T(:) = leb128() I() T : -> i8* T(func (*) -> (*) *) = - sleb128(-22) T*(*) T*(*) T*(*) // 0x6a + sleb128(-22) T*(*) T*(*) T*(*) // 0x6a T(service {*}) = sleb128(-23) T*(*) // 0x69 @@ -1220,7 +1244,7 @@ M(?v : opt ) = i8(1) M(v : ) M(v* : vec ) = leb128(N) M(v : )* M(kv* : record {*}) = M(kv : )* M(kv : variant {*}) = leb128(i) M(kv : *[i]) -M(v:t : dynamic) = leb128(|B((0,v) : t)|) leb128(|R(v : t)|) B((0,v) : t) +M(dynamic v:t : dynamic) = leb128(|B((0,v) : t)|) leb128(|R(v : t)|) B((0,v) : t) M : (, ) -> -> i8* M((k,v) : k:) = M(v : ) @@ -1236,8 +1260,7 @@ M(ref(r) : principal) = i8(0) M(id(v*) : principal) = i8(1) M(v* : vec nat8) ``` -Note: For values serialised at type `dynamic`, we assume that their concrete type `t` can either be determined from the value or is known from context. -Such a value is serialised as a nested, self-contained Candid blob, as defined by the meta-function `B` below (#parameters-and-results). +Note: The type `dynamic` is serialised as a nested, self-contained Candid blob, as defined by the meta-function `B` below (#parameters-and-results). #### References From 14261c7970cc9c6f1f064b8f0ba6d8e4ff3e0ce0 Mon Sep 17 00:00:00 2001 From: Andreas Rossberg Date: Wed, 8 Dec 2021 09:54:13 +0100 Subject: [PATCH 6/7] Separate graduality from subtyping --- spec/Candid.md | 31 ++++++++++++++----------------- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/spec/Candid.md b/spec/Candid.md index c3173fe2..3d72c421 100644 --- a/spec/Candid.md +++ b/spec/Candid.md @@ -884,17 +884,9 @@ variant { : ; ;* } <: variant { : ; #### Dynamic -The dynamic type is reflexive, but also interchangeable with any other data type in both directions, which may involve a runtime check. This amounts to gradual typing. +Any data type can be turned into a dynamic type. ``` ------------------- -dynamic <: dynamic - - ---------------------- -dynamic <: - - --------------------- <: dynamic ``` @@ -1017,15 +1009,10 @@ On the dynamic type, coercion is the identity: ``` C[dynamic <: dynamic](x) = x ``` -Any data type can be coerced to `dynamic`: +Any other data type can be coerced to `dynamic`: ``` C[ <: dynamic]() = dynamic : if =/= dynamic ``` -The inverse direction is only possible if the encapsulated value matches the target type, such that the corresponding coercions is defined: -``` -C[dynamic <: ](dynamic : ) = C[ <: ]() if =/= dynamic -``` -Note: Type `` is not known statically, so it cannot be decided statically whether `C[ <: ]` is defined. Hence this amouts to a runtime type check. #### References @@ -1244,7 +1231,10 @@ M(?v : opt ) = i8(1) M(v : ) M(v* : vec ) = leb128(N) M(v : )* M(kv* : record {*}) = M(kv : )* M(kv : variant {*}) = leb128(i) M(kv : *[i]) -M(dynamic v:t : dynamic) = leb128(|B((0,v) : t)|) leb128(|R(v : t)|) B((0,v) : t) + +M(dynamic v:t : dynamic) = leb128(|B((0,v) : t)|) leb128(|R(v : t)|) B((0,v) : t) +M(dynamic v:t : ) = M(C[t <: ](v) : ) if t <: =/= dynamic +M(v : dynamic) = M(dynamic v:t : dynamic) if v:t and v =/= dynamic v':t' M : (, ) -> -> i8* M((k,v) : k:) = M(v : ) @@ -1262,6 +1252,10 @@ M(id(v*) : principal) = i8(1) M(v* : vec nat8) Note: The type `dynamic` is serialised as a nested, self-contained Candid blob, as defined by the meta-function `B` below (#parameters-and-results). +A dynamic value can also be serialised with a regular type, as long as the types match; this amounts to a runtime type check and implicitly unwraps the value. +Inversely, a value of regular non-dynamic type can be serialised with type `dynamic`; this implicitly wraps the value (we assume here that the value's type `t` can be determined from the value or is known from context). +Together, the latter two rules implement a form of *gradual typing* for type `dynamic`. + #### References @@ -1278,7 +1272,10 @@ R(?v : opt ) = R(v : ) R(v* : vec ) = R(v : )* R(kv* : record {*}) = R(kv : )* R(kv : variant {*}) = R(kv : *[i]) -R(v:t : dynamic) = R(v : t) + +R(dynamic v:t : dynamic) = R(v : t) +R(dynamic v:t : ) = R(C[t <: ](v) : ) if t <: =/= dynamic +R(v : dynamic) = R(dynamic v:t : dynamic) if v:t and v =/= dynamic v':t' R : (, ) -> -> * R((k,v) : k:) = R(v : ) From 585e694ffac53add56afbd2ff9c0de8c9e9a56c4 Mon Sep 17 00:00:00 2001 From: Andreas Rossberg Date: Wed, 8 Dec 2021 10:17:32 +0100 Subject: [PATCH 7/7] Avoid non-det overlap in rules --- spec/Candid.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/spec/Candid.md b/spec/Candid.md index 3d72c421..460ac342 100644 --- a/spec/Candid.md +++ b/spec/Candid.md @@ -1221,7 +1221,7 @@ M(i : int) = i(signed_N^-1(i)) M(z : float) = f(z) M(b : bool) = i8(if b then 1 else 0) M(t : text) = leb128(|utf8(t)|) i8*(utf8(t)) -M(_ : null) = . +M(null : null) = . M(_ : reserved) = . // NB: M(_ : empty) will never be called @@ -1264,7 +1264,7 @@ We assume that the fields in a record value are sorted by increasing id. ``` R : -> -> * -R(_ : ) = . +R( : ) = . R : -> -> * R(null : opt ) = .