diff --git a/spec/Candid.md b/spec/Candid.md index 576ac1c6..460ac342 100644 --- a/spec/Candid.md +++ b/spec/Candid.md @@ -96,6 +96,7 @@ This is a summary of the grammar proposed: | vec | record { ;* } | variant { ;* } + | dynamic ::= | func @@ -186,6 +187,7 @@ service : { **Note:** In a synchronous interpretation of functions, invocation of a oneway function would return immediately, without waiting for completion of the service-side invocation of the function. In an asynchronous interpretation of functions, the invocation of a `oneway` function does not accept a callback (to invoke on completion). + #### Structure A function type describes the list of parameters and results and their respective types. It can optionally be annotated to be *query*, which indicates that it does not modify any state and can potentially be executed more efficiently (e.g., on cached state). (Other annotations may be added in the future.) @@ -434,6 +436,32 @@ type tree = variant { } ``` +#### Dynamic + +The type `dynamic` represents a value of *dynamic* type. That is, the actual type of such a value is not fixed statically and can be anything at runtime. This can be used, for example, to express generic interfaces. + +``` + ::= ... | dynamic | ... +``` + +The type `dynamic` is convertible to and from any other type, in the spirit of *gradual* typing. In particular, this allows specifying generic [functions](#function-references) as parameters, such that any concrete function can be supplied. + + +##### Example + +The following interface to a key/value store would allow storing any Candid value. +``` +type key = text; +type value = dynamic; + +service store : { + put : (key, value) -> (); + get : (key) -> (?value); + foreach : (f : func (key, value) -> ()) -> (); +}; +``` +Note: Any function that accepts a key and a value can be passed to `foreach`. For example, a client might use this service to store values of type `nat`, and invoke `foreach` with a function of type `(text, nat) -> ()`. 
+ ### References @@ -567,6 +595,7 @@ The types of these values are assumed to be known from context, so the syntax do | vec { ;* } | record { ;* } | variant { } + | dynamic : ::= = @@ -853,6 +882,16 @@ variant { : ; ;* } <: variant { : ; *Note:* By virtue of the rules around `opt` above, it is possible to evolve and extend variant types that also occur in outbound position (i.e., are used both as function results and function parameters) by *adding* tags to variants, provided the variant itself is optional (e.g. `opt variant { 0 : nat; 1 : bool } <: opt variant { 1 : bool }`). Any party not aware of the extension will treat the new case as `null`. +#### Dynamic + +Any data type can be turned into a dynamic type. +``` + +--------------------- + <: dynamic +``` + + #### Functions For a specialised function, any parameter type can be generalised and any result type specialised. Moreover, arguments can be dropped while results can be added. That is, the rules mirror those of tuple-like records, i.e., they are ordered and can only be extended at the end. @@ -883,7 +922,7 @@ service { : ; ;* } <: service { : ; ### Coercion -This subtyping is implemented during the deserialisation of Candid at an expected type. As described in [Section Deserialisation](#deserialisation), the binary value is conceptually first _decoded_ into the actual type and a value of that type, and then that value is _coerced_ into the expected type. +The defined subtyping is implemented during the deserialisation of Candid at an expected type. As described in [Section Deserialisation](#deserialisation), the binary value is conceptually first _decoded_ into the actual type and a value of that type, and then that value is _coerced_ into the expected type. To model this, we define, for every `t1, t2` with `t1 <: t2`, a function `C[t1<:t2] : t1 -> t2`. This function maps values of type `t1` to values of type `t2`, and is indeed total. 
@@ -891,7 +930,7 @@ to describe these values, we re-use the syntax of the textual representation, an #### Primitive Types -On primitve types, coercion is the identity: +On primitive types, coercion is the identity: ``` C[ <: ](x) = x for every , bool, text, null ``` @@ -964,6 +1003,17 @@ C[variant { = ; _;* } <: variant { = ; _;* }](variant { = variant { = C[ <: ]() } ``` +#### Dynamic + +On the dynamic type, coercion is the identity: +``` +C[dynamic <: dynamic](x) = x +``` +Any other data type can be coerced to `dynamic`: +``` +C[ <: dynamic]() = dynamic : if =/= dynamic +``` + #### References @@ -1058,12 +1108,12 @@ Serialisation is defined by three functions `T`, `M`, and `R` given below. Most Candid values are self-explanatory, except for references. There are two forms of Candid values for service references and principal references: -* `ref(r)` indicates an opaque reference, understood only by the underlying system. +* `ref(r)`, indicates an opaque reference, understood only by the underlying system. * `id(b)`, indicates a transparent reference to a service addressed by the blob `b`. Likewise, there are two forms of Candid values for function references: -* `ref(r)` indicates an opaque reference, understood only by the underlying system. +* `ref(r)`, indicates an opaque reference, understood only by the underlying system. * `pub(s,n)`, indicates the public method name `n` of the service referenced by `s`. 
#### Notation @@ -1111,13 +1161,14 @@ T(opt ) = sleb128(-18) I() // 0x6e T(vec ) = sleb128(-19) I() // 0x6d T(record {^N}) = sleb128(-20) T*(^N) // 0x6c T(variant {^N}) = sleb128(-21) T*(^N) // 0x6b +T(dynamic) = sleb128(-25) i8(0) // 0x67 T : -> i8* T(:) = leb128() I() T : -> i8* T(func (*) -> (*) *) = - sleb128(-22) T*(*) T*(*) T*(*) // 0x6a + sleb128(-22) T*(*) T*(*) T*(*) // 0x6a T(service {*}) = sleb128(-23) T*(*) // 0x69 @@ -1170,7 +1221,7 @@ M(i : int) = i(signed_N^-1(i)) M(z : float) = f(z) M(b : bool) = i8(if b then 1 else 0) M(t : text) = leb128(|utf8(t)|) i8*(utf8(t)) -M(_ : null) = . +M(null : null) = . M(_ : reserved) = . // NB: M(_ : empty) will never be called @@ -1181,6 +1232,10 @@ M(v* : vec ) = leb128(N) M(v : )* M(kv* : record {*}) = M(kv : )* M(kv : variant {*}) = leb128(i) M(kv : *[i]) +M(dynamic v:t : dynamic) = leb128(|B((0,v) : t)|) leb128(|R(v : t)|) B((0,v) : t) +M(dynamic v:t : ) = M(C[t <: ](v) : ) if t <: =/= dynamic +M(v : dynamic) = M(dynamic v:t : dynamic) if v:t and v =/= dynamic v':t' + M : (, ) -> -> i8* M((k,v) : k:) = M(v : ) @@ -1195,6 +1250,12 @@ M(ref(r) : principal) = i8(0) M(id(v*) : principal) = i8(1) M(v* : vec nat8) ``` +Note: A value of type `dynamic` is serialised as a nested, self-contained Candid blob, as defined by the meta-function `B` [below](#parameters-and-results). + +A dynamic value can also be serialised with a regular type, as long as the types match; this amounts to a runtime type check and implicitly unwraps the value. +Conversely, a value of regular non-dynamic type can be serialised with type `dynamic`; this implicitly wraps the value (we assume here that the value's type `t` can be determined from the value or is known from context). +Together, the latter two rules implement a form of *gradual typing* for type `dynamic`. + #### References @@ -1203,7 +1264,7 @@ We assume that the fields in a record value are sorted by increasing id. ``` R : -> -> * -R(_ : ) = . +R( : ) = . R : -> -> * R(null : opt ) = . 
@@ -1212,6 +1273,10 @@ R(v* : vec ) = R(v : )* R(kv* : record {*}) = R(kv : )* R(kv : variant {*}) = R(kv : *[i]) +R(dynamic v:t : dynamic) = R(v : t) +R(dynamic v:t : ) = R(C[t <: ](v) : ) if t <: =/= dynamic +R(v : dynamic) = R(dynamic v:t : dynamic) if v:t and v =/= dynamic v':t' + R : (, ) -> -> * R((k,v) : k:) = R(v : ) @@ -1265,12 +1330,14 @@ Deserialisation at an expected type sequence `(,*)` proceeds by Deserialisation uses the following mechanism for robustness towards future extensions: -* A serialised type may be headed by an opcode other than the ones defined above (i.e., less than -24). Any such opcode is followed by an LEB128-encoded count, and then a number of bytes corresponding to this count. A type represented that way is called a *future type*. +* A serialised type may be headed by an opcode other than -1 to -24. Any such opcode is followed by an LEB128-encoded count, and then a number of bytes corresponding to this count. A type represented that way is called a *future type*. * A value corresponding to a future type is called a *future value*. It is represented by two LEB128-encoded counts, *m* and *n*, followed by a *m* bytes in the memory representation M and accompanied by *n* corresponding references in R. These measures allow the serialisation format to be extended with new types in the future, as long as their representation and the representation of the corresponding values include a length prefix matching the above scheme, and thereby allowing an older deserialiser not understanding them to skip over them. The subtyping rules ensure that upgradability is maintained in this situation, i.e., an old deserialiser has no need to understand the encoded data. +The type `dynamic` is the only future type so far. + ## Open Questions
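The framing that this diff gives `dynamic` (type opcode `sleb128(-25)` followed by a zero count, and a value prefixed by the two counts *m* and *n*) can be sketched in Python as below. This is a minimal illustration, not a Candid implementation: the helper names (`leb128`, `sleb128`, `read_leb128`, `encode_dynamic_value`, `skip_future_value`) are hypothetical, and the payload is an opaque placeholder standing in for the nested blob `B((0,v) : t)`, which is not constructed here.

```python
from typing import Tuple


def leb128(n: int) -> bytes:
    """Unsigned LEB128 encoding of a non-negative integer."""
    if n < 0:
        raise ValueError("leb128 expects a non-negative integer")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def sleb128(n: int) -> bytes:
    """Signed LEB128 encoding."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7  # Python's shift is arithmetic, so the sign is preserved
        sign_bit = bool(byte & 0x40)
        if (n == 0 and not sign_bit) or (n == -1 and sign_bit):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


def read_leb128(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode an unsigned LEB128 at buf[pos:]; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# T(dynamic) = sleb128(-25) i8(0): the opcode, then a zero-length future-type body.
TYPE_DYNAMIC = sleb128(-25) + leb128(0)


def encode_dynamic_value(nested_blob: bytes, num_refs: int = 0) -> bytes:
    """Frame a value as: count m, count n, then the m payload bytes."""
    return leb128(len(nested_blob)) + leb128(num_refs) + nested_blob


def skip_future_value(buf: bytes, pos: int) -> Tuple[bytes, int, int]:
    """Skip a future value without understanding it.

    Returns (payload bytes, reference count n, next position).
    """
    m, pos = read_leb128(buf, pos)
    n, pos = read_leb128(buf, pos)
    if pos + m > len(buf):
        raise ValueError("truncated future value")
    return buf[pos:pos + m], n, pos + m


# Placeholder payload (an assumption, not a real Candid blob), then a trailing byte.
payload = b"\x01\x02\x03"
stream = encode_dynamic_value(payload) + b"\xff"
skipped, ref_count, next_pos = skip_future_value(stream, 0)
```

Because both counts precede the payload, a deserialiser that does not know opcode -25 can still step over the value and continue with whatever follows it in the stream.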