Add support for type dynamic #292

Open
wants to merge 7 commits into
base: master
Changes from 5 commits
82 changes: 76 additions & 6 deletions spec/Candid.md
@@ -96,6 +96,7 @@ This is a summary of the grammar proposed:
| vec <datatype>
| record { <fieldtype>;* }
| variant { <fieldtype>;* }
| dynamic

<reftype> ::=
| func <functype>
@@ -186,6 +187,7 @@ service : {

**Note:** In a synchronous interpretation of functions, invocation of a oneway function would return immediately, without waiting for completion of the service-side invocation of the function. In an asynchronous interpretation of functions, the invocation of a `oneway` function does not accept a callback (to invoke on completion).


#### Structure

A function type describes the list of parameters and results and their respective types. It can optionally be annotated to be *query*, which indicates that it does not modify any state and can potentially be executed more efficiently (e.g., on cached state). (Other annotations may be added in the future.)
@@ -434,6 +436,32 @@ type tree = variant {
}
```

#### Dynamic

The type `dynamic` represents a value of *dynamic* type. That is, the actual type of such a value is not fixed statically and can be anything at runtime. This can be used, for example, to express generic interfaces.

```
<constype> ::= ... | dynamic | ...
```

The type `dynamic` is convertible to and from any other type, in the spirit of *gradual* typing. In particular, this allows specifying generic [functions](#function-references) as parameters, such that any concrete function can be supplied.


##### Example

The following interface to a key/value store would allow storing any Candid value.
```
type key = text;
type value = dynamic;

service store : {
put : (key, value) -> ();
get : (key) -> (?value);
foreach : (f : func (key, value) -> ()) -> ();
};
```
Note: Any function whose type is compatible with `(key, value) -> ()` can be passed to `foreach`. For example, a client might use this service to store values of type `nat`, and invoke `foreach` with a function of type `(text, nat) -> ()`.


### References

@@ -567,6 +595,7 @@ The types of these values are assumed to be known from context, so the syntax do
| vec { <annval>;* }
| record { <fieldval>;* }
| variant { <fieldval> }
| dynamic <val> : <datatype>

<fieldval> ::= <nat> = <annval>

@@ -853,6 +882,24 @@ variant { <nat> : <datatype>; <fieldtype>;* } <: variant { <nat> : <datatype'>;
*Note:* By virtue of the rules around `opt` above, it is possible to evolve and extend variant types that also occur in outbound position (i.e., are used both as function results and function parameters) by *adding* tags to variants, provided the variant itself is optional (e.g. `opt variant { 0 : nat; 1 : bool } <: opt variant { 1 : bool }`). Any party not aware of the extension will treat the new case as `null`.


#### Dynamic

The dynamic type is reflexive, but also interchangeable with any other data type in both directions, which may involve a runtime check. This amounts to gradual typing.
```

------------------
dynamic <: dynamic


---------------------
dynamic <: <datatype>


---------------------
<datatype> <: dynamic
```
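
As a rough illustration of how the three rules above interact, the following Python sketch checks single rule applications over a deliberately tiny, hypothetical type representation (strings for primitives, `("vec", t)` for vectors). The names `direct_subtype` and `subtype_via` are illustrative only and not part of Candid, and the existing rules for other types are mostly omitted.

```python
# Hypothetical, simplified type representation (an assumption of this sketch):
# primitives are strings, "dynamic" is the new type, ("vec", t) is a vector of t.
DYNAMIC = "dynamic"


def direct_subtype(t1, t2):
    """Check t1 <: t2 by a single rule application, without transitivity."""
    if t1 == t2:
        return True  # reflexivity, which covers dynamic <: dynamic
    if t1 == DYNAMIC or t2 == DYNAMIC:
        return True  # dynamic <: <datatype> and <datatype> <: dynamic
    if isinstance(t1, tuple) and isinstance(t2, tuple):
        if t1[0] == "vec" and t2[0] == "vec":
            return direct_subtype(t1[1], t2[1])  # vectors are covariant
    return False


def subtype_via(t1, mid, t2):
    """Check t1 <: mid and mid <: t2, i.e. one transitive step through mid."""
    return direct_subtype(t1, mid) and direct_subtype(mid, t2)
```

In this sketch `direct_subtype("nat", "text")` is false, while `subtype_via("nat", "dynamic", "text")` is true: once transitivity is allowed, any two types become related through `dynamic`.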


#### Functions

For a specialised function, any parameter type can be generalised and any result type specialised. Moreover, arguments can be dropped while results can be added. That is, the rules mirror those of tuple-like records, i.e., they are ordered and can only be extended at the end.
@@ -883,15 +930,15 @@ service { <name> : <functype>; <methtype>;* } <: service { <name> : <functype'>;

### Coercion

This subtyping is implemented during the deserialisation of Candid at an expected type. As described in [Section Deserialisation](#deserialisation), the binary value is conceptually first _decoded_ into the actual type and a value of that type, and then that value is _coerced_ into the expected type.
The defined subtyping is implemented during the deserialisation of Candid at an expected type. As described in [Section Deserialisation](#deserialisation), the binary value is conceptually first _decoded_ into the actual type and a value of that type, and then that value is _coerced_ into the expected type.

To model this, we define, for every `t1, t2` with `t1 <: t2`, a function `C[t1<:t2] : t1 -> t2`. This function maps values of type `t1` to values of type `t2`, and is indeed total.

To describe these values, we re-use the syntax of the textual representation, and use the `<annval>` syntax (i.e. `(v : t)`) if necessary to resolve overloading.

#### Primitive Types

On primitve types, coercion is the identity:
On primitive types, coercion is the identity:
```
C[<t> <: <t>](x) = x for every <t> ∈ <numtype>, bool, text, null
```
@@ -964,6 +1011,22 @@ C[variant { <nat> = <t>; _;* } <: variant { <nat> = <t'>; _;* }](variant { <nat>
= variant { <nat> = C[<t> <: <t'>](<v>) }
```

#### Dynamic

On the dynamic type, coercion is the identity:
```
C[dynamic <: dynamic](x) = x
```
Any data type can be coerced to `dynamic`:
```
C[<t> <: dynamic](<v>) = dynamic <v> : <t> if <t> =/= dynamic
```
The inverse direction is only possible if the encapsulated value matches the target type, such that the corresponding coercion is defined:
```
C[dynamic <: <t>](dynamic <v'> : <t'>) = C[<t'> <: <t>](<v'>) if <t> =/= dynamic
```
Note: The encapsulated type `<t'>` is not known statically, so it cannot be decided statically whether `C[<t'> <: <t>]` is defined. Hence this amounts to a runtime type check.
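
The three coercion rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: types are plain strings, the `Dynamic` class, `CoercionError`, and `coerce` are hypothetical names, and `nat <: int` stands in for the rest of the existing subtyping rules. The failure case makes explicit that the coercion out of `dynamic` is partial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dynamic:
    """Hypothetical runtime form of `dynamic <v> : <t>`."""
    value: object  # the encapsulated value <v>
    type: str      # its actual type <t>, known only at runtime


class CoercionError(Exception):
    """Raised when no coercion is defined (the runtime type check fails)."""


def coerce(value, actual, expected):
    """Sketch of C[actual <: expected](value) for a few primitive types."""
    if actual == expected:
        return value  # identity, including C[dynamic <: dynamic]
    if expected == "dynamic":
        return Dynamic(value, actual)  # C[<t> <: dynamic] wraps the value
    if actual == "dynamic":
        # C[dynamic <: <t>] is defined only if the inner coercion is defined.
        return coerce(value.value, value.type, expected)
    if actual == "nat" and expected == "int":
        return value  # stand-in for the existing primitive rules
    raise CoercionError(f"no coercion from {actual} to {expected}")
```

For example, `coerce(coerce(5, "nat", "dynamic"), "dynamic", "int")` yields `5`, whereas asking for `"text"` instead of `"int"` raises `CoercionError`.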
Collaborator

That's a bit too sneaky; if we go this route we should explicitly define this function to be partial (it was total before), with suitable notation. Also the interaction with opt stuff needs to be clear, especially if we go back to making that more dynamic again.

How does this affect our formal guarantees? They can only hold when no dynamic is in use, right?

Contributor Author

Fair enough. It's probably necessary to distinguish gradual conversion from subtyping anyway, because transitively, subtyping collapses with the above rules, i.e., everything becomes a subtype of everything.

Contributor Author

@rossberg rossberg Dec 8, 2021

@nomeata, you are right, this was taking at least one shortcut too many. I pushed a change that should properly separate graduality from subtyping and coercions. Subtyping now only allows t <: dynamic, but not the other way round. Graduality itself is handled in de/serialisation itself. Let me know what you think.



#### References

@@ -1058,12 +1121,12 @@ Serialisation is defined by three functions `T`, `M`, and `R` given below.

Most Candid values are self-explanatory, except for references. There are two forms of Candid values for service references and principal references:

* `ref(r)` indicates an opaque reference, understood only by the underlying system.
* `ref(r)`, indicates an opaque reference, understood only by the underlying system.
* `id(b)`, indicates a transparent reference to a service addressed by the blob `b`.

Likewise, there are two forms of Candid values for function references:

* `ref(r)` indicates an opaque reference, understood only by the underlying system.
* `ref(r)`, indicates an opaque reference, understood only by the underlying system.
* `pub(s,n)`, indicates the public method name `n` of the service referenced by `s`.

#### Notation
@@ -1111,13 +1174,14 @@ T(opt <datatype>) = sleb128(-18) I(<datatype>) // 0x6e
T(vec <datatype>) = sleb128(-19) I(<datatype>) // 0x6d
T(record {<fieldtype>^N}) = sleb128(-20) T*(<fieldtype>^N) // 0x6c
T(variant {<fieldtype>^N}) = sleb128(-21) T*(<fieldtype>^N) // 0x6b
T(dynamic) = sleb128(-25) i8(0) // 0x67

T : <fieldtype> -> i8*
T(<nat>:<datatype>) = leb128(<nat>) I(<datatype>)

T : <reftype> -> i8*
T(func (<datatype1>*) -> (<datatype2>*) <funcann>*) =
sleb128(-22) T*(<datatype1>*) T*(<datatype2>*) T*(<funcann>*) // 0x6a
sleb128(-22) T*(<datatype1>*) T*(<datatype2>*) T*(<funcann>*) // 0x6a
T(service {<methtype>*}) =
sleb128(-23) T*(<methtype>*) // 0x69

@@ -1180,6 +1244,7 @@ M(?v : opt <datatype>) = i8(1) M(v : <datatype>)
M(v* : vec <datatype>) = leb128(N) M(v : <datatype>)*
M(kv* : record {<fieldtype>*}) = M(kv : <fieldtype>)*
M(kv : variant {<fieldtype>*}) = leb128(i) M(kv : <fieldtype>*[i])
M(dynamic v:t : dynamic) = leb128(|B((0,v) : t)|) leb128(|R(v : t)|) B((0,v) : t)

M : (<nat>, <val>) -> <fieldtype> -> i8*
M((k,v) : k:<datatype>) = M(v : <datatype>)
@@ -1195,6 +1260,8 @@ M(ref(r) : principal) = i8(0)
M(id(v*) : principal) = i8(1) M(v* : vec nat8)
```

Note: The type `dynamic` is serialised as a nested, self-contained Candid blob, as defined by the meta-function `B` [below](#parameters-and-results).
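
To make the layout of `M(dynamic v:t : dynamic)` concrete, here is a minimal Python sketch of the framing only: a LEB128 byte count, a LEB128 reference count, then the nested blob. The helper names `leb128` and `encode_dynamic` are hypothetical, and the nested blob is treated as opaque bytes that some existing encoder for `B` is assumed to have produced.

```python
def leb128(n: int) -> bytes:
    """Unsigned LEB128 encoding of a non-negative integer."""
    if n < 0:
        raise ValueError("leb128 is only defined for non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte
            return bytes(out)


def encode_dynamic(nested_blob: bytes, num_refs: int) -> bytes:
    """Frame an already-encoded B((0,v) : t) as the M-encoding of a dynamic value.

    nested_blob is assumed to be the output of B for the wrapped value;
    num_refs is |R(v : t)|, the number of references it carries.
    """
    return leb128(len(nested_blob)) + leb128(num_refs) + nested_blob
```

The two leading counts are what let a decoder that does not understand `dynamic` skip the value without parsing the nested blob.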


#### References

@@ -1211,6 +1278,7 @@ R(?v : opt <datatype>) = R(v : <datatype>)
R(v* : vec <datatype>) = R(v : <datatype>)*
R(kv* : record {<fieldtype>*}) = R(kv : <fieldtype>)*
R(kv : variant {<fieldtype>*}) = R(kv : <fieldtype>*[i])
R(dynamic v:t : dynamic) = R(v : t)

R : (<nat>, <val>) -> <fieldtype> -> <ref>*
R((k,v) : k:<datatype>) = R(v : <datatype>)
@@ -1265,12 +1333,14 @@ Deserialisation at an expected type sequence `(<t'>,*)` proceeds by

Deserialisation uses the following mechanism for robustness towards future extensions:

* A serialised type may be headed by an opcode other than the ones defined above (i.e., less than -24). Any such opcode is followed by an LEB128-encoded count, and then a number of bytes corresponding to this count. A type represented that way is called a *future type*.
* A serialised type may be headed by an opcode outside the range -1 to -24. Any such opcode is followed by an LEB128-encoded count, and then a number of bytes corresponding to this count. A type represented that way is called a *future type*.

* A value corresponding to a future type is called a *future value*. It is represented by two LEB128-encoded counts, *m* and *n*, followed by *m* bytes in the memory representation M and accompanied by *n* corresponding references in R.

These measures allow the serialisation format to be extended with new types in the future, as long as their representation and the representation of the corresponding values include a length prefix matching the above scheme, and thereby allowing an older deserialiser not understanding them to skip over them. The subtyping rules ensure that upgradability is maintained in this situation, i.e., an old deserialiser has no need to understand the encoded data.

The type `dynamic` is the only future type so far.
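
The skipping behaviour described above can be sketched as follows. This is an assumption-laden illustration, not the reference decoder: `read_leb128` and `skip_future_value` are hypothetical helpers, the input is a plain byte string, and the *n* references are only counted rather than consumed from a separate reference list.

```python
def read_leb128(buf: bytes, pos: int):
    """Decode an unsigned LEB128 integer at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def skip_future_value(buf: bytes, pos: int):
    """Skip a future value starting at pos.

    Reads the counts m and n, then jumps over m bytes of memory
    representation. Returns (next position, n), where n is the number of
    references the caller would additionally have to drop.
    """
    m, pos = read_leb128(buf, pos)
    n, pos = read_leb128(buf, pos)
    if pos + m > len(buf):
        raise ValueError("future value is longer than the remaining input")
    return pos + m, n
```

An older deserialiser following this scheme never inspects the *m* payload bytes, so it can step over a `dynamic` value and continue with whatever follows.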


## Open Questions
