LIP: 0064
Title: Disallow non-required properties in Lisk codec
Author: Andreas Kendziorra <andreas.kendziorra@lightcurve.io>
Discussions-To: https://research.lisk.com/t/disallow-non-required-properties-in-lisk-codec/341
Status: Active
Type: Standards Track
Created: 2022-04-12
Updated: 2024-01-04
Replaces: 0027
Requires: 0055
This proposal defines a stricter version of Lisk codec - the serialization method introduced in LIP 0027. It is stricter by requiring that every schema property must be marked as required. This simplifies the rules significantly and makes the serialization method much less error-prone without implying any change in the current Lisk SDK implementation.
This LIP is licensed under the Creative Commons Zero 1.0 Universal.
The generic serialization method defined in LIP 0027, also referred to as Lisk codec, allows to have properties in a Lisk JSON schema that are NOT marked as required. The reason was to allow a single schema to be used to serialize an object with and without a signature property. For example, there is only one block header schema in LIP 0029 that is used for serializing unsigned block headers (needed for computing block header signatures) and for serializing signed block headers (needed, for example, for computing block IDs). However, the implementation of the Lisk SDK 5 does not make use of this flexibility. Instead, every time two different serialization methods (with and without signature) are needed, two different schemas are used, where in each schema every property was marked as required (one schema simply does not contain the signature property - see the block header schemas). Moreover, this flexibility comes with a couple of downsides:
- It is error-prone: LIP 0027 encourages to mark every property in a schema as required, as otherwise unexpected things like
encode(decode( binaryMsg )) != binaryMsg
for a binary messagebinaryMsg
can happen. But some developers may unintentionally miss this recommendation or may intentionally ignore it without foreseeing all consequences. - Lisk codec is supposed to be compatible with proto2 in the sense that protobuf implementations with the adequate
.proto
file deserialize valid binary messages correctly. However, this could not be achieved to 100% as even the proto2 specifications with regard to decoding missing optional fields (corresponds to non-required properties in Lisk codec) are not fixed. This is because it is not defined what the default value for message fields (corresponds to objects in Lisk codec) is. For proto3, it is mentioned that this is even language specific. - The rules for Lisk codec and the implementation are significantly more complex.
For these reasons, we propose to make the rules for Lisk codec more strict in the way that every property in a Lisk JSON schema must be marked as required. This will imply in particular that:
encode(decode( binaryMsg )) == binaryMsg
holds for every valid binary messagebinaryMsg.
- Every protobuf implementation will decode a valid binary message correctly.
- The rules, especially for decoding, become simpler.
The serialization and deserialization method defined in this document works as defined in LIP 0027 with exceptions and differences specified in the following subsections.
We make the definition of Lisk JSON schema more strict: A Lisk JSON schema is a JSON schema as defined in the Lisk JSON Schemas section in LIP 0027 with the additional requirement that
- every schema of type object must use the
required
keyword, and every property of the object must be contained in its value.
See below for examples.
The encoding rules are exactly the same as in LIP 0027.
The decoding rules become simpler compared to LIP 0027:
Only valid messages can be decoded (note that point 3.iii in the definition of valid binary message becomes obsolete with this proposal). For invalid binary messages, decoding fails. When a valid binary message is parsed, the binary message is decoded according to the proto2 specifications. In particular, whenever a binary message does not contain a specific field of type array, the corresponding field in the parsed object is set to the empty array. This holds for arrays using packed encoding as well as for arrays using non-packed encoding.
Implementations following this proposal are not totally backwards compatible with the one defined in LIP 0027 and implemented in Lisk SDK 5. That means, Lisk JSON schemas that are valid in the sense of LIP 0027 and that have properties not marked as required cannot be used for encoding and decoding. However, it is backwards compatible with valid Lisk JSON schemas in which every property is marked as required.
Conversely, every Lisk JSON schema valid in accordance with this proposal is also valid with LIP 0027, and the encoding and decoding rules are the same.
From the Lisk JSON schemas used in the protocol of Lisk SDK 5, only the block header schema is not valid with respect to this proposal. But this one is supposed to be superseded by LIP 0055 in which two schemas are used. One with the block header signature, the other one without.
Update lisk-codec to disallow additional or missing data while decoding
The root schema does not use the required
keyword:
{
"type": "object",
"properties": {
"foo": {
"dataType": "uint32",
"fieldNumber": 1
},
"bar": {
"dataType": "uint32",
"fieldNumber": 2
}
}
}
Not all properties of the root schema are listed in the value of required
:
{
"type": "object",
"required": ["foo"]
"properties": {
"foo": {
"dataType": "uint32",
"fieldNumber": 1
},
"bar": {
"dataType": "uint32",
"fieldNumber": 2
}
}
}