Over the past two years, I have written posts about blockchains, Ethereum, and the RSK project. Many of them described personal experiments and projects, or proposals tried out on published open source code. And many times, I needed additional information in the Ethereum/RSK blockchain. For example, I could write about supporting more than one kind of virtual machine, or about supporting colored ether, but every such proposal raises the same need: having extra information in a Transaction or in an Account state.
Currently, Ethereum projects and derivatives, like Ethereum/RSK, encode the core entities using RLP (Recursive Length Prefix) encoding, specified in the Yellow Paper. Using a dedicated encoding ensures that every implementation obtains the same hash from the encoded form of an entity. For example, the hash of a block header is derived from its encoded representation.
Internally, each entity is serialized to a list of RLP encoded items. If an entity, say an AccountState, has two internal fields, balance and nonce, then its encoded representation is an RLP list with two items.
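To make this concrete, here is a minimal sketch of RLP encoding for byte strings and lists, applied to the simplified two-field AccountState of this post. The function names (`rlp_encode`, `encode_account_state`, and the helpers) are my own for illustration, not taken from any Ethereum or RSK codebase, and I assume integers are stored as big-endian bytes without leading zeros.

```python
def encode_length(length: int, offset: int) -> bytes:
    """Build the RLP prefix for a payload of the given length."""
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Encode a byte string or a (possibly nested) list of byte strings."""
    if isinstance(item, bytes):
        # A single byte below 0x80 is its own encoding.
        if len(item) == 1 and item[0] < 0x80:
            return item
        return encode_length(len(item), 0x80) + item
    if isinstance(item, list):
        payload = b"".join(rlp_encode(element) for element in item)
        return encode_length(len(payload), 0xC0) + payload
    raise TypeError("only bytes and lists can be RLP encoded")


def int_to_bytes(value: int) -> bytes:
    """Big-endian bytes without leading zeros; zero becomes the empty string."""
    if value == 0:
        return b""
    return value.to_bytes((value.bit_length() + 7) // 8, "big")


def encode_account_state(balance: int, nonce: int) -> bytes:
    """Hypothetical two-field AccountState: an RLP list with two items."""
    return rlp_encode([int_to_bytes(balance), int_to_bytes(nonce)])


encoded = encode_account_state(1000, 1)
print(encoded.hex())  # c48203e801
```

Every node that encodes the same balance and nonce gets exactly these bytes, and so the same hash.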
My proposal is: if you need an additional field or fields in a core entity, it should be discussed in the project community, like any other improvement proposal. But the way to implement any future extension is always the same. Take as an example the AccountState described above. Initially, its encoded form has TWO RLP items in a list. When you receive the encoded bytes, RLP can determine THE NUMBER OF items. If the number of items is TWO, then the encoded representation is the original one. BUT if the number of items is GREATER THAN TWO, the third item should be a byte (or two bytes) DECLARING THE VERSION of the encoding. Maybe version 1 for AccountState is the way to say that the FOURTH item is the color of the balance, and version 2 indicates that the FIFTH item represents another field added later, and so on.
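The decoding side of this idea can be sketched as follows. I assume the RLP layer has already turned the bytes into a list of integers, and the names `color`, `extra`, `ITEMS_PER_VERSION`, and `from_items` are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class AccountState:
    balance: int
    nonce: int
    color: int = 0  # hypothetical field added in version 1
    extra: int = 0  # hypothetical field added in version 2


# Total number of items expected for each version:
# two original fields, one version item, plus the added fields.
ITEMS_PER_VERSION = {1: 4, 2: 5}


def from_items(items: list) -> AccountState:
    """Rebuild an AccountState from its already decoded RLP items."""
    if len(items) < 2:
        raise ValueError("an AccountState needs at least balance and nonce")
    if len(items) == 2:
        # Original encoding: no version item, no added fields.
        return AccountState(items[0], items[1])
    version = items[2]
    expected = ITEMS_PER_VERSION.get(version)
    if expected is None:
        raise ValueError(f"unknown version {version}")
    if len(items) != expected:
        raise ValueError(f"version {version} requires {expected} items")
    # Items after the version item are the added fields, in order.
    return AccountState(items[0], items[1], *items[3:])


print(from_items([1000, 1]))           # original two-item form
print(from_items([1000, 1, 1, 7]))     # version 1, with a color
print(from_items([1000, 1, 2, 7, 9]))  # version 2, with color and extra
```

Old two-item encodings keep decoding exactly as before, because the item count alone tells the reader that no version item is present.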
When you serialize the AccountState, if the additional fields have their default values, you only serialize TWO ITEMS, WITHOUT the version field. In this way, the encoding is the same as the original one, and it is a normal form that avoids a loss of consensus (e.g. another node serializing the AccountState with THREE fields, with version 0; such a serialization should be forbidden). So, any AccountState (any combination of its internal state) has a NORMAL FORM of serialization. If the first additional field has a non-default value, and the second one has its default value, then the version included in the serialization should be 1 (ONE). A version could add many fields. My example only discusses the addition of one field in version 1 and another one in version 2.
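Here is a minimal sketch of that normal form, working on the list of items before RLP encoding. The fields `color` (version 1) and `extra` (version 2), their default value 0, and the function names are assumptions made for this example. I also assume that the version is chosen by the last field with a non-default value, so a default `color` is still written when `extra` is set.

```python
def to_items(balance: int, nonce: int, color: int = 0, extra: int = 0) -> list:
    """Return the normal form: the lowest version able to hold the state."""
    if extra != 0:
        return [balance, nonce, 2, color, extra]
    if color != 0:
        return [balance, nonce, 1, color]
    # All added fields are at their defaults: original two-item form,
    # without a version item.
    return [balance, nonce]


def is_normal_form(items: list) -> bool:
    """Check that received items are the only allowed form for their state."""
    if len(items) == 2:
        return True
    if len(items) == 4 and items[2] == 1:
        return items[3] != 0  # version 1 requires a non-default color
    if len(items) == 5 and items[2] == 2:
        return items[4] != 0  # version 2 requires a non-default extra
    return False  # e.g. three items with version 0 is forbidden


print(to_items(1000, 1))                 # [1000, 1]
print(to_items(1000, 1, color=7))        # [1000, 1, 1, 7]
print(is_normal_form([1000, 1, 0]))      # False
print(is_normal_form([1000, 1, 1, 0]))   # False
```

A node that receives a form rejected by `is_normal_form` should treat it as invalid. Otherwise two different byte sequences, and two different hashes, could describe the same state.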
This serialization strategy could be applied to core entities such as:
- Block Headers
- Account States
- Transaction Receipts
And when a new version is included, it does not affect the previous serialized entities.
I plan to write about some applications of these ideas in future posts.