Onboarding is the process that happens when a new validateer is started and joins the set of validateers that have already been producing sidechain blocks for some time. Onboarding allows a new validateer to catch up on state and sidechain blocks so that it can then start producing sidechain blocks itself.
In this scenario, validateer 1 has been producing sidechain blocks for some time and is then joined by validateer 2. The following diagram illustrates the sequence from the point of view of validateer 2 undergoing the onboarding process.
- Onboarding starts by determining whether other running validateers have already produced sidechain blocks. This is done by querying the parentchain about the author of the last sidechain block. As a result, validateer 2 learns the identity and URI of validateer 1.
- Note that in case there is no previous sidechain block and therefore no author, a validateer concludes that it's the first (or primary) validateer.
- Validateer 2 then uses this information to fetch the state encryption key and state from this peer (validateer 1).
- Once this is done, validateer 2 registers itself on the parentchain and becomes part of the active (i.e. block producing) validateer set.
- Upon running the first slot cycle and importing sidechain blocks from the queue, validateer 2 notices it's missing sidechain blocks. Again it communicates with its peer, validateer 1, to directly fetch these missing blocks. As explained above, validateer 1 stores a certain number of sidechain blocks in a database. Once these peer-fetched blocks are transferred and imported, importing from the regular import queue resumes.
- And with that, validateer 2 is fully synced and can continue with the regular block production cycle, producing blocks itself now, every second slot.
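The steps above can be sketched as a small simulation. This is only an illustrative model, not the worker's actual implementation: the names `Peer`, `Parentchain`, `NewValidateer` and all of their methods are hypothetical, blocks are represented by plain block numbers, and the URIs are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Peer:
    """Hypothetical stand-in for an already running validateer (validateer 1)."""
    uri: str
    state_key: bytes        # stands in for the state encryption key
    latest_block: int = 0   # number of the last sidechain block produced

    def produce_blocks(self, count: int) -> None:
        self.latest_block += count

    def snapshot_block(self) -> int:
        # A snapshot is taken after each block, so it matches the latest block.
        return self.latest_block

    def blocks_after(self, block_number: int) -> List[int]:
        return list(range(block_number + 1, self.latest_block + 1))


@dataclass
class Parentchain:
    """Hypothetical view of the parentchain: last block author and registrations."""
    last_block_author: Optional[Peer] = None
    registered_uris: List[str] = field(default_factory=list)


@dataclass
class NewValidateer:
    """Hypothetical model of the validateer being onboarded (validateer 2)."""
    uri: str
    peer: Optional[Peer] = None
    state_key: Optional[bytes] = None
    synced_block: int = 0
    is_primary: bool = False

    def discover_peer(self, chain: Parentchain) -> None:
        # No author of a previous sidechain block means we are the primary.
        self.peer = chain.last_block_author
        self.is_primary = self.peer is None

    def fetch_key_and_state(self) -> None:
        if self.peer is not None:
            self.state_key = self.peer.state_key
            self.synced_block = self.peer.snapshot_block()

    def register(self, chain: Parentchain) -> None:
        chain.registered_uris.append(self.uri)

    def sync_missing_blocks(self) -> List[int]:
        # Fetch the blocks produced since the state snapshot directly from the peer.
        if self.peer is None:
            return []
        missing = self.peer.blocks_after(self.synced_block)
        if missing:
            self.synced_block = missing[-1]
        return missing


v1 = Peer(uri="ws://validateer-1", state_key=b"secret", latest_block=95)
chain = Parentchain(last_block_author=v1)
v2 = NewValidateer(uri="ws://validateer-2")

v2.discover_peer(chain)
v2.fetch_key_and_state()
v1.produce_blocks(5)              # blocks produced while validateer 2 is onboarding
v2.register(chain)
print(v2.sync_missing_blocks())   # block numbers fetched from the peer
```

In this model, the blocks that validateer 1 produces between the state fetch and the first slot cycle are exactly the ones validateer 2 has to fetch from its peer.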
In the onboarding process, both 'state' and sidechain blocks are transferred from one validateer to another. What is the difference and why are both required? In theory, all state information is contained in the sidechain blocks. In order to know what the current state is, all transactions/operations in the sidechain blocks from genesis can be computed and accumulated. This however means that computing the current state becomes more time-consuming as the blockchain grows. That is where the 'state' comes into play. The 'state' is a snapshot of the blockchain at a certain time (or block number). For example, in a blockchain with 100 blocks, a state snapshot might be taken at block number 95. So in order to know the latest state (at block number 100), the state snapshot is taken and the last 5 blocks are applied to this state. This dramatically reduces the amount of computation required to arrive at the latest state.
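The equivalence between replaying from genesis and replaying from a snapshot can be illustrated with a toy model. The sketch below assumes a deliberately simplified state (a mapping from account name to balance) and blocks that are lists of balance changes; `apply_block` and `replay` are hypothetical helpers, not part of the sidechain code.

```python
from typing import Dict, List, Tuple

# A block is modelled as a list of (account, balance change) operations.
Block = List[Tuple[str, int]]
State = Dict[str, int]


def apply_block(state: State, block: Block) -> State:
    """Return a new state with all operations of one block applied."""
    new_state = dict(state)
    for account, delta in block:
        new_state[account] = new_state.get(account, 0) + delta
    return new_state


def replay(state: State, blocks: List[Block]) -> State:
    """Apply a sequence of blocks, in order, on top of a starting state."""
    for block in blocks:
        state = apply_block(state, block)
    return state


# A chain of 100 blocks, each crediting one unit to "alice".
chain = [[("alice", 1)] for _ in range(100)]

from_genesis = replay({}, chain)                     # applies all 100 blocks
snapshot_at_95 = replay({}, chain[:95])              # what a snapshot at block 95 holds
from_snapshot = replay(snapshot_at_95, chain[95:])   # applies only the last 5 blocks

print(from_genesis == from_snapshot)
```

Both paths arrive at the same state, but the second one only has to process the blocks produced after the snapshot.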
In the current configuration, a sidechain state snapshot is taken after each sidechain block. So why do we need to transfer the state and then still fetch missing sidechain blocks? Because in between the time of fetching the state snapshot and starting the sidechain block production cycle, more sidechain blocks were already produced. Furthermore, only a certain number (currently 100) of sidechain blocks are cached in each validateer. This cache size has to be large enough to bridge the gap between receiving the state snapshot and starting the block production cycle. A cache size of 100 blocks and a block production time of 300 ms means that a gap of at most 30 s can be bridged, which imposes an upper limit on the duration of the onboarding process.
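The upper limit follows directly from multiplying the cache size by the block production time. A minimal sketch of that arithmetic, with hypothetical function and parameter names and the values from the text as defaults:

```python
def max_bridgeable_gap_ms(cache_size_blocks: int, block_time_ms: int) -> int:
    """Longest gap (in ms) that the peer's block cache can still cover."""
    return cache_size_blocks * block_time_ms


def can_bridge(onboarding_gap_ms: int,
               cache_size_blocks: int = 100,
               block_time_ms: int = 300) -> bool:
    """True if all blocks produced during the gap are still in the peer's cache."""
    return onboarding_gap_ms <= max_bridgeable_gap_ms(cache_size_blocks, block_time_ms)


print(max_bridgeable_gap_ms(100, 300))  # 30000 ms, i.e. 30 s
print(can_bridge(25_000))               # gap shorter than the limit
print(can_bridge(31_000))               # gap longer than the limit
```

If the block production time or the cache size changes, the bridgeable gap scales proportionally.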