This document gives a top-level view of the blsync PR in order to help the review/merge process. It describes the most important mechanisms and data structures and how they are used in this PR (and sometimes how they are going to be used later).
The current PR implements an MVP feature called blsync that can light sync the beacon chain from a beacon node that supports the light_client REST API namespace (Lodestar or Nimbus) and drive an EL node through the engine API. Its components are sometimes more general purpose, though, as they are also intended to be part of the new full-featured, PoS-capable Geth light mode. Note that a significant part of this PR (more specifically light.LightChain, merkle.MultiProof, sync.HeaderSync and sync.StateSync) is only used here in order to get the finalized block hash out of the beacon state. One option is to strip down the PR even further by removing this feature from the first version, as it may not be essential and takes a significant amount of code. On the other hand, it is still a nice-to-have feature, and the exact same beacon state syncing mechanism is going to be an essential part of light servers, so this feature is also a good test that helps us move toward the final goal.
This package defines passive data structures that represent the actual state of the beacon chain light syncing.
- CheckpointData is a starting point for light syncing (can be used to initialize a CommitteeChain). It can be obtained based on the beacon header's root hash, which is either hardcoded in the client or specified as a command line flag. It contains the sync committee of the given period and its beacon state merkle proof.
- CommitteeChain holds a validated series of SerializedCommittees and LightClientUpdates. It can validate SignedHeaders once it has been synced up to the required sync period. It is a key component of beacon light clients, but servers driven by full beacon nodes will also use it for storing and serving these structures. Though in the current PR a chain of sufficiently good LightClientUpdates is never updated, CommitteeChain is capable of replacing updates with better ones and even reorging if the better update proves a different next update (see comment at ForwardUpdateSync). Hopefully this feature will never be needed in practice (at least on mainnet), but propagating the best update chain is still good practice, and being able to recover from a serious attack reduces the feasibility of such an attack. (Note that currently the whole light syncing relies on an honest majority assumption, so it is less safe than general consensus, though AFAIK there are serious ongoing efforts to make sync committee signature fraud slashable, at least for the finalized chain.)
- LightChain is a beacon header chain with optionally associated partial beacon state proofs, which can be added separately after the header has been added. It keeps track of the canonical header chain, which can be externally set by SetHead. It also automatically keeps track of the section of the canonical chain where state proofs are also available.
- HeadValidator validates SignedHeaders with the current CommitteeChain and also implements a subscription mechanism that allows multiple subscriptions to new validated heads at different signer count levels. Note that currently we only have one subscription at the global signer threshold level (which is a command line parameter of blsync), but light servers will use separate subscriptions for propagating signed heads at signer count levels that are independent of the local threshold setting.
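The subscription idea described for HeadValidator can be sketched as follows. This is a minimal illustration, not the code from the PR: the type and method names (`signedHead`, `headValidator`, `Subscribe`, `Add`) are hypothetical, and signature validation against the committee chain is assumed to have already happened before `Add` is called.

```go
package main

import "fmt"

// signedHead is a hypothetical stand-in for an already validated SignedHeader.
type signedHead struct {
	Slot        uint64
	SignerCount int // number of sync committee members that signed
}

// subscription delivers heads that reach its own signer count threshold.
type subscription struct {
	minSigners int
	delivered  bool
	lastSlot   uint64
	callback   func(signedHead)
}

// headValidator is an illustrative sketch of threshold-based subscriptions.
type headValidator struct {
	subs []*subscription
}

// Subscribe registers a callback for heads signed by at least minSigners members.
func (v *headValidator) Subscribe(minSigners int, cb func(signedHead)) {
	v.subs = append(v.subs, &subscription{minSigners: minSigners, callback: cb})
}

// Add passes a head to every subscription whose threshold it meets and whose
// last delivered slot it improves on.
func (v *headValidator) Add(head signedHead) {
	for _, s := range v.subs {
		if head.SignerCount < s.minSigners {
			continue
		}
		if s.delivered && head.Slot <= s.lastSlot {
			continue
		}
		s.delivered, s.lastSlot = true, head.Slot
		s.callback(head)
	}
}

func main() {
	var v headValidator
	v.Subscribe(450, func(h signedHead) { fmt.Println("local head:", h.Slot) })
	v.Subscribe(300, func(h signedHead) { fmt.Println("propagate:", h.Slot) })
	v.Add(signedHead{Slot: 100, SignerCount: 320}) // only the 300 subscription fires
	v.Add(signedHead{Slot: 101, SignerCount: 480}) // both subscriptions fire
}
```

Keeping the threshold per subscription is what would let a server propagate heads at several signer count levels without touching the local threshold.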
This package defines passive data structures used by the light syncing process.
- SyncCommittee is a set of 512 BLS keys randomly selected by consensus for every 8192-slot sync period. It is required for validating the BLS signature aggregates of SignedHeaders and LightClientUpdates.
- SerializedCommittee is a serialized version of SyncCommittee.
- LightClientUpdate proves the root hash of the sync committee of the next period based on a header signed by a sufficient majority of the sync committee of the given period, plus a beacon state merkle proof of the next_sync_committee state field. A light client update is better (has a higher update score) if the header signature aggregate has more participants. The best update is a finalized update that has a supermajority signed header referencing a former header from the same sync period as finalized.
- Forks is a list of known chain forks that can determine the SigningRoot of any header based on header hash and fork version at the given epoch.
- Header is a beacon header.
- SignedHeader is a header signed by a subset of the canonical SyncCommittee for the given period. Note that the structure does not reference the committee itself, but the period is determined by the SignatureSlot field.
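The period arithmetic and the update ordering described above can be illustrated with a small sketch. The constants 8192 and 512 come from the text; everything else (`syncPeriod`, `updateScore`, the 2/3 supermajority rule and the exact tie-breaking) is an assumption made for illustration and is not taken from the PR.

```go
package main

import "fmt"

const (
	slotsPerSyncPeriod = 8192 // one sync period is 8192 slots
	syncCommitteeSize  = 512  // BLS keys per sync committee
)

// syncPeriod returns the sync period a slot (for example a SignatureSlot)
// belongs to.
func syncPeriod(slot uint64) uint64 { return slot / slotsPerSyncPeriod }

// updateScore is a hypothetical, simplified score of a light client update.
type updateScore struct {
	signerCount int
	finalized   bool // update references a finalized header from the same period
}

// hasSupermajority assumes a 2/3 threshold.
func (s updateScore) hasSupermajority() bool {
	return s.signerCount*3 >= syncCommitteeSize*2
}

// betterThan prefers finalized supermajority updates, then more signers.
func (s updateScore) betterThan(o updateScore) bool {
	sBest := s.finalized && s.hasSupermajority()
	oBest := o.finalized && o.hasSupermajority()
	if sBest != oBest {
		return sBest
	}
	return s.signerCount > o.signerCount
}

func main() {
	fmt.Println(syncPeriod(8191), syncPeriod(8192)) // 0 1
	a := updateScore{signerCount: 400, finalized: true}
	b := updateScore{signerCount: 500, finalized: false}
	fmt.Println(a.betterThan(b)) // true: finalized supermajority wins
}
```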
This package is a framework for network requests and syncing mechanisms. In the final light client implementation it will replace some parts of the les package (the request distributor/retriever).
- Scheduler is the main active component where sync modules and servers are registered. It implements a trigger mechanism that ensures that all sync modules get a chance to make network requests when necessary.
- Module is an interface for a syncing module. These modules are called whenever triggered by module or server events. They typically have direct references to passive data structures (and sometimes other modules). In each processing round they determine whether they can add new data to the structures or start new network requests whose results can be added if successful. When changes have been made that might make other additions or requests possible, they emit module trigger signals, triggering themselves and/or other modules for the next processing round. Their Process function always receives an Environment, which allows starting network requests and makes the current validated head and prefetch head available.
- Server wraps the abstract RequestServer (which is currently implemented by SyncServer) and adds timing/triggering mechanisms for request timeout and delay. Delay is not used currently but will be used later by a greatly simplified version of the flow control. Whenever a server is found to be unavailable for requests, it guarantees to send a server event trigger signal when it becomes available again.
- Environment is always passed to Module.Process and allows making network requests to the current set of servers (or a subset which has been recently triggered). It also makes the actual validated and prefetch heads available. The validated head is determined by HeadValidator while the prefetch head comes from HeadTracker.
- HeadTracker subscribes to the latest and signed head event streams of registered servers. Based on the latest heads it determines the current prefetch head, which is the (possibly unvalidated) latest head advertised by the majority of servers. The signed head events are passed to HeadUpdater (which passes them on to HeadValidator when CommitteeChain can validate them).
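The trigger mechanism can be pictured with a heavily simplified, single-threaded sketch: modules are processed in rounds, and a new round is scheduled as long as some module reports a change. All names here are hypothetical and the `Process` signature is reduced to a boolean result; the real interface receives an Environment and deals with network requests, timeouts and server events, none of which are modelled.

```go
package main

import "fmt"

// module is a reduced, hypothetical version of a sync module: Process
// returns true if it changed something that may enable further work.
type module interface {
	Process() bool
}

// scheduler runs processing rounds until no module triggers another one.
type scheduler struct {
	modules []module
}

func (s *scheduler) run() (rounds int) {
	for triggered := true; triggered; {
		triggered = false
		rounds++
		for _, m := range s.modules {
			if m.Process() {
				triggered = true
			}
		}
	}
	return rounds
}

// chainState is the shared passive data structure of the two toy modules.
type chainState struct {
	initialized    bool
	period, target uint64
}

// checkpointInit initializes the chain once.
type checkpointInit struct{ st *chainState }

func (m checkpointInit) Process() bool {
	if m.st.initialized {
		return false
	}
	m.st.initialized = true
	return true
}

// forwardSync advances one period per round once the chain is initialized.
type forwardSync struct{ st *chainState }

func (m forwardSync) Process() bool {
	if !m.st.initialized || m.st.period >= m.st.target {
		return false
	}
	m.st.period++
	return true
}

func main() {
	st := &chainState{target: 2}
	s := scheduler{modules: []module{forwardSync{st}, checkpointInit{st}}}
	fmt.Println(s.run(), st.period) // 4 2
}
```

Even though forwardSync is registered first and cannot do anything in the first round, the trigger emitted by checkpointInit gives it another chance in the next one.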
This package contains sync modules (all of them implement request.Module) that are not only used in the current PR but will also be used by the full-featured light client and/or its server.
- CheckpointInit checks if the CommitteeChain is initialized. If not, it checks if the necessary CheckpointData is in the database. If not, it checks if it can start a request to retrieve it. Finally it initializes CommitteeChain and emits a module trigger that starts ForwardUpdateSync.
- ForwardUpdateSync checks if any of the servers, based on their advertised head slots, are supposed to have LightClientUpdates that could be appended to the current CommitteeChain, then requests them and adds them if successful. Note that once serving this data is implemented, servers will also be able to advertise the update scores of their committee chain, and there is going to be another sync method that compares the received scores to the local chain and fetches better updates if available.
- HeadUpdater does not start any requests but is still a sync module so that it can be triggered whenever CommitteeChain is improved. All it does is receive SignedHeaders from the individual servers and pass them to HeadValidator when the CommitteeChain is synced.
- HeaderSync tries to sync the header chain of LightChain up to the current validated head (which is available through Environment). Once successful, it calls LightChain.SetHead. Once the head is synced, it can also reverse sync the canonical header chain back to an externally set "tail target" slot. Optionally it can also attempt to fetch the prefetch head, which is not made canonical yet but allows prefetching the state so that by the time the majority signature is available, all relevant data belonging to the head header is also available. Note that header prefetching is not used in the blsync setup because that setup prefetches entire beacon blocks and derives the header from those. Note that the current version always fetches headers one by one based on parent root, while reverse syncing older headers could be parallelized by fetching by number and checking parent roots later. This will be added later as it is not essential for the blsync setup, which reverse syncs a few hundred slots at most.
- StateSync fetches partial beacon state proofs with the specified CompactProofFormat for all canonical headers of LightChain and also for the prefetch head.
This package implements request functions for the beacon node REST API. Note that in the final light client implementation execution layer requests will also be implemented here (at which point it might be moved under another package and will replace some parts of the current les package).
- BeaconLightApi implements REST API requests.
- SyncServer wraps BeaconLightApi and implements request.RequestServer. Note that it is going to have more functionality once the request delays are used.
This package implements merkle proof related tools. Note that these tools do not care about actual data structure definitions but rather about handling, merging and trimming arbitrarily shaped multiproofs. Also note that while they also serve a purpose here, they will make even more sense in the final light client, where proof shapes are sometimes procedurally defined. One example is proving the execution block hash of a historical block through the current beacon state, the historical_roots tree, the old state_roots tree of the given period, and finally the old beacon header and the beacon state belonging to it. Another is servers syncing up these historical structures from each other, retrieving range proofs for larger sections. On the other hand, they might currently be used unnecessarily in some cases, for example when hashing a Header. In these cases (when there is a fixed and known data structure) github.com/protolambda/zrnt can be used (I want to change this in the current PR).
- ProofFormat, ProofReader, ProofWriter are abstractions for arbitrarily shaped beacon state proofs.
- CompactProofFormat is a proof format descriptor defined here and is used both for requesting and storing multiproofs. It is very compact as it only requires two bits per tree node (1/128th the size of the actual nodes) and is very easy to process.
- MultiProof is a partial beacon state proof with a CompactProofFormat and the corresponding list of tree nodes (Values).
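For readers less familiar with beacon state proofs, the simplest proof shape is a single branch. The sketch below recomputes a root from a leaf, its generalized index (root is 1, children of node i are 2i and 2i+1) and the sibling nodes listed from the leaf upward. It does not use or reproduce the ProofFormat, CompactProofFormat or MultiProof types of this package, whose multiproofs generalize this idea to arbitrary shapes; the names `node`, `hashPair` and `rootFromBranch` are hypothetical. The size claim above follows from a tree node being a 32-byte (256-bit) hash, so two descriptor bits are 2/256 = 1/128 of it.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// node is a 32-byte merkle tree node.
type node [32]byte

// hashPair returns the parent of two sibling nodes.
func hashPair(left, right node) node {
	return sha256.Sum256(append(left[:], right[:]...))
}

// rootFromBranch recomputes the root from a leaf, its generalized index and
// the sibling nodes ordered from the leaf level upward. The second result is
// false if the branch length does not match the depth of the index.
func rootFromBranch(leaf node, gindex uint64, branch []node) (node, bool) {
	cur := leaf
	for _, sibling := range branch {
		if gindex <= 1 {
			return node{}, false // branch is longer than the tree depth
		}
		if gindex&1 == 1 {
			cur = hashPair(sibling, cur) // cur is a right child
		} else {
			cur = hashPair(cur, sibling) // cur is a left child
		}
		gindex >>= 1
	}
	return cur, gindex == 1
}

func main() {
	// Four leaves at generalized indices 4..7.
	a, b, c, d := node{1}, node{2}, node{3}, node{4}
	n2, n3 := hashPair(a, b), hashPair(c, d)
	root := hashPair(n2, n3)

	// Prove leaf c (index 6) with its sibling d and its uncle n2.
	got, ok := rootFromBranch(c, 6, []node{d, n2})
	fmt.Println(ok && got == root) // true
}
```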
This package defines consensus constant parameters and beacon state field indices.
The main package of the blsync executable. The main function creates the chain structures, sets up the scheduler and the sync modules, registers a SyncServer for each beacon API URL specified on the command line, and then starts the scheduler.
The two sync modules defined in this package (both implement request.Module) are only used by blsync.
- beaconBlockSync retrieves full beacon blocks for the current validated and prefetch head (typically only the prefetch head if it gets validated later). When successful, it also extracts the Header from the block and adds it to LightChain so that the state proofs can also be prefetched in the ideal case.
- engineApiUpdater does not make any requests to the REST API but it calls the engine API whenever a new execution block is retrieved and validated.