Serve light client data #143

Open
etan-status opened this issue Oct 31, 2023 · 14 comments

@etan-status

To enable users to start from an older block hash, light client data should be made available via Checkpointz.

The light client sync protocol makes it possible to take a very old block root and transform it into a recent one, without requiring the user to manually copy/paste checkpoint block roots around.

This is more secure than genesis sync, and also more secure than downloading a random finalized state from a third party without validation.

Required APIs that Checkpointz would have to cache and serve:

  1. /eth/v1/beacon/light_client/bootstrap/{block_root} - about 25 KB each
    • Has to be cached for each finalized checkpoint, including past ones. These entries don't expire, so they should be persisted for a really long time (months).
  2. /eth/v1/beacon/light_client/updates - about 25 KB each
    • This is a range query with start+count. Historical entries (older than 256 epochs) don't change. The latest entry may change, but rarely. It is good enough to cache and serve only the historical entries (without the most recent one). These entries don't expire, so they should be persisted for a really long time (months).
    • Different clients may request different ranges of the data, and the cache should be prepared for that; e.g., cache each period separately so that requests for periods 20-40 and 15-35 can be served from the same database without requiring a separate cache for each possible range.
  3. /eth/v1/beacon/light_client/finality_update - about 2 KB
    • This changes each time a new finalized checkpoint is reached. The cache can expire every epoch (or every time finality changes), and only the most recent entry needs to be kept.
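
The per-period caching idea for the updates endpoint can be sketched as follows. This is a minimal in-memory illustration, not how Checkpointz does it: `PeriodCache` and the `fetch_period` callback are hypothetical names, and the sketch assumes callers only ask for historical periods (the most recent, still-changing period is out of scope).

```python
from typing import Callable, Dict, List


class PeriodCache:
    """Caches one light client update per sync committee period."""

    def __init__(self, fetch_period: Callable[[int], dict]):
        # fetch_period is a hypothetical callback that asks the backing
        # beacon node for the update of exactly one period.
        self._fetch_period = fetch_period
        self._by_period: Dict[int, dict] = {}

    def get_range(self, start_period: int, count: int) -> List[dict]:
        """Serve a start+count range query, fetching only missing periods."""
        updates = []
        for period in range(start_period, start_period + count):
            if period not in self._by_period:
                self._by_period[period] = self._fetch_period(period)
            updates.append(self._by_period[period])
        return updates
```

With this layout, a request for periods 20-40 followed by one for 15-35 only triggers backend fetches for periods 15-19 on the second call; the overlapping periods come from the same store.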

Bonus APIs for full light client data availability:

  1. /eth/v1/beacon/light_client/optimistic_update - about 1 KB
    • Changes every slot; not essential for checkpoint syncing but nice to have for light client syncing (e.g., web browser wallets).
    • Could be faked by using the finality_update response and omitting finalized_header and finality_branch in the response. This could work around client limitations when only a partial light client API is available on the server.
  2. /eth/v1/events
    • light_client_finality_update and light_client_optimistic_update
    • Push mechanism for finality_update and optimistic_update. Don't think checkpointz is used for that, but if eventstream is supported, would be great to also have access to these two topics. They provide the same data as /eth/v1/beacon/light_client/finality_update and /eth/v1/beacon/light_client/optimistic_update.
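
The "faked" optimistic update mentioned above could look roughly like this. It is a sketch under assumptions: `optimistic_from_finality` is a hypothetical helper, and the JSON response is assumed to carry the update fields either at the top level or under a `data` key.

```python
import copy


def optimistic_from_finality(finality_update: dict) -> dict:
    """Derive an optimistic-update-shaped response from a finality update."""
    result = copy.deepcopy(finality_update)  # leave the cached original intact
    # Assumption: the fields live under "data" if such an envelope exists.
    data = result.get("data", result)
    data.pop("finalized_header", None)
    data.pop("finality_branch", None)
    return result
```

The remaining fields are passed through unchanged, so the result is only as fresh as the finality update it was derived from.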

@etan-status
Author

image

Example here based on Nimbus. The server here serves the light client data that this issue proposes should be added to checkpointz. That enables a secure sync experience without having to manually pass any block root or state root. The extra endpoints allow the server to prove that what it is sending as the checkpoint state is not malicious. That reduces the trust assumption on the server to simple data availability.

@samcm
Member

samcm commented Mar 15, 2024

The main blocker for implementing this is that checkpointz doesn't store anything on disk, and is currently fairly light to run. Keen to investigate this.

@etan-status
Author

Thanks for checking on this!

Hmm. Maybe it's possible to get away without disk? As in, just forwarding the /light_client endpoints to the backing beacon node transparently (without the /eth/v1/events endpoint). If necessary some rate limiting could be added in front of it.
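
As an illustration of the rate limiting mentioned above, here is a minimal token bucket sketch that could sit in front of the forwarded routes. `TokenBucket` and its parameters are hypothetical names chosen for this example; nothing here reflects what Checkpointz actually uses.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows short bursts while capping the sustained request rate."""

    def __init__(self, rate_per_sec: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.clock = clock          # injectable so the limiter can be tested
        self.last = clock()

    def allow(self) -> bool:
        """Return True if one more request may be forwarded right now."""
        now = self.clock()
        # Refill according to elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A proxy handler would call `allow()` before forwarding a request and answer with an error such as HTTP 429 when it returns `False`.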

Keep in mind that so far only Lodestar and Nimbus support the /light_client endpoint.

@samcm
Member

samcm commented Mar 18, 2024

Yup acting as a proxy is certainly an option! Might make sense to check with current providers if that risk is acceptable, the security/load concerns are entirely restricted to Checkpointz atm 🙏

@philknows

Only able to speak for the ChainSafe checkpoints here, but we'd be happy to try this out and keep an eye on how our resource utilization changes when enabling this as a proxy to our public nodes. If this feature is kept optional, maybe providers can just choose to enable it if the endpoints are available, depending on what clients they're running.

@etan-status
Author

Another aspect to keep in mind is that the beacon node (BN) behind the checkpointz server should ideally be genesis synced. Light client data cannot be reliably backfilled until the corresponding protocols are in place:

@philknows would be great if your public nodes have old data available! Also, there's a move to collect canonical data that is deterministic across the backing BN, which simplifies syncing in the future (includes test cases). This way, swapping the backing BN should not result in different data being served:

@etan-status
Author

etan-status commented Apr 10, 2024

> Might make sense to check with current providers if that risk is acceptable

With both Nimbus and Lodestar deeming it fine regarding the load of simply exposing the routes as a transparent proxy, I think it's worth giving this a shot.

Namely, the following routes should be proxied:

The following route would not be proxied at this time as it is more expensive and not strictly necessary for syncing:

@etan-status
Author

@samcm Would it be possible to get the four light client routes (without events) exposed? Extra server load for this is minimal.
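
A proxy that exposes only these routes could gate requests with a small allowlist. The sketch below assumes the four routes are the light client endpoints named in the opening post, that block roots are 0x-prefixed 32-byte hex strings, and that the caller has already stripped the query string; `should_proxy` is a hypothetical helper name.

```python
import re

# Assumed route set: the four light client endpoints from the opening post.
LIGHT_CLIENT_ROUTES = [
    re.compile(r"^/eth/v1/beacon/light_client/bootstrap/0x[0-9a-fA-F]{64}$"),
    re.compile(r"^/eth/v1/beacon/light_client/updates$"),
    re.compile(r"^/eth/v1/beacon/light_client/finality_update$"),
    re.compile(r"^/eth/v1/beacon/light_client/optimistic_update$"),
]


def should_proxy(path: str) -> bool:
    """Return True if the path (without query string) may be forwarded."""
    return any(pattern.match(path) for pattern in LIGHT_CLIENT_ROUTES)
```

Anything outside the list, including `/eth/v1/events`, is rejected rather than forwarded.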

@samcm
Member

samcm commented Oct 18, 2024

@etan-status I've made some progress on this, should have something to test next week :)

@samcm
Member

samcm commented Oct 23, 2024

@etan-status How important is SSZ support for these endpoints? Is JSON ok for now?

@etan-status
Author

To obtain SSZ, use the HTTP Accept header: Accept: application/octet-stream selects the SSZ format.

If the light client routes are proxied transparently to the backing server, the server will take care of both SSZ and JSON. It is very cheap for servers to answer light client requests: they are direct lookups from a database, without expensive operations.
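
For illustration, a client could select the format like this. The sketch only builds the request and does not send it; `light_client_request` is a hypothetical helper, and the base URL in the usage note is a placeholder.

```python
import urllib.request


def light_client_request(base_url: str, path: str,
                         want_ssz: bool) -> urllib.request.Request:
    """Build a GET request that asks for SSZ or JSON via the Accept header."""
    accept = "application/octet-stream" if want_ssz else "application/json"
    return urllib.request.Request(base_url.rstrip("/") + path,
                                  headers={"Accept": accept})
```

For example, `light_client_request("http://localhost:5052", "/eth/v1/beacon/light_client/finality_update", want_ssz=True)` yields a request that could be passed to `urllib.request.urlopen`. A transparent proxy would forward the header unchanged and let the backing server pick the encoding.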

@samcm
Member

samcm commented Oct 23, 2024

Yeah we aren't blind proxying the response body - it's being parsed and then marshaled back to JSON (or SSZ), so we'd have to implement SSZ for those types.

@etan-status
Author

Generally, SSZ is about twice as efficient, as binary data won't be sent as hex strings. If possible, it would be great if SSZ were supported as well.

There's nothing special about the SSZ representation of these types that isn't already used by the /debug/states endpoint.
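
The "about twice as efficient" figure follows from hex encoding, which spends two characters per byte. The sketch below compares only the payload of fixed-size byte values (seven 32-byte values, an arbitrary count picked for illustration); it is not a full SSZ or JSON encoder and ignores field names and other framing.

```python
def json_hex_size(raw: bytes) -> int:
    """Size of a byte string as a JSON value: quotes, "0x", 2 chars per byte."""
    return len('"0x' + raw.hex() + '"')


# Seven 32-byte values, e.g. a root plus a few Merkle branch nodes.
values = [bytes(32)] * 7

binary_size = sum(len(v) for v in values)          # raw bytes, as in SSZ
json_size = sum(json_hex_size(v) for v in values)  # hex strings, as in JSON
ratio = json_size / binary_size

print(binary_size, json_size, ratio)
```

Each 32-byte value takes 68 characters as a JSON hex string, so the hex form is slightly more than twice the size of the binary form for this kind of data.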

@samcm
Member

samcm commented Oct 24, 2024

ethpandaops/checkpointz:0.0.6-light-client is ready for testing, just requires a config change to enable 🙏

e.g.

checkpointz:
  light_client:
    enabled: true
    mode: proxy

Definitely not mainnet ready yet. Also only supports JSON atm, unsure if I'll implement ssz at the moment.
