De-duplicate payload from persisted beacon blocks #5671
Initial thoughts are that DB growth over time should only be affected by finalized blocks, so the modification could be limited accordingly. Existing parts of code that interact with
To recap a bit: Block DB input/output paths:
Storage schema: Use the same format / strategy for the archive and hot DB. Use the first byte of the payload as a version byte. This allows making the migration optional, or not doing it at all.
Inserting block: After this change blocks must always be inserted as blinded. In the import flow, we can compute the execution header from the struct value, which has cached hashing. Then merge those bytes with the serialized payload and persist (see the sketch after this list).
Serving blocks: For API and ReqResp requests:
For Regen replay:
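For illustration, a rough sketch of the "insert as blinded" conversion, assuming the SSZ types exported from @lodestar/types; the helper names here are hypothetical and not Lodestar's actual functions:

```ts
import {ssz, bellatrix} from "@lodestar/types";

// Hypothetical helper: derive the payload header from a full payload by replacing
// the transactions list with its hash tree root (field names follow the consensus spec).
function toPayloadHeader(payload: bellatrix.ExecutionPayload): bellatrix.ExecutionPayloadHeader {
  const {transactions, ...rest} = payload;
  return {...rest, transactionsRoot: ssz.bellatrix.Transactions.hashTreeRoot(transactions)};
}

// Hypothetical helper: blind a signed block before writing it to the block repository.
function toBlindedBlock(signed: bellatrix.SignedBeaconBlock): bellatrix.SignedBlindedBeaconBlock {
  const {executionPayload, ...body} = signed.message.body;
  return {
    signature: signed.signature,
    message: {
      ...signed.message,
      body: {...body, executionPayloadHeader: toPayloadHeader(executionPayload)},
    },
  };
}

// The blinded bytes are what would get persisted; the full payload stays with the execution node.
// const bytes = ssz.bellatrix.SignedBlindedBeaconBlock.serialize(toBlindedBlock(signedBlock));
```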
Should there be a CLI flag to turn the feature on?
If it's easy to implement, it's good to have in case there are issues in the future.
We can tell if a serialized execution payload is blinded or not by looking at the extra_data offset value, so there's no need for prefixes in the DB.
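A minimal sketch of that check, assuming the bellatrix layout; the byte positions and fixed lengths below are worked out from the SSZ spec for illustration only and would differ for capella and later forks:

```ts
// Bellatrix ExecutionPayload fixed part, per the consensus spec:
// parentHash(32) feeRecipient(20) stateRoot(32) receiptsRoot(32) logsBloom(256)
// prevRandao(32) blockNumber(8) gasLimit(8) gasUsed(8) timestamp(8)
// -> the 4-byte extraData offset starts at byte 436.
const EXTRA_DATA_OFFSET_POSITION = 436;
// Fixed-part length of the full payload (transactions is a 4-byte offset) vs the
// header (transactionsRoot is a 32-byte root). Values are illustrative, bellatrix only.
const FULL_PAYLOAD_FIXED_LENGTH = 508;
const BLINDED_PAYLOAD_FIXED_LENGTH = 536;

function isBlindedPayload(serialized: Uint8Array): boolean {
  const dv = new DataView(serialized.buffer, serialized.byteOffset, serialized.byteLength);
  // extraData is the first variable-length field, so its offset equals the fixed-part length
  const extraDataOffset = dv.getUint32(EXTRA_DATA_OFFSET_POSITION, true);
  return extraDataOffset === BLINDED_PAYLOAD_FIXED_LENGTH;
}
```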
@matthewkeil I've done a sketch of how this feature could be implemented, can you take a look and see if this approach makes sense to you? https://github.com/ChainSafe/lodestar/compare/dapplion/dedup-payloads?expand=1
Awesome! I will get this implemented when I switch back to this task. Should be this sprint.
Yep. Looks good @dapplion!! It's very similar to how I was doing it on my work branch. I will read through your changes carefully and make sure I limit the changes to just what you recommended. I found there were some places where the types do not line up when moving to FullOrBlindedBeaconBlock for the two Repositories, but those changes were pretty minimal. I'll message you when I start on this work again and will let you know if I have any questions as I go.
Note to self: Make sure that #5923 still works correctly during PR process
@dapplion here are the perf results from doing the splicing with and without deserializing the block first. The test file to check methodology is here: fullOrBlindedBlock
BlindedOrFull to full
phase0
✔ phase0 to full - deserialize first 16947.43 ops/s 59.00600 us/op - 9119 runs 0.606 s
✔ phase0 to full - convert serialized 2985075 ops/s 335.0000 ns/op - 1989539 runs 1.01 s
altair
✔ altair to full - deserialize first 10005.90 ops/s 99.94100 us/op - 3021 runs 0.410 s
✔ altair to full - convert serialized 3076923 ops/s 325.0000 ns/op - 1226301 runs 0.606 s
bellatrix
✔ bellatrix to full - deserialize first 6555.443 ops/s 152.5450 us/op - 9250 runs 1.57 s
✔ bellatrix to full - convert serialized 2450980 ops/s 408.0000 ns/op - 1043460 runs 0.606 s
capella
✔ capella to full - deserialize first 6236.319 ops/s 160.3510 us/op - 3144 runs 0.678 s
✔ capella to full - convert serialized 2469136 ops/s 405.0000 ns/op - 1035528 runs 0.606 s
BlindedOrFull to blinded
phase0
✔ phase0 to blinded - deserialize first 17687.53 ops/s 56.53700 us/op - 6073 runs 0.404 s
✔ phase0 to blinded - convert serialized 9523810 ops/s 105.0000 ns/op - 2525364 runs 0.505 s
altair
✔ altair to blinded - deserialize first 9639.483 ops/s 103.7400 us/op - 8749 runs 1.01 s
✔ altair to blinded - convert serialized 9708738 ops/s 103.0000 ns/op - 5107628 runs 1.01 s
bellatrix
✔ bellatrix to blinded - deserialize first 96.84429 ops/s 10.32585 ms/op - 82 runs 1.35 s
✔ bellatrix to blinded - convert serialized 98.84780 ops/s 10.11656 ms/op - 53 runs 1.04 s
capella
✔ capella to blinded - deserialize first 47.96520 ops/s 20.84845 ms/op - 21 runs 0.949 s
✔ capella to blinded - convert serialized 47.11033 ops/s 21.22677 ms/op - 36 runs 1.27 s
@matthewkeil what is the latest status of this feature?
Problem description
Since the merge, both the execution node and the Lodestar beacon node persist the block's execution payload in their DBs.
At an average block size of 100 KB, that's about 720 MB/day or 263 GB/year of redundant data we don't really need to store (see https://ycharts.com/indicators/ethereum_average_block_size). According to metrics, current Lodestar DB growth averaged over the last 30 days on a mainnet node without validators is 666 MB/day.
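For reference, a back-of-the-envelope check of those figures, assuming mainnet's 12-second slots and one block per slot:

```ts
// Rough DB growth estimate from the figures above.
const avgBlockSizeBytes = 100_000; // ~100 KB average block size
const blocksPerDay = (24 * 60 * 60) / 12; // 7200 slots/day at 12 s per slot
const bytesPerDay = avgBlockSizeBytes * blocksPerDay; // ~720 MB/day
const bytesPerYear = bytesPerDay * 365; // ~263 GB/year
console.log({mbPerDay: bytesPerDay / 1e6, gbPerYear: bytesPerYear / 1e9});
```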
Solution description
Instead, Lodestar should persist blinded blocks in its DB, and retrieve the payloads from the execution node on demand to comply with:
None of these operations are particularly time sensitive, so the added latency is not a deal breaker.
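One possible sketch of the on-demand retrieval, assuming the standard Engine API method engine_getPayloadBodiesByHashV1 and leaving out the JWT auth and error handling the engine endpoint requires:

```ts
// Illustrative only: fetch the payload body for a persisted blinded block so the
// full block can be reconstructed when served over the API or ReqResp.
async function fetchPayloadBody(engineUrl: string, blockHash: string): Promise<unknown> {
  const res = await fetch(engineUrl, {
    method: "POST",
    headers: {"content-type": "application/json"},
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "engine_getPayloadBodiesByHashV1",
      params: [[blockHash]],
    }),
  });
  const {result} = await res.json();
  // result[0] holds {transactions, withdrawals} for the requested block hash, or null
  return result[0];
}
```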
Additional context
No response