Merge unstable deneb june 6th #4477

realbigsean · 2023-07-07T14:17:51Z

Issue Addressed

Which issue # does this PR address?

Proposed Changes

Please list or describe the changes introduced by this PR.

Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.

## Issue Addressed sigp#4314 ## Proposed Changes avoid relocking head during builder health check ## Additional Info NA

## Issue Addressed - sigp#4293 - sigp#4264 ## Proposed Changes *Changes largely follow those suggested in the main issue*. - Add new routes to HTTP API - `post_beacon_blocks_v2` - `post_blinded_beacon_blocks_v2` - Add new routes to `BeaconNodeHttpClient` - `post_beacon_blocks_v2` - `post_blinded_beacon_blocks_v2` - Define new Eth2 common types - `BroadcastValidation`, enum representing the level of validation to apply to blocks prior to broadcast - `BroadcastValidationQuery`, the corresponding HTTP query string type for the above type - ~~Define `_checked` variants of both `publish_block` and `publish_blinded_block` that enforce a validation level at a type level~~ - Add interactive tests to the `bn_http_api_tests` test target covering each validation level (to their own test module, `broadcast_validation_tests`) - `beacon/blocks` - `broadcast_validation=gossip` - Invalid (400) - Full Pass (200) - Partial Pass (202) - `broadcast_validation=consensus` - Invalid (400) - Only gossip (400) - Only consensus pass (i.e., equivocates) (200) - Full pass (200) - `broadcast_validation=consensus_and_equivocation` - Invalid (400) - Invalid due to early equivocation (400) - Only gossip (400) - Only consensus (400) - Pass (200) - `beacon/blinded_blocks` - `broadcast_validation=gossip` - Invalid (400) - Full Pass (200) - Partial Pass (202) - `broadcast_validation=consensus` - Invalid (400) - Only gossip (400) - ~~Only consensus pass (i.e., equivocates) (200)~~ - Full pass (200) - `broadcast_validation=consensus_and_equivocation` - Invalid (400) - Invalid due to early equivocation (400) - Only gossip (400) - Only consensus (400) - Pass (200) - Add a new trait, `IntoGossipVerifiedBlock`, which allows type-level guarantees to be made as to gossip validity - Modify the structure of the `ObservedBlockProducers` cache from a `(slot, validator_index)` mapping to a `((slot, validator_index), block_root)` mapping - Modify `ObservedBlockProducers::proposer_has_been_observed` to return a `SeenBlock` rather than a boolean on success - Punish gossip peer (low) for submitting equivocating blocks - Rename `BlockError::SlashablePublish` to `BlockError::SlashableProposal` ## Additional Info This PR contains changes that directly modify how blocks are verified within the client. For more context, consult [comments in-thread](sigp#4316 (comment)). Co-authored-by: Michael Sproul <michael@sigmaprime.io>

## Issue Addressed [sigp#4259](sigp#4259) ## Proposed Changes debounce spammy `Unable to send message to the beacon processor` log messages ## Additional Info We could potentially debounce other logs that have the potential to be "spammy". After some feedback we decided to additionally add the following change: create a newtype wrapper around `mpsc::Sender<BeaconWorkEvent<T>>`. When there is an error on the try_send method on the wrapper, we increase a counter metric with one label per work type.

## Issue Addressed [sigp#4292](sigp#4292) ## Proposed Changes Updated the node health endpoint will return a 200 status code if `!syncing && !el_offline && !optimistic` wil return a 206 if `(syncing || optimistic) && !el_offline` will return a 503 if `el_offline` ## Additional Info

…tion (sigp#4362) ## Issue Addressed sigp#4118 ## Proposed Changes This PR introduces a "progressive balances" cache on the `BeaconState`, which keeps track of the accumulated target attestation balance for the current & previous epochs. The cached values are utilised by fork choice to calculate unrealized justification and finalization (instead of converting epoch participation arrays to balances for each block we receive). This optimization will be rolled out gradually to allow for more testing. A new `--progressive-balances disabled|checked|strict|fast` flag is introduced to support this: - `checked`: enabled with checks against participation cache, and falls back to the existing epoch processing calculation if there is a total target attester balance mismatch. There is no performance gain from this as the participation cache still needs to be computed. **This is the default mode for now.** - `strict`: enabled with checks against participation cache, returns error if there is a mismatch. **Used for testing only**. - `fast`: enabled with no comparative checks and without computing the participation cache. This mode gives us the performance gains from the optimization. This is still experimental and not currently recommended for production usage, but will become the default mode in a future release. - `disabled`: disable the usage of progressive cache, and use the existing method for FFG progression calculation. This mode may be useful if we find a bug and want to stop the frequent error logs. ### Tasks - [x] Initial cache implementation in `BeaconState` - [x] Perform checks in fork choice to compare the progressive balances cache against results from `ParticipationCache` - [x] Add CLI flag, and disable the optimization by default - [x] Testing on Goerli & Benchmarking - [x] Move caching logic from state processing to the `ProgressiveBalancesCache` (see [this comment](sigp#4362 (comment))) - [x] Add attesting balance metrics Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>

We now officially have ipv6 support. The mainnet bootnodes have been updated to support ipv6. This PR updates lighthouse's internal bootnodes for mainnet to avoid fetching them on initial load.

## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info NA

## Issue Addressed sigp#4331 ## Proposed Changes - Use comparison rather than strict equality between the earliest epoch we know about and the backfill target (which will be the most recent WSP by default or genesis) - Add helper function `BackFillSync<T>::would_complete` to achieve this in one location ## Additional Info - There's an ad hoc test for this in sigp#4461 Co-authored-by: Age Manning <Age@AgeManning.com>

…m/realbigsean/lighthouse into merge-unstable-deneb-june-6th

## Issue Addressed [Users on Twitter](https://twitter.com/ashekhirin/status/1676334843192397824) are getting checkpoint sync URL timeouts with the default of 60s, so this PR increases the default timeout to 3 minutes. I've also added a short section to the book about adjusting the timeout with `--checkpoint-sync-url-timeout`.

*Replaces sigp#4434. It is identical, but this PR has a smaller diff due to a curated commit history.* ## Issue Addressed NA ## Proposed Changes This PR moves the scheduling logic for the `BeaconProcessor` into a new crate in `beacon_node/beacon_processor`. Previously it existed in the `beacon_node/network` crate. This addresses a circular-dependency problem where it's not possible to use the `BeaconProcessor` from the `beacon_chain` crate. The `network` crate depends on the `beacon_chain` crate (`network -> beacon_chain`), but importing the `BeaconProcessor` into the `beacon_chain` crate would create a circular dependancy of `beacon_chain -> network`. The `BeaconProcessor` was designed to provide queuing and prioritized scheduling for messages from the network. It has proven to be quite valuable and I believe we'd make Lighthouse more stable and effective by using it elsewhere. In particular, I think we should use the `BeaconProcessor` for: 1. HTTP API requests. 1. Scheduled tasks in the `BeaconChain` (e.g., state advance). Using the `BeaconProcessor` for these tasks would help prevent the BN from becoming overwhelmed and would also help it to prioritize operations (e.g., choosing to process blocks from gossip before responding to low-priority HTTP API requests). ## Additional Info This PR is intended to have zero impact on runtime behaviour. It aims to simply separate the *scheduling* code (i.e., the `BeaconProcessor`) from the *business logic* in the `network` crate (i.e., the `Worker` impls). Future PRs (see sigp#4462) can build upon these works to actually use the `BeaconProcessor` for more operations. I've gone to some effort to use `git mv` to make the diff look more like "file was moved and modified" rather than "file was deleted and a new one added". This should reduce review burden and help maintain commit attribution.

This reverts commit 246d52d.

## Proposed Changes * Add `lcli state-root` command for computing the hash tree root of a `BeaconState`. * Add a `--network` flag which can be used instead of `--testnet-dir` to set the network, e.g. Mainnet, Goerli, Gnosis. * Use the new network flag in `transition-blocks`, `skip-slots`, and `block-root`, which previously only supported mainnet. * **BREAKING CHANGE** Remove the default value of `~/.lighthouse/testnet` from `--testnet-dir`. This may have made sense in previous versions where `lcli` was more testnet focussed, but IMO it is an unnecessary complication and foot-gun today.

## Issue Addressed sigp#4494 ## Proposed Changes - Remove explicit re-exports of various types to appease the new compiler lint ## Additional Info It seems `warn(hidden_glob_reexports)` is the main culprit.

…rge-unstable-deneb-june-6th

## Issue Addressed Addresses issue: sigp#4459 ## Proposed Changes `cargo update -p proc-macro2` proc-macro2 v1.0.58 -> v1.0.63 ## Additional Info See: sigp#4459 / rust-lang/rust#113152

## Issue Addressed Fix an issue observed by `@zlan` on Discord where Lighthouse would sometimes return this error when looking up states via the API: > {"code":500,"message":"UNHANDLED_ERROR: ForkChoiceError(MissingProtoArrayBlock(0xc9cf1495421b6ef3215d82253b388d77321176a1dcef0db0e71a0cd0ffc8cdb7))","stacktraces":[]} ## Proposed Changes The error stems from a faulty assumption in the HTTP API logic: that any state in the hot database must have its block in fork choice. This isn't true because the state's hot database may update much less frequently than the fork choice store, e.g. if reconstructing states (where freezer migration pauses), or if the freezer migration runs slowly. There could also be a race between loading the hot state and checking fork choice, e.g. even if the finalization migration of DB+fork choice were atomic, the update could happen between the 1st and 2nd calls. To address this I've changed the HTTP API logic to use the finalized block's execution status as a fallback where it is safe to do so. In the case where a block is non-canonical and prior to finalization (permanently orphaned) we default `execution_optimistic` to `true`. ## Additional Info I've also added a new CLI flag to reduce the frequency of the finalization migration as this is useful for several purposes: - Spacing out database writes (less frequent, larger batches) - Keeping a limited chain history with high availability, e.g. the last month in the hot database. This new flag made it _substantially_ easier to test this change. It was extracted from `tree-states` (where it's called `--db-migration-period`), which is why this PR also carries the `tree-states` label.

## Issue Addressed When trying to run `eth1-sim` locally, the simulator doesn't start for me, and panicked due to duplicate arg names for `proposer-nodes` (using same arg names as `nodes`). Not sure why this isn't failing on CI but failing on mine 🤔 ``` thread 'main' panicked at 'Argument short must be unique thread 'main' panicked at 'Argument long must be unique ```

## Issue Addressed sigp#4488 ## Proposed Changes - Remove all instances of the `required` modifier where we have a default value specified for a subcommand ## Additional Info N/A

## Issue Addressed Fixes occasional compilation errors with mev-rs (see sigp#4456). ## Proposed Changes - Update `mev-rs` to the latest version, which allows us to remove hacky `[patch]` sections - Update the `axum` version used in `watch` so LH only uses a single version

## Proposed Changes Replace `wget` in the EF-tests makefile with `curl`. On macOS `curl` is pre-installed, and I found myself making this change to avoid installing `wget`. The `-L` flag is used to follow redirects which is useful if a repo gets renamed, and more similar to `wget`'s default behaviour.

## Issue Addressed Addresses an issue where CI could fail due to an nonexisting file error: ``` Run ./clean.sh rm: cannot remove '/home/runner/.lighthouse/local-testnet/geth_datadir4/geth/fastcache.tmp.1549331618': No such file or directory Error: Process completed with exit code 1. ``` This seems to happen quite frequently now, I'm not sure exactly why but perhaps worth trying suppressing the prompt? https://github.com/sigp/lighthouse/actions/runs/5455027574/jobs/9925916159?pr=4463

## Issue Addressed This PR attempts to workaround the recent frequent eth1 simulator failures caused by missing eth logs from Anvil. > FailedToInsertDeposit(NonConsecutive { log_index: 1, expected: 0 }) This usually occurs at the beginning of the tests, and it guarantees a timeout after a few hours if this log shows up, and this is currently causing our CIs to fail quite frequently. Example failure here: https://github.com/sigp/lighthouse/actions/runs/5525760195/jobs/10079736914 ## Proposed Changes The quick fix applied here adds a timeout to node startup and restarts the node again. - Add a 60 seconds timeout to beacon node startup in eth1 simulator tests. It takes ~10 seconds on my machine, but could take longer on CI runners. - Wrap the startup code in a retry function, that allows for 3 retries before returning an error. ## Additional Info We should probably raise an issue under the Anvil GitHub repo there so this can be further investigated.

## Issue Addressed N/A ## Proposed Changes Add lints for rust 1.71 [3789134](sigp@3789134) is probably the one that needs most attention as it changes beacon state code. I changed the `is_in_inactivity_leak ` function to return a `ArithError` as not all consumers of that function work well with a `BeaconState::Error`.

## Issue Addressed Speed up CI by installing foundry with Github action instead of building Anvil from source. Building anvil from source on GItHub hosted runners currently takes about 10 mins. Using the `foundry-toolchain` action to install only takes about 2 seconds.

…sts (sigp#4485) ## Issue Addressed n/a Noticed this while working on something else ## Proposed Changes - leverage the appropriate types to avoid a bunch of `unwrap` and errors ## Additional Info n/a

jimmygchen · 2023-07-17T07:53:08Z

@realbigsean
I made a quick attempt to merge latest unstable into this branch before I realized you may have already done some of the work on the 14th July branch!

The branch builds and passes lint, but I'm not sure about the correctness of the merge since it appeared to be a tricky one. I've added comments on places where I added additional code due to the BeaconProcessor refactor. Feel free to use it only if it helps!

realbigsean#37

…rge-unstable-deneb-jul-14

…albigsean/lighthouse into merge-unstable-deneb-jul-14

Merge unstable deneb jul 14

pawanjay176

LGTM. I mainly looked at changes in deneb stuff resulting from the merge and paid some extra attention to changes resulting from:

Broadcast validation PR
Changes in beacon processor because of moving it to a separate module

I'm satisfied with the changes. Amazing work with the changes, the conflicts would have been nasty.
Feel free to merge (i think there's one conflict remaining still)

pawanjay176 · 2023-07-17T18:21:29Z

beacon_node/beacon_chain/src/beacon_chain.rs

@@ -2798,6 +2831,9 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
            notify_execution_layer,
        )?;

+        //TODO(sean) error handling?


Probably something that makes more sense in the original PR. I say we keep it as is here and change it only if required for something deneb specific.
Also, the errors are being handled at the calling site as well, so not too much of an issue imo.

Yea I think the thing here was I was wondering if I should pass the error through handle_block_error, but that method just prints out a log. I think publish_fn handles the logging fine, there's only a couple error cases and they're logged or returned via API errors

pawanjay176 · 2023-07-17T18:50:58Z

@realbigsean merged deneb-free-blobs to get CI going on this

int88 and others added 11 commits June 29, 2023 09:39

avoid relocking head during builder health check (sigp#4323)

23b06aa

## Issue Addressed sigp#4314 ## Proposed Changes avoid relocking head during builder health check ## Additional Info NA

Ipv6 bootnodes update (sigp#4394)

8e65419

We now officially have ipv6 support. The mainnet bootnodes have been updated to support ipv6. This PR updates lighthouse's internal bootnodes for mainnet to avoid fetching them on initial load.

Release v4.3.0 (sigp#4452)

dfcb336

## Issue Addressed NA ## Proposed Changes Bump versions ## Additional Info NA

remove into gossip verified block

246d52d

Merge branch 'remove-into-gossip-verified-block' of https://github.co…

cfe2452

…m/realbigsean/lighthouse into merge-unstable-deneb-june-6th

resolve merge issues

c4da1ba

realbigsean added the deneb label Jul 7, 2023

michaelsproul and others added 9 commits July 8, 2023 13:16

Revert "remove into gossip verified block"

6fd2ef4

This reverts commit 246d52d.

fix compile

782a53a

fix progressive balance slashing tests

57bb1d9

Remove hidden re-exports to appease Rust 1.73 (sigp#4495)

62c9170

## Issue Addressed sigp#4494 ## Proposed Changes - Remove explicit re-exports of various types to appease the new compiler lint ## Additional Info It seems `warn(hidden_glob_reexports)` is the main culprit.

fix tests

1599487

gossip validate blobs prior to publish

c016f5d

realbigsean added the ready-for-review The code is ready for review label Jul 12, 2023

realbigsean and others added 8 commits July 12, 2023 13:05

Merge branch 'unstable' of https://github.com/sigp/lighthouse into me…

a6f48f5

…rge-unstable-deneb-june-6th

bump proc-macro2 (sigp#4464)

0c7eed5

## Issue Addressed Addresses issue: sigp#4459 ## Proposed Changes `cargo update -p proc-macro2` proc-macro2 v1.0.58 -> v1.0.63 ## Additional Info See: sigp#4459 / rust-lang/rust#113152

fix merge conflict issues

42f54ee

fix merge

405e95b

Fix beta compiler warnings

1876082

Remove instances of required arguments with default values (sigp#4489)

1a5de8b

## Issue Addressed sigp#4488 ## Proposed Changes - Remove all instances of the `required` modifier where we have a default value specified for a subcommand ## Additional Info N/A

michaelsproul and others added 7 commits July 17, 2023 00:14

Cleanup unreachable code in lcli::generate_bootnode_enr and some te…

4435a22

…sts (sigp#4485) ## Issue Addressed n/a Noticed this while working on something else ## Proposed Changes - leverage the appropriate types to avoid a bunch of `unwrap` and errors ## Additional Info n/a

realbigsean and others added 6 commits July 17, 2023 09:33

Merge branch 'unstable' of https://github.com/sigp/lighthouse into me…

b96db45

…rge-unstable-deneb-jul-14

fixes after merge

0f514cb

fix smol lint

a618830

smol fixes

aeee5be

Merge branch 'merge-unstable-deneb-june-6th' of https://github.com/re…

597389c

…albigsean/lighthouse into merge-unstable-deneb-jul-14

Merge pull request #38 from realbigsean/merge-unstable-deneb-jul-14

27a9901

Merge unstable deneb jul 14

pawanjay176 approved these changes Jul 17, 2023

View reviewed changes

Merge branch 'deneb-free-blobs' into merge-unstable-deneb-june-6th

382b5ab

pawanjay176 and others added 2 commits July 17, 2023 13:29

Fix more beta compiler warnings

e1d0724

cargo fmt

cffa562

realbigsean merged commit f98671f into sigp:deneb-free-blobs Jul 17, 2023
7 checks passed

realbigsean deleted the merge-unstable-deneb-june-6th branch November 21, 2023 16:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge unstable deneb june 6th #4477

Merge unstable deneb june 6th #4477

realbigsean commented Jul 7, 2023

jimmygchen commented Jul 17, 2023

pawanjay176 left a comment

pawanjay176 Jul 17, 2023 •

edited

Loading

realbigsean Jul 17, 2023

pawanjay176 commented Jul 17, 2023

Merge unstable deneb june 6th #4477

Merge unstable deneb june 6th #4477

Conversation

realbigsean commented Jul 7, 2023

Issue Addressed

Proposed Changes

Additional Info

jimmygchen commented Jul 17, 2023

pawanjay176 left a comment

Choose a reason for hiding this comment

pawanjay176 Jul 17, 2023 • edited Loading

Choose a reason for hiding this comment

realbigsean Jul 17, 2023

Choose a reason for hiding this comment

pawanjay176 commented Jul 17, 2023

pawanjay176 Jul 17, 2023 •

edited

Loading