
Prevent gossip of duplicate exits, slashings, BLS #5375

Closed

Conversation


@michaelsproul michaelsproul commented Mar 8, 2024

Issue Addressed

Some of the other clients have pulled us up for propagating duplicate exits, slashings and BLS to execution changes on gossip. The reason for this is that Lighthouse only checks for duplicates against an in-memory cache which is cleared on restart.

Proposed Changes

This PR fixes the issue by checking for duplicate exits/slashings/address-change messages against the head state. For example, if a validator is already exited in the state then we consider new exits for this validator as duplicates, and similarly for the other message types.

There are two broad approaches I considered when fixing this issue, each with different trade-offs.

Option 1: check against head state

This is the approach taken by this PR; a minimal sketch follows the list below. With flat (non-tree) states its performance is:

  • CPU: O(1) per validator, indexing into the BeaconState.
  • Memory: O(1). No additional storage beyond the state is required.

However with tree-states these random indexing operations will become more expensive, at which point we may want to reconsider:

  • CPU: O(log n) per validator for indexing into the BeaconState.
  • Memory: O(1).
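
To make the trade-off concrete, here is a minimal sketch of the head-state duplicate check for voluntary exits. It assumes simplified stand-in types (FAR_FUTURE_EPOCH, Validator, BeaconState) rather than Lighthouse's actual types, and the is_duplicate_exit helper is hypothetical, not the real verification code:

// Sketch only: simplified stand-ins for Lighthouse's types, illustrating the
// head-state duplicate check for voluntary exits.
const FAR_FUTURE_EPOCH: u64 = u64::MAX;

struct Validator {
    exit_epoch: u64,
}

struct BeaconState {
    validators: Vec<Validator>,
}

/// Treat an exit as a duplicate if the validator already has an exit epoch set in the
/// head state. With flat states the indexing below is O(1); with tree-states it becomes
/// O(log n), which is the trade-off noted above.
fn is_duplicate_exit(head_state: &BeaconState, validator_index: usize) -> bool {
    head_state
        .validators
        .get(validator_index)
        .map(|validator| validator.exit_epoch != FAR_FUTURE_EPOCH)
        .unwrap_or(false)
}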

Option 2: restore the cache on startup

There's a TODO in the code that's been there forever about reinitialising some of these caches on startup:

// TODO: allow for persisting and loading the pool from disk.
naive_aggregation_pool: <_>::default(),
// TODO: allow for persisting and loading the pool from disk.
naive_sync_aggregation_pool: <_>::default(),
// TODO: allow for persisting and loading the pool from disk.
observed_attestations: <_>::default(),
// TODO: allow for persisting and loading the pool from disk.
observed_sync_contributions: <_>::default(),
// TODO: allow for persisting and loading the pool from disk.
observed_gossip_attesters: <_>::default(),
// TODO: allow for persisting and loading the pool from disk.
observed_block_attesters: <_>::default(),
// TODO: allow for persisting and loading the pool from disk.
observed_sync_contributors: <_>::default(),
// TODO: allow for persisting and loading the pool from disk.
observed_aggregators: <_>::default(),
// TODO: allow for persisting and loading the pool from disk.
observed_sync_aggregators: <_>::default(),
// TODO: allow for persisting and loading the pool from disk.
observed_block_producers: <_>::default(),
observed_blob_sidecars: <_>::default(),
observed_slashable: <_>::default(),
observed_voluntary_exits: <_>::default(),
observed_proposer_slashings: <_>::default(),
observed_attester_slashings: <_>::default(),
observed_bls_to_execution_changes: <_>::default(),

Rather than storing them on disk independently, we could reconstruct them from the head state (sketched after the list below). This has the advantage of providing O(1) indexing with tree-states, at the cost of slightly more memory:

  • CPU: O(1) for indexing into the HashSet inside ObservedOperations.
  • Memory: O(n) where n is the number of slashed/exited validators.
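
A minimal sketch of that reconstruction, again using simplified stand-in types rather than Lighthouse's real ObservedOperations (the rebuild_observed_exits name is hypothetical):

use std::collections::HashSet;

// Sketch only: rebuild the set of already-exited validator indices from the head state
// at startup, rather than persisting the cache to disk. Types are simplified stand-ins.
const FAR_FUTURE_EPOCH: u64 = u64::MAX;

struct Validator {
    exit_epoch: u64,
}

struct BeaconState {
    validators: Vec<Validator>,
}

/// Memory is O(n) in the number of exited validators; subsequent duplicate checks are
/// then O(1) HashSet lookups, regardless of flat vs tree-states.
fn rebuild_observed_exits(head_state: &BeaconState) -> HashSet<u64> {
    head_state
        .validators
        .iter()
        .enumerate()
        .filter(|(_, validator)| validator.exit_epoch != FAR_FUTURE_EPOCH)
        .map(|(index, _)| index as u64)
        .collect()
}

The same pattern would presumably apply to slashings and address changes by filtering on the relevant validator fields instead of the exit epoch.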

With tree-states we could also take a hybrid approach, e.g. using the slashings_cache introduced in #5279 to check for slashings, and initialising the exit cache on startup so that it indexes in O(1).

The complication would be making this work with reset_at_fork_boundary, which would probably have to do a full re-initialisation at each fork boundary (potentially very slow).

Additional Info

Thanks to @nisdas from Prysm for tracking these invalid messages down to Lighthouse peers, and for suggesting a fix (Prysm had a similar issue until recently and fixed it by using the head state).

@michaelsproul added the bug, ready-for-review, v5.2.0 and Networking labels on Mar 8, 2024

@jimmygchen jimmygchen left a comment


Nice and simple fix 👍 Looks good

@jimmygchen added the ready-for-merge label and removed the ready-for-review label on Mar 8, 2024
@michaelsproul (Member, Author) commented:

I'm now doubting whether this is necessary. AFAIK we should already be checking uniqueness in validate

Will mark this as WIP until we figure it out

@michaelsproul added the work-in-progress label and removed the ready-for-merge label on Mar 8, 2024