Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure lookup sync checks caches correctly #5840

Merged
merged 3 commits into from
May 25, 2024

Conversation

dapplion
Copy link
Collaborator

Issue Addressed

Sync lookup needs to have a view of what's currently being processed to prevent double downloads and not get lookups stuck.

Part of an issue in #5833 was caused because blob requests don't check into the processing cache (reqresp_pre_import_cache). If the block is already processing, blob requests should continue.

To prevent futures issues like this I think we should encapsulate the details of the beacon_chain caches to there. And then expose only the minimum information sync needs to know with a simple enum BlockProcessStatus.

Proposed Changes

  • Add get_block_process_status to not leak cache details into sync lookup
  • When attempting a blob_lookup_request, consider blocks in NotValidated state as download and continue the download of blobs

@dapplion dapplion added the ready-for-review The code is ready for review label May 24, 2024
@@ -401,9 +392,20 @@ impl<T: BeaconChainTypes> SyncNetworkContext<T> {
downloaded_block_expected_blobs: Option<usize>,
) -> Result<LookupRequestResult, RpcRequestSendError> {
let Some(expected_blobs) = downloaded_block_expected_blobs.or_else(|| {
self.chain
.data_availability_checker
.num_expected_blobs(&block_root)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can delete num_blobs_expected

@realbigsean
Copy link
Member

a couple lookup tests are broken, but i think the tests are wrong? the scenarios looks like they're testing that we don't trigger a block or blob lookup if we have a block in the processing cache, when really we just want to suppress the block lookup

@realbigsean
Copy link
Member

fixed the pr comments here: 2b55017

@realbigsean
Copy link
Member

Something I feel like I'm missing - is this addressing a bug that actually causes a lookup to get stuck? it looks like it's just an optimization where we are requesting blobs sooner

In the extreme case -> all attestations arrive while the block is in the processing cache, so none arrive to trigger a lookup when it hits the DA cache. When the block eventually hits the DA cache, we should get a missing components that causes this lookup to progress, right? or is there a separate bug preventing this?

@realbigsean realbigsean added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels May 24, 2024
@dapplion
Copy link
Collaborator Author

The bug this PR is fixing is the blob requests not checking the block processing cache to figure out if the block is already downloaded.

Consider a case where:

  • The block request did not request the block because it's in the processing cache, so it's in AwaitingDownload state
  • The block is not in the da_checker

The blob request won't continue, because it's waiting for the block to download. However, if the block is not actually processing the request may be stuck.

I think the root issue is that during import the block is removed from the da_checker but not from the processing cache until later.

However, I think blob_request should consider blocks that are both on the da_checker and processing_cache to be asymmetric with block_request logic. Otherwise, I think this can be a source of inconsistency if you sandwich other bugs.

@realbigsean
Copy link
Member

Thanks for the explanation :)

@realbigsean
Copy link
Member

@mergify queue

Copy link

mergify bot commented May 25, 2024

queue

✅ The pull request has been merged automatically

The pull request has been merged automatically at e498421

@realbigsean
Copy link
Member

@mergify refresh

Copy link

mergify bot commented May 25, 2024

refresh

✅ Pull request refreshed

mergify bot added a commit that referenced this pull request May 25, 2024
mergify bot added a commit that referenced this pull request May 25, 2024
@mergify mergify bot merged commit e498421 into sigp:unstable May 25, 2024
28 checks passed
@dapplion dapplion deleted the block-processing-status branch May 28, 2024 22:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting-on-author The reviewer has suggested changes and awaits thier implementation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants