-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(consensus): non-blocking proposal handling #1694
Conversation
Artifacts upload triggered. View details here |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1694 +/- ##
==========================================
+ Coverage 74.18% 77.20% +3.01%
==========================================
Files 359 368 +9
Lines 36240 38926 +2686
Branches 36240 38926 +2686
==========================================
+ Hits 26886 30053 +3167
+ Misses 7220 6624 -596
- Partials 2134 2249 +115
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @asmaastarkware)
crates/sequencing/papyrus_consensus/src/manager.rs
line 154 at r1 (raw file):
ShcReturn::Tasks(tasks) => { for task in tasks { shc_events.push(create_shc_event(task));
Delete create_shc_event
Suggestion:
shc_events.push(task.run());
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 43 at r1 (raw file):
Precommit(StateMachineEvent), BuildProposal(StateMachineEvent), // ProposalContentId should be signature validation.
Suggestion:
// TODO: Replace ProposalContentId with the invalidated signature from the proposer.
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 59 at r1 (raw file):
ProposalInit, oneshot::Receiver<ProposalContentId>, oneshot::Receiver<ProposalContentId>,
Suggestion:
oneshot::Receiver<ProposalContentId>, // Block built from the content.
oneshot::Receiver<ProposalContentId>, // Find sent by the proposer.
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 107 at r1 (raw file):
ShcEvent::BuildProposal(StateMachineEvent::GetProposal(Some(proposal_id), round)) } ShcTask::ValidateProposal(init, block_receiver, fin_receiver) => {
and update the variable names for proposal_id & fin
Suggestion:
ShcTask::ValidateProposal(init, built_from_content_receiver, fin_from_proposer_receiver) => {
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 179 at r1 (raw file):
/// Receive a proposal from a peer node. Returns only once the proposal has been fully received /// and processed.
This is no longer true. I think we should explain in the ShcTask enum the flow for build and validate:
Validate is received, Init is validated by SHC, then handed off to context and returned as ShcTask to have the Manager await without blocking consensus. Once complete this is returned to the SHC to be passed to the SM.
And also explain the build flow.
(obviously use full sentences)
Code quote:
/// Receive a proposal from a peer node. Returns only once the proposal has been fully received
/// and processed.
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 292 at r1 (raw file):
warn!("Round {} already has a proposal, ignoring", round); return Ok(ShcReturn::Tasks(Vec::new())); };
I think we want to move to a model where we store the ProposalInit, and then when this completes we add the ContentId.
The problem with the current model is that, now that we are non-blocking, we can just get spammed by a malicious node with proposals.
My idea is currently to be permanently blocking, so even if the proposal fails for some reason, this round is now blocked from receiving any proposal, as a defensive measure. Leave a TODO to consider removing the entry if this fails though.
Code quote:
let Entry::Vacant(proposal_entry) = self.proposals.entry(round) else {
warn!("Round {} already has a proposal, ignoring", round);
return Ok(ShcReturn::Tasks(Vec::new()));
};
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 307 at r1 (raw file):
StateMachineEvent::Proposal(proposal_id, round, valid_round), ) .await
Suggestion:
let id = if proposal_id != fin { None } else { proposal_id };
proposal_entry.insert(id);
self.process_inbound_proposal(
context,
StateMachineEvent::Proposal(id, round, valid_round),
)
.await
crates/sequencing/papyrus_consensus/src/single_height_consensus_test.rs
line 445 at r1 (raw file):
} async fn handle_proposal(
define above the tests
Code quote:
async fn handle_proposal(
13d790c
to
4fb0c18
Compare
Artifacts upload triggered. View details here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r2, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @asmaastarkware)
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 43 at r2 (raw file):
Precommit(StateMachineEvent), BuildProposal(StateMachineEvent), // TODO: Replace ProposalContentId with the invalidated signature from the proposer.
whoops (invalid = we checked and it's wrong; unvalidated = we didn't check yet)
Suggestion:
// TODO: Replace ProposalContentId with the unvalidated signature from the proposer.
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 59 at r2 (raw file):
ProposalInit, oneshot::Receiver<ProposalContentId>, // Block built from the content. oneshot::Receiver<ProposalContentId>, // Find sent by the proposer.
Just because I'm the native English speaker doesn't mean I don't have typos :)
Suggestion:
oneshot::Receiver<ProposalContentId>, // Fin sent by the proposer.
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 109 at r2 (raw file):
ShcTask::ValidateProposal( init, built_from_content_receiver,
Suggestion:
id_built_from_content_receiver,
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 185 at r2 (raw file):
/// SHC returns `ShcTask` to the manager, allowing it to wait for the block without blocking /// execution. Once validation is complete, the manager returns the block to the SHC, which /// processes it and hands it over to the SM.
WDYT of this phrasing?
I think it would be better though to have this comment on the enum variant for ValidateProposal
and here we can write
/// Process the proposal init and initiate block validation. See ShcTask::ValidateProposal
for more details on the full proposal flow.
Suggestion:
/// Proposals are handled in 3 stages:
/// 1. The SHC validates `Init`, then starts block validation within the context.
/// 2. SHC returns, allowing the context to validate the content while the Manager await the result without blocking consensus.
/// 3. Once validation is complete, the manager returns the built proposal to the SHC as an event, which can be sent to the SM.
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 219 at r2 (raw file):
// could spam us with multiple proposals. To prevent this, we currently ignore any // proposal that is not the first one received for a given round. proposal_entry.insert(None);
Since validating the proposal is non-blocking, we want to avoid validating the same round twice in parallel. This could be caused by a network repeat or a malicious spam attack.
Suggestion:
proposal_entry.insert(None);
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 298 at r2 (raw file):
) => { // TODO(matan): Switch to signature validation. let id = if proposal_id != fin { None } else { proposal_id };
add a warn log - unexpected, but possible due to a network issue
Code quote:
None
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 302 at r2 (raw file):
// another proposal for the same round. Suggestion: remove the entry from proposals // to allow this. // TODO (Asmaa): consider removing the entry from proposals or any other solution.
Retaining the entry for this round prevents us from receiving another proposal on this round. If the validations failed, which can be caused by a network issue, we may want to re-open ourselves to this round. The downside is that this may open us to a spam attack.
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 432 at r2 (raw file):
#[instrument(skip(self, context), level = "debug")] async fn handle_state_machine_get_proposal<ContextT: ConsensusContext>( &mut self,
See comment on handle_proposal
Code quote:
/// The SHC starts building the block within the context.
/// SHC returns `ShcTask` to the manager, allowing it to wait for the block without blocking
/// execution. Once building is complete, the manager returns the block to the SHC, which
/// processes it and hands it over to the SM.
#[instrument(skip(self, context), level = "debug")]
async fn handle_state_machine_get_proposal<ContextT: ConsensusContext>(
&mut self,
4fb0c18
to
632c06b
Compare
Artifacts upload triggered. View details here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @asmaastarkware)
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 298 at r2 (raw file):
Previously, matan-starkware wrote…
add a warn log - unexpected, but possible due to a network issue
I was explaining to you why this might occur, not the string we want.
"The proposal built from the content doesn't match the Fin: "(id, fin...)
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 60 at r3 (raw file):
/// without blocking consensus. /// 3. Once building is complete, the manager returns the built block to the SHC as an event, /// which can be sent to the SM.
Suggestion:
/// Building a proposal is handled in 3 stages:
/// 1. The SHC requests a block to be built from the context.
/// 2. SHC returns, allowing the context to build the block while the Manager awaits the result
/// without blocking consensus.
/// 3. Once building is complete, the manager returns the built block to the SHC as an event,
/// which can be sent to the SM.
/// During this process, the SM is frozen; it will accept and buffer other events, only processing them
/// once it receives the built proposal.
crates/sequencing/papyrus_consensus/src/single_height_consensus.rs
line 321 at r3 (raw file):
// may want to re-open ourselves to this round. The downside is that this may open // us to a spam attack. // TODO(Asmaa): discuss solution to this issue.
Suggestion:
// TODO(Asmaa): consider revisiting this decision. Spam attacks may not be a problem given
// that serial proposing anyways forces us to use interrupts.
632c06b
to
316e4d2
Compare
Artifacts upload triggered. View details here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @asmaastarkware)
No description provided.