-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: tracker uses unfinalised sc stream #4146
Conversation
PRO-923 Remove Block Delay in Ingress-Egress Tracker
One of the design choices for PRO-868 was to re-use as much as possible of the existing engine witnesser, and remove safety margin so that blocks are processed as fast as possible (which is the whole point of the tracker). I recently noticed that even without safety margin, engine's witnesser delays blocks by 20 seconds or so on average (specifically, it waits for chain tracking for that block +1 to have made it into SC's finalised state). This seems to be done to ensure we have the full list of deposit addresses before we process them. For tracking, however, this seems way too conservative. Most of the time, by the time users send funds to a deposit address, it will have already been recorded on SC (in an unfinalised block, at least). I imagine one potential solution is to somehow specialise tracker's witnesser obtain addresses from unfinalised SC storage and wait for unfinalised chain tracking for blocks. (Reorgs should be somewhat rare with Aura, right?) kylezs |
Codecov Report
@@ Coverage Diff @@
## main #4146 +/- ##
======================================
- Coverage 72% 72% -0%
======================================
Files 381 383 +2
Lines 62151 62349 +198
Branches 62151 62349 +198
======================================
+ Hits 44520 44635 +115
- Misses 15308 15377 +69
- Partials 2323 2337 +14
... and 3 files with indirect coverage changes 📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today! |
Just realised this delay issue due to the SC finalised stream is present in the pre-witnessing too. Would be good to apply this change there as well (can be in a separate PR). (PRO-929) |
aec65ce
to
abe0832
Compare
StateChainStream: StateChainStreamApi<FINALIZED>, | ||
StateChainClient: StorageApi + Send + Sync + 'static, | ||
const FINALIZED: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StateChainStream: StateChainStreamApi<FINALIZED>, | |
StateChainClient: StorageApi + Send + Sync + 'static, | |
const FINALIZED: bool, | |
StateChainStream: StateChainStreamApi, | |
StateChainClient: StorageApi + Send + Sync + 'static, |
You don't need to mention FINALIZED.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with the other cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean? I'm getting "missing generics for trait StateChainStreamApi
" error as expected.
I'm not going to be able to get to this for a few days, due to higher priority work (LP api). There is an important issue with the unfinalized stream basically that we shouldn't try to fill in gaps in the block stream. Also I'd like the initialization code to be cleaned up before it is merged as I feel it is adding significant tech debt to code that I put a great deal of effort into keeping clean, that most likely I'll have to clean up. Which is why I still wish to review this before it is merged. |
I was thinking about that too, and yesterday I was hoping to ask for your opinion on this actually. I know that we aren't supposed to have gaps in the unfinalised stream, but I'm also not sure that it hurts to keep that code in case we do miss a block for some reason. Either way, it is an easy change to make. I don't understand why you had to "work on a fix" for this in the background without any communication (and without any apparent progress on it either).
FWIW I don't think the existing code particularly "clean" and there are some things that I would change, but didn't because I want to focus on the feature and to avoid arguing about style again.
Not necessarily. I can clean it up too, if necessary. We have to agree of what "clean" means though. It is not just up to you. I already have some other code on top of this, factoring some code out and adding tests (which we need, and I wish I had while making changes in this PR). It just doesn't make sense dragging this PR out because of things that are likely going to change soon anyway...
Sounds like we should let you focus on that higher priority work. Now that you have expressed your main concerns, and provided that they are all addressed, wouldn't it be enough to get, say, @kylezs approval to get this merged? Your coding styles aren't too different I imagine, and I'm happy to make whatever changes he suggests to make it more "clean". |
async move { | ||
let base_rpc_client = &base_rpc_client; | ||
let intervening_headers: Vec<_> = | ||
futures::stream::iter(prev_header.number + 1..next_header.number) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The asserts below here can panic in the unfinalised case. Because it's possible that the chain changes in between these requests, there's no guarantee they remain the same between queries.
The original code is tailored around the accuracy case, of using the finalised stream, and ensuring that we don't have gaps, and that regardless of substrate weirdness we end up with a consistent, monotonic stream. So, naturally we end up having to put if finalized
in a bunch of places in the code - which, yes, would also fix the asserts
here too..
Instead, we should probably take the approach that gives us more certainty in the case we need it (the finalised case), and more flexibility in the unfinalised case - by separating these concerns entirely, into separate functions, rather than weaving in the two concerns (of accuracy and speed) with one another. Think this just makes it easier to reason about, can just think about one thing at a time.
There's also the question of whether catching up actually makes sense in the unfinalised case. Because the consumers of this stream access storage (and not events), then the fact that we skip some blocks doesn't really matter, and it doesn't make sense to include the complexity required to gain accuracy we need in the finalised case, in this case where we only want speed.
The only problematic case may be something like the underlying unfinalised stream skipping 24 hours+ of blocks, since this would go beyond the swapping channel expiry, which I think we can accept as quite unlikely - and not that bad if it did happen anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The asserts below here can panic in the unfinalised case. Because it's possible that the chain changes in between these requests, there's no guarantee they remain the same between queries.
Which assert in particular? These two shouldn't panic because we request block_header by hash (not by number), and we should then get a block where hash() equals the requested hash, even with unfinalised blocks. Also, the height of the block doesn't change even if its position gets taken over by a different block (otherwise its hash would have changed as well).
assert_eq!(
block_header.hash(),
block_hash,
"{SUBSTRATE_BEHAVIOUR}"
);
assert_eq!(
block_header.number, block_number,
"{SUBSTRATE_BEHAVIOUR}",
);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead, we should probably take the approach that gives us more certainty in the case we need it (the finalised case), and more flexibility in the unfinalised case - by separating these concerns entirely, into separate functions,
I don't mind this. I actually already extracted inject_intervening_blocks
on a separate branch (so that we can test it), which should also make it easier to do what you are suggesting here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also the question of whether catching up actually makes sense in the unfinalised case.
All we need for the tracker is a stream of signals whenever there is a new block. I assumed we wanted to make use of unfinalised subscription elsewhere, where we might also be interested in events (e.g. P2P registrations). I'm happy to remove catching up from the unfinalised stream if you think that's the right thing to do. Perhaps we should actually write down for the requirements for this PR/stream are?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two shouldn't panic because we request block_header by hash (not by number), and we should then get a block where hash() equals the requested hash
True. In this case it can just error on the request stage, since we:
- Query by number, get the hash of this block
- Query by hash - hash doesn't exist so we error.
- This error then propogates, ultimately resulting in a shut down of the stream at the bottom of
create_block_subscription
scope.spawn_with_handle({
let base_rpc_client = base_rpc_client.clone();
async move {
loop {
let block_header =
block_header_stream.next().await.unwrap()?;
let block_hash = block_header.hash();
if let Some((required_version, _)) = required_version_and_wait {
let current_release_version = base_rpc_client.storage_value::<pallet_cf_environment::CurrentReleaseVersion<state_chain_runtime::Runtime>>(block_hash).await?;
if !required_version.is_compatible_with(current_release_version) {
break Err(anyhow!("This version '{}' is no longer compatible with the release version '{}' at block: {}", required_version, current_release_version, block_hash))
}
}
if !block_sender.send((block_hash, block_header)).await {
break Ok(())
}
if latest_block_hash_sender.send(block_hash).is_err() {
break Ok(())
}
}
}
}),
Though, yeah, we can just avoid a lot of this complexity and just have two functions, with different responsibilities. One to be effectively left as is (maybe some changes for some pieces of code that will be factored out), and then one separate one for the unfinalised.
P2P registrations
Perhaps we should also use storage for this too. We started moving away from the events in the witnessing refactor, which allows the components to be more isolated and self-contained, rather than depending on the SCO - which also helps us in the "uninline SCO" goal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should also use storage for this too.
I don't mind that, and actually I have been thinking about that too. It will be a bit of work to compute the diff, but shouldn't be too bad and it would be easily testable.
c4c8294
to
117c370
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌 yeah think this way is the way to go, just a few minor / testing comments
@@ -743,3 +781,145 @@ pub mod mocks { | |||
} | |||
} | |||
} | |||
|
|||
async fn inject_intervening_headers< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a comment here to state this this should only be used with a finalised stream and also that it can skip blocks etc. As there are a few gotchas here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment to only be used with finalized streams. What do you mean by "it can skip blocks" though?
move |hash| Ok(chain.headers.get(&hash).expect("unknown hash").clone()) | ||
}); | ||
|
||
let sparse_stream = tokio_stream::iter([0, 1, 3, 6].map(|num| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also add tests for some base cases like:
- No headers
- Headers already in order without gaps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. For "no headers" I changed the code to return error rather than panicking (for ease of testing). Arguably it should return an empty stream too, but doesn't seem like an important change to make, so I will leave it as is.
Pull Request
Closes: PRO-923
Checklist
Please conduct a thorough self-review before opening the PR.
Summary
As @AlastairHolmes suggested, I changed state chain client to return both finalised and unfinalised streams. Now
StateChainStreamApi
has a bool parameter to specify if the blocks should be finalized and it can be used to restrict certain api to only allow finalized streams.IngressAddresses
andEgressItems
are examples where both finalized and unfinalised streams are supported, and the tracker now uses unfinalized streams for them.Note that I had to change blanked impl of StateChainStreamApi and implement StateChainStreamApi directly on
InnerCachedStream
, because otherwiseFinalizedCachedStream
would also get impl for StateChainStreamApi (on top of StateChainStreamApi), which seemed less then ideal (though it would still work).