-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: deposit channel expiry #3998
Conversation
PRO-171 Ingress witnessing cooldown period
ProblemIn the ingress addresses adapter, there is a race condition between the witnessing of blocks and the expiry on the SC of ingress addresses, as the engine uses the SC state to determine what addresses to witness at a given block. But when the SC expires an address, all state for that address is removed. Therefore it is possible that different engines, are looking at different sets of addresses when witnessing ingresses for a particular block, This means the hashes of the witness calls may not match, leading to a stuck witness (One that doesnÄt reach 2/3). SolutionWe define a per-chain number of blocks that a particular egress channel will last for, So, when an ingress channel is opened, we store the Therefore, the DepositChannelDetails storage should only be deleted once the chain tracking for that chain has reached the |
1b5204e
to
85abcc0
Compare
85abcc0
to
9e5b4d2
Compare
Codecov Report
@@ Coverage Diff @@
## main #3998 +/- ##
======================================
- Coverage 72% 72% -0%
======================================
Files 370 373 +3
Lines 59306 59189 -117
Branches 59306 59189 -117
======================================
- Hits 42681 42507 -174
- Misses 14508 14577 +69
+ Partials 2117 2105 -12
... and 6 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
9e5b4d2
to
fccfe67
Compare
@@ -381,6 +396,45 @@ pub mod pallet { | |||
|
|||
#[pallet::hooks] | |||
impl<T: Config<I>, I: 'static> Hooks<BlockNumberFor<T>> for Pallet<T, I> { | |||
/// Recycle addresses if we can | |||
fn on_idle(_n: BlockNumberFor<T>, remaining_weight: Weight) -> Weight { | |||
let current_target_chain_height = T::ChainTracking::get_block_height(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These calculations are quite rough, didn't want to spend too much time on it because with the changes that @dandanlen is bringing in with the deposit tracking which will refactor a lot of the lifecycle stuff, it should be easier to restructure the code further to allow for a more efficient recycling of the channels.
2c23295
to
88e10f6
Compare
About this - we could/should provide an rpc to handle this. |
I assume you mean an RPC to allow querying of the number of blocks to wait before soft expiry for each chain? |
@@ -24,7 +24,6 @@ use tracing::log; | |||
#[derive(Serialize, Deserialize, Clone)] | |||
pub struct BrokerSwapDepositAddress { | |||
pub address: String, | |||
pub expiry_block: BlockNumber, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a breaking change to the api. SDK will probably need to be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, linked the companion PR in the description
addresses.iter().filter(|details| details.opened_at <= index).cloned().collect() | ||
addresses | ||
.iter() | ||
.filter(|details| details.opened_at <= index && index <= details.expires_at) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd prefer if we use an exclusive upper bound here.
This is more consistent with ranges in general (opened_at..expires_at).
Also it seems odd that we can submit something at the block in which it expires.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inclusive end is consistent with how we witness everything else, the external chain block number for the vault rotation, is witnessed by the previous authority set, i.e. to the previous set, it's inclusive of the range they witness, which is why I made it inclusive
// this too early. Even accounting for the safety margin, there is potential for | ||
// latency, full SC blocks which might block ingress witnesses getting in etc. By | ||
// having this buffer we protect against this probabilistically. | ||
if details.expires_at + channel_lifetime <= current_target_chain_height { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can save a read by using expires_at - opened_at to infer the lifetime of the channel. 🤓
|
||
let mut number_recycled = 0; | ||
for (_, details) in | ||
DepositChannelLookup::<T, I>::iter().take(number_to_recycle as usize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this works as intended. It's possible we never recycle unless the remaining weight is enough to process all of the channels.
The order of iteration is based on the hashes of the keys, so we will always read them in the same order, and we might never reach the to-be-expired channel.
I think a better option is probably to store a vec of (block_number, address)
and then iterate these in order, stopping when we reach the current_target_chain_hegiht
or exceed the available weight budget.
Writing to this is cheap: We can use append
. Reading is more expensive but at least it's a single read operation. The decoding might be expensive but at least it will only contain the unexpired channels.
Btw. we don't remove the value from the DepositChannelLookup, it this intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah didn't consider the sort order, was intending to split them into active and expired sets, but that requires some of what you're refactoring I believe. For now I've done your suggestion so it's at least correct now.
Btw. we don't remove the value from the DepositChannelLookup, it this intended?
It was, but because there's a large enough gap between expiry and recycle block then the fetch will be able to occur in there, so I changed it to take(). I think eventually we will be able to make this more correct after your refactor.
IngressEgress::expire_channel(*address); | ||
} | ||
let recycle_block = BlockHeightProvider::<MockEthereum>::get_block_height() + | ||
DepositChannelLifetime::<Test, _>::get() * 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit magical again... maybe we can implement a method DepositChannelDetails::expiry_block(&self)
to factor this out?
44ba4a0
to
cdc5e0d
Compare
f812571
to
26b60b0
Compare
3c1b102
to
1ed2d1d
Compare
1ed2d1d
to
f2b2998
Compare
f2b2998
to
6312068
Compare
expiry is now done in the ingress-egress pallet
0dee99c
to
29f2fb6
Compare
* fix: SC recycles based on external chain block * chore: use genesis * feat: CFE filers based on expires_at * chore: add weight calculations for on_idle * chore: clean up some comments * chore: add genesis config * chore: use constants for expiry blocks * chore: use log_or_panic * fix: use correct expiry * test: more recycle tests * chore: clippy * chore: move constants into chainspec * fix: use mutate, rename * chore: ingress-egress pallet migrations * chore: use TargetChainBlockNumber * chore: remove the expiry storage from the swapping and lp pallets expiry is now done in the ingress-egress pallet * chore: update SDK version * chore: remove now-invalid bouncer test * chore: do the migrations --------- Co-authored-by: Daniel <daniel@chainflip.io>
Pull Request
Closes: PRO-171
SDK companion: chainflip-io/chainflip-sdk-monorepo#227 (required so that bouncer can pass)
Checklist
Please conduct a thorough self-review before opening the PR.
Summary
ChannelAction
storage in ingress-egress pallet. This was just used to ensure once the address is expired, that we can't deposit again, until it's explicitly reused. Instead we can just use theDepositChannelPool
directly which already has this information.DepositChannelLifetime
blocks more than the expiry block i.e.2 * DepositChannelLifetime
from channel creation. This ensures that any delay in witnesses getting into the SC after the CFE has submitted them is accounted for. Note that sending a tx to Ethereum at block 20 when the expires_at is block 19 will not go through. The CFEs will not witness it, even if the address has not yet been recycled.Next steps:
@dandanlen is working on some refactoring as part of the deposit tracking which will untangle some of the lifecycles from the storage items and transitions. This will make it easier to optimise this further. Potentially by having a separate storage item for the active addresses (any address the CFE might need to witness - it might not if the block number it is at is not within the range still), and one for the inactive addresses - these would be the addresses that have been recycled, ready for re-use, but the CFE definitely does not need witness. This saves on CFE compute and SC compute since we don't need to iterate through so many addresses on each on_idle call if we know we don't need to (in the case of inactive addresses.
The product (@fernandezdavid ) should use
expires_at + 1/2 * DepositChannelLifetime
this ensures that users are depositing long before the hard cut off. Maybe we could make it something like 3/4, but perhaps we want to start safer before expanding, will also be a per-chain configuration.