
Skip withdrawn validators in withdrawal sweep #3528

Open · wants to merge 1 commit into base: dev
Conversation

@ppopth (Member) commented Oct 24, 2023

Motivation

Since withdrawing a deposit is slow when the validator index grows large, we tried to make it faster by reusing the indices in ethereum/EIPs#6914 (consensus-specs at #3307).

However, reusing the indices brings complexity to the protocol and the spec. This PR instead skips the withdrawn validators in the withdrawal sweep, making withdrawals faster without reusing the validator indices.

The good thing about not reusing the validator indices is that we have a one-to-one correspondence between the validator indices and the validator public keys.

This improvement is intended as an alternative to EIP-6914, so it can be applied only if we decide not to do EIP-6914.

Implementation detail

In the current spec, we know that get_expected_withdrawals never runs more than MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP iterations. This improvement changes that function so that it may run more iterations than before.

Clients can optimize this by keeping a cache of unskippable validators and sweeping through the cache instead of the set of all validators. The running time will then still be on the order of MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP.
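As a rough sketch of the idea (simplified, not the exact spec diff in this PR; `Validator`, `is_withdrawn`, and `sweep_candidates` are hypothetical stand-ins for the spec's types and helpers):

```python
from typing import List, NamedTuple

MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP = 2**14  # 16,384, as in Capella

class Validator(NamedTuple):
    withdrawable_epoch: int
    balance: int  # simplified: balance kept alongside the validator record

def is_withdrawn(validator: Validator, epoch: int) -> bool:
    # A validator whose withdrawable epoch has passed and whose balance is
    # zero has nothing left to withdraw, so the sweep can skip it.
    return validator.withdrawable_epoch <= epoch and validator.balance == 0

def sweep_candidates(validators: List[Validator], start: int, epoch: int) -> List[int]:
    # Collect the indices that the sweep inspects for full/partial
    # withdrawals, as in Capella. Withdrawn validators are skipped and do
    # NOT count against the bound, so the loop may run for more than
    # MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP iterations.
    bound = min(len(validators), MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP)
    candidates: List[int] = []
    index = start
    for _ in range(len(validators)):  # wrap around the registry at most once
        if len(candidates) == bound:
            break
        if not is_withdrawn(validators[index], epoch):
            candidates.append(index)
        index = (index + 1) % len(validators)
    return candidates
```

Each candidate index would then be checked for a full or partial withdrawal exactly as in the Capella `get_expected_withdrawals`.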

Since withdrawing a deposit is slow when the validator index grows
large, we tried to make it faster by reusing the indices in EIP-6914.

However, reusing the indices brings complexity to the protocol and the
spec. This commit instead skips the already-withdrawn validators in
the withdrawal sweep to make the withdrawal faster.

The function `get_expected_withdrawals` is modified in a way that it
will not always run in the order of `MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP`
anymore, but clients SHOULD implement this function in a way that keeps
the running time still on the order of
`MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP` by keeping a cache of unskippable
validators.

This improvement is intended to be an alternative to EIP-6914, so it
can be applied only if we don't want to go with EIP-6914.
@dapplion (Collaborator)

This PR does not address the main goals of EIP-6914. The key constraint on the beacon chain is that the full state must be loaded in memory and offer fast iteration over all indices. I would recommend reading this WIP draft on the engineering details of EIP-6914 (https://hackmd.io/@dapplion/eip6914), but the main goals are:

  • limit the max memory cost of a single state and related data structures
  • bound the computational complexity of epoch transitions

Addressing the specific implementation of this PR: it defeats the original goal of the bounded withdrawal sweep. The worst case of this PR is consensus clients having to iterate over the majority of the validator registry every block.

@ppopth (Member, Author) commented Oct 24, 2023

> bound the computational complexity of epoch transitions

> The worst case of this PR is consensus clients having to iterate over the majority of the validator registry every block.

I guess what you mean is that, if the majority of the validators were withdrawn long ago, clients would have to iterate through them every block.

If clients implement it exactly as written in the spec, yes, that's true. But the main idea of this PR is to encourage clients to keep a cache of unskippable validators and iterate through that cache instead of the set of all validators. By doing this, the number of iterations is no more than MAX_VALIDATORS_PER_WITHDRAWALS_SWEEP.
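For illustration only, such a cache could be as simple as a sorted list of the indices that are not yet fully withdrawn (`UnskippableCache` and its methods are hypothetical names, not spec code; real clients would integrate this with their own state representation):

```python
import bisect
from typing import List

class UnskippableCache:
    """Sorted indices of validators that are not fully withdrawn, so each
    block's sweep touches at most `bound` entries."""

    def __init__(self, unskippable_indices: List[int]) -> None:
        self.indices = sorted(unskippable_indices)

    def on_fully_withdrawn(self, index: int) -> None:
        # Maintained incrementally: drop a validator once its balance hits
        # zero after its withdrawable epoch has passed.
        pos = bisect.bisect_left(self.indices, index)
        if pos < len(self.indices) and self.indices[pos] == index:
            del self.indices[pos]

    def sweep(self, start: int, bound: int) -> List[int]:
        # Resume from `start` and take up to `bound` unskippable indices,
        # wrapping around the registry like the spec's modulo sweep.
        pos = bisect.bisect_left(self.indices, start)
        rotated = self.indices[pos:] + self.indices[:pos]
        return rotated[:bound]
```

Since the set of withdrawals derived is identical with or without the cache, this is purely a client-side performance optimization.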

So the computational complexity is no different from Capella, except that this PR iterates through more potential validators than Capella does.

The code in the spec is written in a slower way just for simplicity.

I understand that clients would have to use a different algorithm from the one specified in the spec, but, as far as I understand, we are okay with that (for example, the full-sweep strategy in #3307).

@ppopth (Member, Author) commented Oct 24, 2023

> The key constraint on the beacon chain is that the full state must be loaded in memory

> limit the max memory cost of a single state and related data structures

Do we have a requirement that the beacon state must fit in memory? If so, and the state grows rapidly, then yes, reusing the validator indices makes sense and is better.

If not, and we are allowed to keep some parts of the beacon state in memory and other parts in storage, the state size may not be an issue.

However, the state-size requirement seems counter-intuitive to me because the EL state is already large (smart-contract state, etc.), so having a large beacon state may not be an issue, especially since the beacon state may not grow as fast as the EL state.

@dapplion (Collaborator) commented Oct 25, 2023

> However, the state-size requirement seems counter-intuitive to me because the EL state is already large (smart-contract state, etc.), so having a large beacon state may not be an issue, especially since the beacon state may not grow as fast as the EL state.

The nature of EL vs CL state is very different; see below a summary comparison (from https://hackmd.io/@dapplion/eip6914#Why-is-this-a-problem):

|  | Execution state | Consensus state |
| --- | --- | --- |
| lookups | by hash | by index |
| reads per block | very sparse | all state is read at least once per epoch |
| total serialized size | many GBs | many MBs |

The fact that CL state lookups are done by index, and the need to do complete iterations "frequently" (once per epoch), restricts the design space. Note that neither of those things happens in EL clients.

> But the main idea of this PR is to encourage clients to keep a cache of unskippable validators and iterate through that cache instead of the set of all validators.

If I recall correctly, part of the appeal of the partial sweep is that it allows clients to not add any cache at all. Caches have been a good source of consensus failures, so not needing one is a positive. Of course clients make extensive use of caches today; I'm just pointing out the positives of cache reduction.

@ppopth (Member, Author) commented Oct 25, 2023

> If I recall correctly, part of the appeal of the partial sweep is that it allows clients to not add any cache at all. Caches have been a good source of consensus failures, so not needing one is a positive. Of course clients make extensive use of caches today; I'm just pointing out the positives of cache reduction.

I'm not sure what the partial sweep means, but I guess it's the current Capella sweep.

The full-sweep approach in EIP-6914 requires a cache as well, right? Otherwise, get_index_for_new_validator will be very slow, as shown in #3307.

So I think the cache stuff is not the advantage that EIP-6914 has over this PR.

@potuz (Contributor) commented Oct 26, 2023

Although @dapplion already mentioned both of these points, I'd like to stress them again:

  • We are forced to keep the full beacon state in memory, and not just one state but several in fact. There are techniques to keep one full state plus a bunch of diffs against others, but even with those techniques the amount of memory consumed is still several beacon states' worth (with the advantage that many more states can be recovered quickly).
  • Caches are pretty bad and have caused most of the consensus problems we see on devnets/testnets. Most of our day is currently spent reviewing edge cases around caches. The code complexity of having to deal with an extra cache, especially for a loop that is already very cheap, does not make it worth it.

I would suggest we hold off on any approach to dealing with this until it becomes a real UX problem (or at least is close to becoming one). Blocks are going to be packed with 16 withdrawals for the foreseeable future. When they are not, we can always increase the sweep size substantially, but realistically, by the time we get there we will have had to implement index reuse or other means to bound the state in memory, which is the real problem.
