Compute diff of validators list manually #6556
Draft
+382
−27
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Issue Addressed
From #5978 (review) I've seen that
xdelta3
struggles with very large inputs.The current approach of throwing the entire state to diff algorithm is nice because we don't need to deal with forks. However the state has only a few fields that are relevant for byte size
N = number of validators
Plotting the impact on total state size per field we get
So this PR explores handling the biggest offenders manually and leaving the rest to
xdelta3
Proposed Changes
Compute a special
ValidatorsDiff
which is:y
value if changed, or a default value if equal.(index, validator_diff)
to reduce the total input the compressor has to manage. This makes a huge difference in performance!Also re-use the existing logic for balance diffs for inactivity scores
Results
tree-states-archive
branchcommit f5b42e2
Times
compute hdiff n=1000000 v_mut=0 v_add=0 [35.187 ms 35.421 ms 35.843 ms]
compute hdiff n=1500000 v_mut=0 v_add=0 [54.489 ms 55.719 ms 58.521 ms]
compute hdiff n=2000000 v_mut=0 v_add=0 [73.075 ms 73.718 ms 75.247 ms]
compute hdiff n=1000000 v_mut=100 v_add=10 [342.67 ms 440.49 ms 598.40 ms]
compute hdiff n=1500000 v_mut=100 v_add=10 [518.29 ms 602.20 ms 743.59 ms]
compute hdiff n=2000000 v_mut=100 v_add=10 [649.77 ms 670.14 ms 700.15 ms]
compute hdiff n=1000000 v_mut=1000 v_add=100 [342.39 ms 353.71 ms 363.08 ms]
compute hdiff n=1500000 v_mut=1000 v_add=100 [506.88 ms 566.03 ms 643.94 ms]
compute hdiff n=2000000 v_mut=1000 v_add=100 [665.69 ms 814.25 ms 1.0424 s]
compute hdiff n=1000000 v_mut=100000 v_add=100 [1.0735 s 1.1561 s 1.2658 s]
compute hdiff n=1500000 v_mut=100000 v_add=100 [1.0565 s 1.1065 s 1.1676 s]
compute hdiff n=2000000 v_mut=100000 v_add=100 [1.6224 s 1.7003 s 1.8073 s]
Diff size
compute hdiff n=1000000 v_mut=1000 v_add=100 - 3,066,453
compute hdiff n=1500000 v_mut=1000 v_add=100 - 4,549,252
compute hdiff n=2000000 v_mut=1000 v_add=100 - 6,030,667
compute hdiff n=1000000 v_mut=100000 v_add=100 - 11,236,678
compute hdiff n=1500000 v_mut=100000 v_add=100 - 12,869,200
This branch
commit aaeffcc
compute hdiff n=1000000 v_mut=0 v_add=0 [25.360 ms 25.402 ms 25.484 ms]
compute hdiff n=1500000 v_mut=0 v_add=0 [38.702 ms 38.886 ms 39.273 ms]
compute hdiff n=2000000 v_mut=0 v_add=0 [52.190 ms 52.402 ms 52.895 ms]
compute hdiff n=1000000 v_mut=1000 v_add=100 [26.545 ms 26.652 ms 26.848 ms]
compute hdiff n=1500000 v_mut=1000 v_add=100 [39.236 ms 39.431 ms 39.772 ms]
compute hdiff n=2000000 v_mut=1000 v_add=100 [53.665 ms 53.904 ms 54.391 ms]
compute hdiff n=1000000 v_mut=100000 v_add=100 [102.26 ms 115.23 ms 145.19 ms]
compute hdiff n=1500000 v_mut=100000 v_add=100 [110.87 ms 112.32 ms 114.46 ms]
compute hdiff n=2000000 v_mut=100000 v_add=100 [116.55 ms 136.00 ms 176.70 ms]
Note: Storing all validator diffs increases compute time by 10x
Diff size
compute hdiff n=1000000 v_mut=1000 v_add=100 - 3,011,040
compute hdiff n=1500000 v_mut=1000 v_add=100 - 4,494,934
compute hdiff n=2000000 v_mut=1000 v_add=100 - 5,978,865