feat: implement BufferPool for PersistentCPStateCache #6269

Merged
merged 6 commits into from
Jan 18, 2024

Conversation

Contributor

twoeths commented Jan 9, 2024

Motivation

  • As the serializeState.test.ts benchmark shows, memory allocation is expensive, and we need to allocate memory to serialize the whole state at every epoch
  • Part of "feat: consume new state cache apis" (#6250), split out as a smaller diff to make it easier to review

Description

  • Implement BufferPool to improve state serialization and state reload
    • Define a GROW_RATIO so that more memory is allocated once when the limit is reached
    • Metrics show that using it does not increase heap memory
  • Implement a new DataStore backed by fs for debugging purposes; a flag to configure it will be added in the next PR
  • This does not affect current functionality, as the new state caches are not used anywhere in production code
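
The growth behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `GrowingBuffer`, its methods, and the ratio value `1.1` are all assumptions made for the example.

```typescript
// Hypothetical sketch: names and the 1.1 ratio are assumptions, not the PR's code.
const GROW_RATIO = 1.1;

class GrowingBuffer {
  private buffer: Uint8Array;

  constructor(initialSize: number) {
    // Reserve some headroom up front so small size increases do not reallocate.
    this.buffer = new Uint8Array(Math.floor(initialSize * GROW_RATIO));
  }

  /** Total bytes currently reserved. */
  get capacity(): number {
    return this.buffer.length;
  }

  /** Return a zeroed view of `size` bytes, reallocating only when the request exceeds capacity. */
  alloc(size: number): Uint8Array {
    if (size > this.buffer.length) {
      this.buffer = new Uint8Array(Math.floor(size * GROW_RATIO));
    }
    const view = this.buffer.subarray(0, size);
    view.fill(0);
    return view;
  }
}

const growing = new GrowingBuffer(1000);
const first = growing.alloc(1000); // fits in the initial reservation
const second = growing.alloc(1050); // still fits, so the same backing memory is reused
const third = growing.alloc(2000); // exceeds capacity, so the buffer grows once
```

The point of the headroom is that a state which grows slightly from epoch to epoch keeps reusing one backing buffer instead of triggering a fresh allocation each time.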

part of #5968

this.stateCache.add(postState);
this.checkpointStateCache.processState(blockRootHex, postState).catch((e) => {
Contributor Author

the current CheckpointStateCache implementation does nothing on this call

/**
* Prune or persist checkpoint states in an epoch, see the description in `processState()` function
*/
private async processPastEpoch(
Contributor Author

Nothing new in this function; it was moved out of processState() to make that function shorter

Contributor

github-actions bot commented Jan 9, 2024

Performance Report

✔️ no performance regression detected

🚀🚀 Significant benchmark improvement detected

| Benchmark suite | Current: 7a5d83d | Previous: 9eb9cce | Ratio |
| --- | --- | --- | --- |
| forkChoice updateHead vc 600000 bc 64 eq 300000 | 17.801 ms/op | 62.507 ms/op | 0.28 |
| getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 | 7.2960 us/op | 26.646 us/op | 0.27 |
| getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 | 7.2640 us/op | 27.636 us/op | 0.26 |

Full benchmark results
Benchmark suite Current: 7a5d83d Previous: 9eb9cce Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 794.56 us/op 791.59 us/op 1.00
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 79.151 us/op 103.01 us/op 0.77
BLS verify - blst-native 1.2629 ms/op 1.3680 ms/op 0.92
BLS verifyMultipleSignatures 3 - blst-native 2.6556 ms/op 2.8373 ms/op 0.94
BLS verifyMultipleSignatures 8 - blst-native 5.8221 ms/op 6.3555 ms/op 0.92
BLS verifyMultipleSignatures 32 - blst-native 21.318 ms/op 23.450 ms/op 0.91
BLS verifyMultipleSignatures 64 - blst-native 41.937 ms/op 46.093 ms/op 0.91
BLS verifyMultipleSignatures 128 - blst-native 83.303 ms/op 91.733 ms/op 0.91
BLS deserializing 10000 signatures 912.12 ms/op 958.74 ms/op 0.95
BLS deserializing 100000 signatures 9.2733 s/op 9.7453 s/op 0.95
BLS verifyMultipleSignatures - same message - 3 - blst-native 1.3311 ms/op 1.5295 ms/op 0.87
BLS verifyMultipleSignatures - same message - 8 - blst-native 1.5022 ms/op 1.6159 ms/op 0.93
BLS verifyMultipleSignatures - same message - 32 - blst-native 2.3267 ms/op 2.4445 ms/op 0.95
BLS verifyMultipleSignatures - same message - 64 - blst-native 4.4930 ms/op 3.9249 ms/op 1.14
BLS verifyMultipleSignatures - same message - 128 - blst-native 5.7811 ms/op 6.1466 ms/op 0.94
BLS aggregatePubkeys 32 - blst-native 27.019 us/op 28.447 us/op 0.95
BLS aggregatePubkeys 128 - blst-native 100.11 us/op 110.03 us/op 0.91
getAttestationsForBlock 43.718 ms/op 52.539 ms/op 0.83
getSlashingsAndExits - default max 172.19 us/op 168.34 us/op 1.02
getSlashingsAndExits - 2k 383.75 us/op 500.70 us/op 0.77
proposeBlockBody type=full, size=empty 5.1087 ms/op 6.3014 ms/op 0.81
isKnown best case - 1 super set check 313.00 ns/op 375.00 ns/op 0.83
isKnown normal case - 2 super set checks 296.00 ns/op 400.00 ns/op 0.74
isKnown worse case - 16 super set checks 294.00 ns/op 436.00 ns/op 0.67
CheckpointStateCache - add get delete 4.9950 us/op 6.3900 us/op 0.78
validate api signedAggregateAndProof - struct 2.7873 ms/op 3.0515 ms/op 0.91
validate gossip signedAggregateAndProof - struct 2.7760 ms/op 3.0198 ms/op 0.92
validate gossip attestation - vc 640000 1.3666 ms/op 1.4890 ms/op 0.92
batch validate gossip attestation - vc 640000 - chunk 32 168.75 us/op 181.00 us/op 0.93
batch validate gossip attestation - vc 640000 - chunk 64 149.73 us/op 184.28 us/op 0.81
batch validate gossip attestation - vc 640000 - chunk 128 134.43 us/op 157.37 us/op 0.85
batch validate gossip attestation - vc 640000 - chunk 256 132.96 us/op 138.16 us/op 0.96
pickEth1Vote - no votes 1.2320 ms/op 1.3260 ms/op 0.93
pickEth1Vote - max votes 8.5860 ms/op 13.394 ms/op 0.64
pickEth1Vote - Eth1Data hashTreeRoot value x2048 15.614 ms/op 21.799 ms/op 0.72
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 25.400 ms/op 30.195 ms/op 0.84
pickEth1Vote - Eth1Data fastSerialize value x2048 637.32 us/op 648.50 us/op 0.98
pickEth1Vote - Eth1Data fastSerialize tree x2048 6.4064 ms/op 7.0413 ms/op 0.91
bytes32 toHexString 527.00 ns/op 547.00 ns/op 0.96
bytes32 Buffer.toString(hex) 295.00 ns/op 293.00 ns/op 1.01
bytes32 Buffer.toString(hex) from Uint8Array 473.00 ns/op 438.00 ns/op 1.08
bytes32 Buffer.toString(hex) + 0x 292.00 ns/op 290.00 ns/op 1.01
Object access 1 prop 0.16700 ns/op 0.16600 ns/op 1.01
Map access 1 prop 0.15700 ns/op 0.15200 ns/op 1.03
Object get x1000 7.2240 ns/op 8.0580 ns/op 0.90
Map get x1000 0.76300 ns/op 0.79600 ns/op 0.96
Object set x1000 49.588 ns/op 53.724 ns/op 0.92
Map set x1000 39.502 ns/op 42.656 ns/op 0.93
Return object 10000 times 0.23760 ns/op 0.25040 ns/op 0.95
Throw Error 10000 times 3.8810 us/op 3.9591 us/op 0.98
fastMsgIdFn sha256 / 200 bytes 3.2680 us/op 3.4080 us/op 0.96
fastMsgIdFn h32 xxhash / 200 bytes 286.00 ns/op 294.00 ns/op 0.97
fastMsgIdFn h64 xxhash / 200 bytes 357.00 ns/op 346.00 ns/op 1.03
fastMsgIdFn sha256 / 1000 bytes 11.313 us/op 12.100 us/op 0.93
fastMsgIdFn h32 xxhash / 1000 bytes 417.00 ns/op 459.00 ns/op 0.91
fastMsgIdFn h64 xxhash / 1000 bytes 427.00 ns/op 445.00 ns/op 0.96
fastMsgIdFn sha256 / 10000 bytes 103.62 us/op 105.99 us/op 0.98
fastMsgIdFn h32 xxhash / 10000 bytes 1.9360 us/op 2.0250 us/op 0.96
fastMsgIdFn h64 xxhash / 10000 bytes 1.3240 us/op 1.4000 us/op 0.95
send data - 1000 256B messages 18.377 ms/op 20.806 ms/op 0.88
send data - 1000 512B messages 23.846 ms/op 27.238 ms/op 0.88
send data - 1000 1024B messages 41.576 ms/op 42.930 ms/op 0.97
send data - 1000 1200B messages 39.707 ms/op 38.623 ms/op 1.03
send data - 1000 2048B messages 48.201 ms/op 55.818 ms/op 0.86
send data - 1000 4096B messages 42.478 ms/op 49.729 ms/op 0.85
send data - 1000 16384B messages 117.23 ms/op 118.16 ms/op 0.99
send data - 1000 65536B messages 416.22 ms/op 556.71 ms/op 0.75
enrSubnets - fastDeserialize 64 bits 1.2590 us/op 1.7200 us/op 0.73
enrSubnets - ssz BitVector 64 bits 418.00 ns/op 620.00 ns/op 0.67
enrSubnets - fastDeserialize 4 bits 172.00 ns/op 278.00 ns/op 0.62
enrSubnets - ssz BitVector 4 bits 419.00 ns/op 771.00 ns/op 0.54
prioritizePeers score -10:0 att 32-0.1 sync 2-0 111.72 us/op 158.24 us/op 0.71
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 137.77 us/op 174.28 us/op 0.79
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 187.09 us/op 246.50 us/op 0.76
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 328.30 us/op 425.84 us/op 0.77
prioritizePeers score 0:0 att 64-1 sync 4-1 368.00 us/op 431.71 us/op 0.85
array of 16000 items push then shift 1.6333 us/op 2.0011 us/op 0.82
LinkedList of 16000 items push then shift 8.9880 ns/op 11.875 ns/op 0.76
array of 16000 items push then pop 85.267 ns/op 131.59 ns/op 0.65
LinkedList of 16000 items push then pop 8.8090 ns/op 12.246 ns/op 0.72
array of 24000 items push then shift 2.4999 us/op 2.8680 us/op 0.87
LinkedList of 24000 items push then shift 9.5070 ns/op 10.988 ns/op 0.87
array of 24000 items push then pop 154.22 ns/op 198.24 ns/op 0.78
LinkedList of 24000 items push then pop 8.7900 ns/op 11.729 ns/op 0.75
intersect bitArray bitLen 8 6.6570 ns/op 7.9110 ns/op 0.84
intersect array and set length 8 67.152 ns/op 109.27 ns/op 0.61
intersect bitArray bitLen 128 34.347 ns/op 42.096 ns/op 0.82
intersect array and set length 128 941.18 ns/op 1.3022 us/op 0.72
bitArray.getTrueBitIndexes() bitLen 128 1.6460 us/op 2.1540 us/op 0.76
bitArray.getTrueBitIndexes() bitLen 248 2.7010 us/op 3.5600 us/op 0.76
bitArray.getTrueBitIndexes() bitLen 512 5.5960 us/op 8.6730 us/op 0.65
Buffer.concat 32 items 1.0780 us/op 1.4250 us/op 0.76
Uint8Array.set 32 items 1.9290 us/op 2.6590 us/op 0.73
Set add up to 64 items then delete first 4.3413 us/op 6.0124 us/op 0.72
OrderedSet add up to 64 items then delete first 5.4768 us/op 8.1531 us/op 0.67
Set add up to 64 items then delete last 4.6553 us/op 6.4095 us/op 0.73
OrderedSet add up to 64 items then delete last 5.8845 us/op 9.0354 us/op 0.65
Set add up to 64 items then delete middle 4.6776 us/op 6.8116 us/op 0.69
OrderedSet add up to 64 items then delete middle 7.5810 us/op 10.682 us/op 0.71
Set add up to 128 items then delete first 9.9068 us/op 12.387 us/op 0.80
OrderedSet add up to 128 items then delete first 13.593 us/op 16.667 us/op 0.82
Set add up to 128 items then delete last 9.4236 us/op 11.479 us/op 0.82
OrderedSet add up to 128 items then delete last 12.917 us/op 19.986 us/op 0.65
Set add up to 128 items then delete middle 10.348 us/op 12.675 us/op 0.82
OrderedSet add up to 128 items then delete middle 18.197 us/op 27.400 us/op 0.66
Set add up to 256 items then delete first 19.577 us/op 25.894 us/op 0.76
OrderedSet add up to 256 items then delete first 26.171 us/op 34.277 us/op 0.76
Set add up to 256 items then delete last 18.992 us/op 25.513 us/op 0.74
OrderedSet add up to 256 items then delete last 23.923 us/op 42.812 us/op 0.56
Set add up to 256 items then delete middle 18.578 us/op 24.615 us/op 0.75
OrderedSet add up to 256 items then delete middle 45.224 us/op 66.267 us/op 0.68
transfer serialized Status (84 B) 1.9060 us/op 2.5780 us/op 0.74
copy serialized Status (84 B) 1.5910 us/op 2.1110 us/op 0.75
transfer serialized SignedVoluntaryExit (112 B) 2.0720 us/op 2.5620 us/op 0.81
copy serialized SignedVoluntaryExit (112 B) 1.7430 us/op 2.3970 us/op 0.73
transfer serialized ProposerSlashing (416 B) 3.1190 us/op 3.2320 us/op 0.97
copy serialized ProposerSlashing (416 B) 3.0430 us/op 3.3670 us/op 0.90
transfer serialized Attestation (485 B) 3.2930 us/op 3.8990 us/op 0.84
copy serialized Attestation (485 B) 3.0480 us/op 3.7870 us/op 0.80
transfer serialized AttesterSlashing (33232 B) 3.2840 us/op 5.5450 us/op 0.59
copy serialized AttesterSlashing (33232 B) 7.4950 us/op 12.540 us/op 0.60
transfer serialized Small SignedBeaconBlock (128000 B) 3.6790 us/op 4.6560 us/op 0.79
copy serialized Small SignedBeaconBlock (128000 B) 16.404 us/op 40.181 us/op 0.41
transfer serialized Avg SignedBeaconBlock (200000 B) 4.2010 us/op 5.7870 us/op 0.73
copy serialized Avg SignedBeaconBlock (200000 B) 35.218 us/op 53.403 us/op 0.66
transfer serialized BlobsSidecar (524380 B) 4.0320 us/op 6.2430 us/op 0.65
copy serialized BlobsSidecar (524380 B) 87.580 us/op 336.52 us/op 0.26
transfer serialized Big SignedBeaconBlock (1000000 B) 4.1060 us/op 14.067 us/op 0.29
copy serialized Big SignedBeaconBlock (1000000 B) 167.56 us/op 492.89 us/op 0.34
pass gossip attestations to forkchoice per slot 4.4384 ms/op 5.1405 ms/op 0.86
forkChoice updateHead vc 100000 bc 64 eq 0 702.32 us/op 1.0162 ms/op 0.69
forkChoice updateHead vc 600000 bc 64 eq 0 4.9129 ms/op 8.6248 ms/op 0.57
forkChoice updateHead vc 1000000 bc 64 eq 0 7.3423 ms/op 11.635 ms/op 0.63
forkChoice updateHead vc 600000 bc 320 eq 0 4.3284 ms/op 5.4436 ms/op 0.80
forkChoice updateHead vc 600000 bc 1200 eq 0 4.5234 ms/op 7.2623 ms/op 0.62
forkChoice updateHead vc 600000 bc 7200 eq 0 5.5352 ms/op 7.7341 ms/op 0.72
forkChoice updateHead vc 600000 bc 64 eq 1000 11.368 ms/op 15.114 ms/op 0.75
forkChoice updateHead vc 600000 bc 64 eq 10000 12.062 ms/op 16.043 ms/op 0.75
forkChoice updateHead vc 600000 bc 64 eq 300000 17.801 ms/op 62.507 ms/op 0.28
computeDeltas 500000 validators 300 proto nodes 6.8781 ms/op 8.3379 ms/op 0.82
computeDeltas 500000 validators 1200 proto nodes 6.8345 ms/op 8.4451 ms/op 0.81
computeDeltas 500000 validators 7200 proto nodes 6.3943 ms/op 8.1976 ms/op 0.78
computeDeltas 750000 validators 300 proto nodes 9.7047 ms/op 12.277 ms/op 0.79
computeDeltas 750000 validators 1200 proto nodes 10.064 ms/op 12.609 ms/op 0.80
computeDeltas 750000 validators 7200 proto nodes 10.195 ms/op 11.994 ms/op 0.85
computeDeltas 1400000 validators 300 proto nodes 18.937 ms/op 23.369 ms/op 0.81
computeDeltas 1400000 validators 1200 proto nodes 18.531 ms/op 24.034 ms/op 0.77
computeDeltas 1400000 validators 7200 proto nodes 19.323 ms/op 24.347 ms/op 0.79
computeDeltas 2100000 validators 300 proto nodes 28.040 ms/op 32.664 ms/op 0.86
computeDeltas 2100000 validators 1200 proto nodes 28.292 ms/op 32.968 ms/op 0.86
computeDeltas 2100000 validators 7200 proto nodes 27.721 ms/op 34.412 ms/op 0.81
computeProposerBoostScoreFromBalances 500000 validators 3.7513 ms/op 4.5806 ms/op 0.82
computeProposerBoostScoreFromBalances 750000 validators 3.5985 ms/op 4.6154 ms/op 0.78
computeProposerBoostScoreFromBalances 1400000 validators 3.5298 ms/op 4.7396 ms/op 0.74
computeProposerBoostScoreFromBalances 2100000 validators 3.6065 ms/op 4.5402 ms/op 0.79
altair processAttestation - 250000 vs - 7PWei normalcase 2.2261 ms/op 5.0228 ms/op 0.44
altair processAttestation - 250000 vs - 7PWei worstcase 3.2910 ms/op 6.1000 ms/op 0.54
altair processAttestation - setStatus - 1/6 committees join 178.37 us/op 281.64 us/op 0.63
altair processAttestation - setStatus - 1/3 committees join 325.92 us/op 459.03 us/op 0.71
altair processAttestation - setStatus - 1/2 committees join 461.26 us/op 624.70 us/op 0.74
altair processAttestation - setStatus - 2/3 committees join 589.87 us/op 873.25 us/op 0.68
altair processAttestation - setStatus - 4/5 committees join 782.44 us/op 1.1522 ms/op 0.68
altair processAttestation - setStatus - 100% committees join 879.79 us/op 1.1969 ms/op 0.74
altair processBlock - 250000 vs - 7PWei normalcase 10.522 ms/op 17.274 ms/op 0.61
altair processBlock - 250000 vs - 7PWei normalcase hashState 44.050 ms/op 57.265 ms/op 0.77
altair processBlock - 250000 vs - 7PWei worstcase 39.148 ms/op 58.490 ms/op 0.67
altair processBlock - 250000 vs - 7PWei worstcase hashState 98.804 ms/op 139.63 ms/op 0.71
phase0 processBlock - 250000 vs - 7PWei normalcase 2.9702 ms/op 5.4960 ms/op 0.54
phase0 processBlock - 250000 vs - 7PWei worstcase 29.493 ms/op 40.786 ms/op 0.72
altair processEth1Data - 250000 vs - 7PWei normalcase 507.38 us/op 1.0833 ms/op 0.47
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 7.2960 us/op 26.646 us/op 0.27
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 43.387 us/op 126.28 us/op 0.34
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 18.796 us/op 31.306 us/op 0.60
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 7.2640 us/op 27.636 us/op 0.26
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 183.51 us/op 235.91 us/op 0.78
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 1.0331 ms/op 2.9163 ms/op 0.35
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 1.4752 ms/op 3.2881 ms/op 0.45
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 1.4871 ms/op 2.8855 ms/op 0.52
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 3.4654 ms/op 5.7392 ms/op 0.60
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 2.7168 ms/op 3.4394 ms/op 0.79
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 4.9557 ms/op 10.660 ms/op 0.46
Tree 40 250000 create 345.66 ms/op 855.15 ms/op 0.40
Tree 40 250000 get(125000) 193.01 ns/op 230.35 ns/op 0.84
Tree 40 250000 set(125000) 872.10 ns/op 1.3870 us/op 0.63
Tree 40 250000 toArray() 17.503 ms/op 24.882 ms/op 0.70
Tree 40 250000 iterate all - toArray() + loop 17.572 ms/op 28.085 ms/op 0.63
Tree 40 250000 iterate all - get(i) 62.963 ms/op 82.785 ms/op 0.76
MutableVector 250000 create 14.197 ms/op 22.605 ms/op 0.63
MutableVector 250000 get(125000) 6.5110 ns/op 7.0330 ns/op 0.93
MutableVector 250000 set(125000) 276.39 ns/op 438.14 ns/op 0.63
MutableVector 250000 toArray() 3.1697 ms/op 5.6022 ms/op 0.57
MutableVector 250000 iterate all - toArray() + loop 3.8662 ms/op 5.6824 ms/op 0.68
MutableVector 250000 iterate all - get(i) 1.5257 ms/op 1.7159 ms/op 0.89
Array 250000 create 2.8020 ms/op 6.0469 ms/op 0.46
Array 250000 clone - spread 1.2148 ms/op 2.4923 ms/op 0.49
Array 250000 get(125000) 1.0340 ns/op 2.4120 ns/op 0.43
Array 250000 set(125000) 4.1000 ns/op 6.1790 ns/op 0.66
Array 250000 iterate all - loop 164.78 us/op 198.24 us/op 0.83
effectiveBalanceIncrements clone Uint8Array 300000 26.672 us/op 83.049 us/op 0.32
effectiveBalanceIncrements clone MutableVector 300000 356.00 ns/op 538.00 ns/op 0.66
effectiveBalanceIncrements rw all Uint8Array 300000 197.96 us/op 209.70 us/op 0.94
effectiveBalanceIncrements rw all MutableVector 300000 80.706 ms/op 140.29 ms/op 0.58
phase0 afterProcessEpoch - 250000 vs - 7PWei 112.38 ms/op 126.02 ms/op 0.89
phase0 beforeProcessEpoch - 250000 vs - 7PWei 52.181 ms/op 61.049 ms/op 0.85
altair processEpoch - mainnet_e81889 486.94 ms/op 616.40 ms/op 0.79
mainnet_e81889 - altair beforeProcessEpoch 79.089 ms/op 123.82 ms/op 0.64
mainnet_e81889 - altair processJustificationAndFinalization 15.100 us/op 26.018 us/op 0.58
mainnet_e81889 - altair processInactivityUpdates 6.0451 ms/op 8.1916 ms/op 0.74
mainnet_e81889 - altair processRewardsAndPenalties 61.965 ms/op 74.802 ms/op 0.83
mainnet_e81889 - altair processRegistryUpdates 2.4470 us/op 3.7960 us/op 0.64
mainnet_e81889 - altair processSlashings 417.00 ns/op 727.00 ns/op 0.57
mainnet_e81889 - altair processEth1DataReset 484.00 ns/op 935.00 ns/op 0.52
mainnet_e81889 - altair processEffectiveBalanceUpdates 1.4235 ms/op 1.9928 ms/op 0.71
mainnet_e81889 - altair processSlashingsReset 3.7710 us/op 9.6330 us/op 0.39
mainnet_e81889 - altair processRandaoMixesReset 4.0990 us/op 8.3430 us/op 0.49
mainnet_e81889 - altair processHistoricalRootsUpdate 840.00 ns/op 882.00 ns/op 0.95
mainnet_e81889 - altair processParticipationFlagUpdates 1.3660 us/op 2.6210 us/op 0.52
mainnet_e81889 - altair processSyncCommitteeUpdates 566.00 ns/op 1.0270 us/op 0.55
mainnet_e81889 - altair afterProcessEpoch 117.47 ms/op 127.13 ms/op 0.92
capella processEpoch - mainnet_e217614 2.1016 s/op 2.5535 s/op 0.82
mainnet_e217614 - capella beforeProcessEpoch 494.46 ms/op 540.00 ms/op 0.92
mainnet_e217614 - capella processJustificationAndFinalization 14.692 us/op 30.922 us/op 0.48
mainnet_e217614 - capella processInactivityUpdates 17.506 ms/op 28.395 ms/op 0.62
mainnet_e217614 - capella processRewardsAndPenalties 394.95 ms/op 463.85 ms/op 0.85
mainnet_e217614 - capella processRegistryUpdates 25.771 us/op 30.967 us/op 0.83
mainnet_e217614 - capella processSlashings 618.00 ns/op 814.00 ns/op 0.76
mainnet_e217614 - capella processEth1DataReset 402.00 ns/op 714.00 ns/op 0.56
mainnet_e217614 - capella processEffectiveBalanceUpdates 4.5767 ms/op 6.3268 ms/op 0.72
mainnet_e217614 - capella processSlashingsReset 3.3580 us/op 6.4720 us/op 0.52
mainnet_e217614 - capella processRandaoMixesReset 4.9930 us/op 7.5620 us/op 0.66
mainnet_e217614 - capella processHistoricalRootsUpdate 433.00 ns/op 1.1990 us/op 0.36
mainnet_e217614 - capella processParticipationFlagUpdates 1.4870 us/op 4.3360 us/op 0.34
mainnet_e217614 - capella afterProcessEpoch 315.35 ms/op 371.76 ms/op 0.85
phase0 processEpoch - mainnet_e58758 426.13 ms/op 739.45 ms/op 0.58
mainnet_e58758 - phase0 beforeProcessEpoch 111.83 ms/op 203.35 ms/op 0.55
mainnet_e58758 - phase0 processJustificationAndFinalization 14.746 us/op 32.653 us/op 0.45
mainnet_e58758 - phase0 processRewardsAndPenalties 52.631 ms/op 64.884 ms/op 0.81
mainnet_e58758 - phase0 processRegistryUpdates 9.3540 us/op 25.976 us/op 0.36
mainnet_e58758 - phase0 processSlashings 579.00 ns/op 1.0570 us/op 0.55
mainnet_e58758 - phase0 processEth1DataReset 401.00 ns/op 1.1170 us/op 0.36
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 1.0985 ms/op 1.6747 ms/op 0.66
mainnet_e58758 - phase0 processSlashingsReset 2.6530 us/op 7.5670 us/op 0.35
mainnet_e58758 - phase0 processRandaoMixesReset 4.0430 us/op 9.4380 us/op 0.43
mainnet_e58758 - phase0 processHistoricalRootsUpdate 401.00 ns/op 1.1370 us/op 0.35
mainnet_e58758 - phase0 processParticipationRecordUpdates 4.2140 us/op 9.3040 us/op 0.45
mainnet_e58758 - phase0 afterProcessEpoch 92.387 ms/op 105.97 ms/op 0.87
phase0 processEffectiveBalanceUpdates - 250000 normalcase 1.3227 ms/op 2.1864 ms/op 0.60
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.5377 ms/op 2.5729 ms/op 0.60
altair processInactivityUpdates - 250000 normalcase 28.024 ms/op 39.608 ms/op 0.71
altair processInactivityUpdates - 250000 worstcase 26.914 ms/op 50.352 ms/op 0.53
phase0 processRegistryUpdates - 250000 normalcase 8.1800 us/op 20.662 us/op 0.40
phase0 processRegistryUpdates - 250000 badcase_full_deposits 271.50 us/op 580.12 us/op 0.47
phase0 processRegistryUpdates - 250000 worstcase 0.5 143.40 ms/op 183.57 ms/op 0.78
altair processRewardsAndPenalties - 250000 normalcase 60.825 ms/op 68.748 ms/op 0.88
altair processRewardsAndPenalties - 250000 worstcase 59.431 ms/op 84.115 ms/op 0.71
phase0 getAttestationDeltas - 250000 normalcase 9.2098 ms/op 14.683 ms/op 0.63
phase0 getAttestationDeltas - 250000 worstcase 9.1673 ms/op 13.037 ms/op 0.70
phase0 processSlashings - 250000 worstcase 86.069 us/op 126.56 us/op 0.68
altair processSyncCommitteeUpdates - 250000 166.65 ms/op 213.77 ms/op 0.78
BeaconState.hashTreeRoot - No change 265.00 ns/op 303.00 ns/op 0.87
BeaconState.hashTreeRoot - 1 full validator 128.49 us/op 178.64 us/op 0.72
BeaconState.hashTreeRoot - 32 full validator 1.6303 ms/op 1.9040 ms/op 0.86
BeaconState.hashTreeRoot - 512 full validator 21.116 ms/op 22.272 ms/op 0.95
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 178.26 us/op 245.27 us/op 0.73
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 2.3757 ms/op 3.0218 ms/op 0.79
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 35.512 ms/op 37.984 ms/op 0.93
BeaconState.hashTreeRoot - 1 balances 171.60 us/op 166.06 us/op 1.03
BeaconState.hashTreeRoot - 32 balances 1.4198 ms/op 1.5801 ms/op 0.90
BeaconState.hashTreeRoot - 512 balances 13.814 ms/op 15.919 ms/op 0.87
BeaconState.hashTreeRoot - 250000 balances 228.07 ms/op 291.74 ms/op 0.78
aggregationBits - 2048 els - zipIndexesInBitList 18.369 us/op 34.288 us/op 0.54
byteArrayEquals 32 76.896 ns/op 90.774 ns/op 0.85
Buffer.compare 32 57.047 ns/op 58.936 ns/op 0.97
byteArrayEquals 1024 2.0963 us/op 2.1640 us/op 0.97
Buffer.compare 1024 72.313 ns/op 81.118 ns/op 0.89
byteArrayEquals 16384 33.210 us/op 37.212 us/op 0.89
Buffer.compare 16384 254.10 ns/op 317.16 ns/op 0.80
byteArrayEquals 123687377 245.59 ms/op 285.15 ms/op 0.86
Buffer.compare 123687377 6.1528 ms/op 8.4471 ms/op 0.73
byteArrayEquals 32 - diff last byte 72.164 ns/op 83.331 ns/op 0.87
Buffer.compare 32 - diff last byte 58.038 ns/op 60.469 ns/op 0.96
byteArrayEquals 1024 - diff last byte 2.0513 us/op 2.1362 us/op 0.96
Buffer.compare 1024 - diff last byte 72.511 ns/op 76.499 ns/op 0.95
byteArrayEquals 16384 - diff last byte 32.520 us/op 36.493 us/op 0.89
Buffer.compare 16384 - diff last byte 256.91 ns/op 318.87 ns/op 0.81
byteArrayEquals 123687377 - diff last byte 245.45 ms/op 296.69 ms/op 0.83
Buffer.compare 123687377 - diff last byte 6.3032 ms/op 9.1159 ms/op 0.69
byteArrayEquals 32 - random bytes 5.3230 ns/op 7.1290 ns/op 0.75
Buffer.compare 32 - random bytes 60.614 ns/op 70.094 ns/op 0.86
byteArrayEquals 1024 - random bytes 5.2550 ns/op 6.1880 ns/op 0.85
Buffer.compare 1024 - random bytes 59.677 ns/op 66.605 ns/op 0.90
byteArrayEquals 16384 - random bytes 5.2440 ns/op 6.3000 ns/op 0.83
Buffer.compare 16384 - random bytes 59.733 ns/op 69.270 ns/op 0.86
byteArrayEquals 123687377 - random bytes 8.7000 ns/op 9.2700 ns/op 0.94
Buffer.compare 123687377 - random bytes 63.280 ns/op 74.540 ns/op 0.85
regular array get 100000 times 44.254 us/op 49.394 us/op 0.90
wrappedArray get 100000 times 44.174 us/op 48.543 us/op 0.91
arrayWithProxy get 100000 times 14.290 ms/op 15.502 ms/op 0.92
ssz.Root.equals 54.520 ns/op 59.572 ns/op 0.92
byteArrayEquals 53.297 ns/op 58.565 ns/op 0.91
Buffer.compare 10.844 ns/op 12.540 ns/op 0.86
shuffle list - 16384 els 6.9362 ms/op 7.3903 ms/op 0.94
shuffle list - 250000 els 102.32 ms/op 108.71 ms/op 0.94
processSlot - 1 slots 18.181 us/op 19.208 us/op 0.95
processSlot - 32 slots 3.3013 ms/op 4.5259 ms/op 0.73
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 60.710 ms/op 60.510 ms/op 1.00
getCommitteeAssignments - req 1 vs - 250000 vc 2.5329 ms/op 2.7159 ms/op 0.93
getCommitteeAssignments - req 100 vs - 250000 vc 3.7570 ms/op 4.0714 ms/op 0.92
getCommitteeAssignments - req 1000 vs - 250000 vc 4.0970 ms/op 4.3140 ms/op 0.95
findModifiedValidators - 10000 modified validators 546.41 ms/op 573.92 ms/op 0.95
findModifiedValidators - 1000 modified validators 420.55 ms/op 447.02 ms/op 0.94
findModifiedValidators - 100 modified validators 426.39 ms/op 458.15 ms/op 0.93
findModifiedValidators - 10 modified validators 388.02 ms/op 463.46 ms/op 0.84
findModifiedValidators - 1 modified validators 386.85 ms/op 485.65 ms/op 0.80
findModifiedValidators - no difference 395.86 ms/op 455.52 ms/op 0.87
compare ViewDUs 4.3453 s/op 5.1169 s/op 0.85
compare each validator Uint8Array 1.7407 s/op 1.9278 s/op 0.90
compare ViewDU to Uint8Array 1.1985 s/op 1.4412 s/op 0.83
migrate state 1000000 validators, 24 modified, 0 new 760.61 ms/op 903.31 ms/op 0.84
migrate state 1000000 validators, 1700 modified, 1000 new 1.0826 s/op 1.2271 s/op 0.88
migrate state 1000000 validators, 3400 modified, 2000 new 1.2920 s/op 1.5579 s/op 0.83
migrate state 1500000 validators, 24 modified, 0 new 797.58 ms/op 1.0198 s/op 0.78
migrate state 1500000 validators, 1700 modified, 1000 new 1.1021 s/op 1.4855 s/op 0.74
migrate state 1500000 validators, 3400 modified, 2000 new 1.3025 s/op 1.9378 s/op 0.67
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 4.3700 ns/op 6.4800 ns/op 0.67
state getBlockRootAtSlot - 250000 vs - 7PWei 675.98 ns/op 819.92 ns/op 0.82
computeProposers - vc 250000 9.9468 ms/op 11.624 ms/op 0.86
computeEpochShuffling - vc 250000 103.09 ms/op 119.66 ms/op 0.86
getNextSyncCommittee - vc 250000 154.34 ms/op 225.91 ms/op 0.68
computeSigningRoot for AttestationData 23.429 us/op 33.525 us/op 0.70
hash AttestationData serialized data then Buffer.toString(base64) 2.3534 us/op 2.7060 us/op 0.87
toHexString serialized data 1.0529 us/op 1.5569 us/op 0.68
Buffer.toString(base64) 212.21 ns/op 255.00 ns/op 0.83

by benchmarkbot/action

@twoeths twoeths marked this pull request as ready for review January 9, 2024 10:05
@twoeths twoeths requested a review from a team as a code owner January 9, 2024 10:05
codecov bot commented Jan 12, 2024

Codecov Report

Merging #6269 (1fc6ed9) into unstable (ea49409) will decrease coverage by 3.86%.
Report is 31 commits behind head on unstable.
The diff coverage is n/a.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #6269      +/-   ##
============================================
- Coverage     80.38%   76.53%   -3.86%     
============================================
  Files           202      248      +46     
  Lines         19622    25943    +6321     
  Branches       1176     1449     +273     
============================================
+ Hits          15773    19855    +4082     
- Misses         3821     6058    +2237     
- Partials         28       30       +2     

const CHECKPOINT_FILE_NAME_LENGTH = 82;

/**
* Implementation of CPStatePersistentApis using file system, this is beneficial for debugging.
Member

Suggested change
- * Implementation of CPStatePersistentApis using file system, this is beneficial for debugging.
+ * Implementation of CPStateDatastore using file system, this is beneficial for debugging.
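
As an aside on the `CHECKPOINT_FILE_NAME_LENGTH = 82` constant above: one plausible reading is that each file is named after the hex encoding of a serialized checkpoint (8-byte epoch plus 32-byte root, so 40 bytes, 80 hex characters, plus the `0x` prefix). The sketch below only illustrates that arithmetic. The `Checkpoint` type and the `checkpointToFileName` helper are hypothetical and are not taken from this PR.

```typescript
// Hypothetical helper; assumes the file name is the 0x-prefixed hex of
// an 8-byte little-endian epoch followed by a 32-byte root.
type Checkpoint = {epoch: number; root: Uint8Array};

function checkpointToFileName(cp: Checkpoint): string {
  if (cp.root.length !== 32) {
    throw new Error(`expected a 32-byte root, got ${cp.root.length} bytes`);
  }
  const bytes = new Uint8Array(40);
  // Epoch goes first as a little-endian uint64, then the root.
  new DataView(bytes.buffer).setBigUint64(0, BigInt(cp.epoch), true);
  bytes.set(cp.root, 8);
  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
  return "0x" + hex;
}

const fileName = checkpointToFileName({epoch: 3, root: new Uint8Array(32).fill(0xab)});
```

Under that assumption, a fixed name length gives a cheap way to tell checkpoint files apart from anything else found in the directory.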

Comment on lines 630 to 641
let bufferPoolKey: number | undefined = undefined;
try {
  const timer = this.metrics?.statePersistDuration.startTimer();
  const stateBytesWithKey = this.serializeState(state);
  bufferPoolKey = stateBytesWithKey.key;
  persistedKey = await this.datastore.write(cpPersist, stateBytesWithKey.data);
  timer?.();
} finally {
  if (bufferPoolKey !== undefined) {
    this.bufferPool?.free(bufferPoolKey);
  }
}
Member

this pattern seems like the ideal use case for the `using` pattern from

Suggested change
- let bufferPoolKey: number | undefined = undefined;
- try {
-   const timer = this.metrics?.statePersistDuration.startTimer();
-   const stateBytesWithKey = this.serializeState(state);
-   bufferPoolKey = stateBytesWithKey.key;
-   persistedKey = await this.datastore.write(cpPersist, stateBytesWithKey.data);
-   timer?.();
- } finally {
-   if (bufferPoolKey !== undefined) {
-     this.bufferPool?.free(bufferPoolKey);
-   }
- }
+ {
+   const timer = this.metrics?.statePersistDuration.startTimer();
+   using stateBytesWithKey = this.serializeState(state);
+   bufferPoolKey = stateBytesWithKey.key;
+   persistedKey = await this.datastore.write(cpPersist, stateBytesWithKey.data);
+   timer?.();
+ }

I'm not sure that we should use this feature, but it's worth knowing about.

Contributor Author

@wemeetagain thanks, it's a nice feature and really suitable for this spot in the code; implemented it 👍
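
For readers unfamiliar with `using`, here is a small self-contained sketch of the idea discussed in this thread: an object that releases its pooled buffer when it goes out of scope, even if the enclosing function throws. It assumes TypeScript 5.2 or later with the `esnext.disposable` lib enabled. `PooledBytes`, `persist`, and `persistAndFail` are hypothetical names for this example, not the types added in this PR.

```typescript
// Assumption: TypeScript 5.2+ with the "esnext.disposable" lib.
// The next line is a fallback for runtimes that do not define Symbol.dispose yet.
(Symbol as any).dispose ??= Symbol("Symbol.dispose");

/** Hypothetical wrapper: holds bytes and runs a release callback on disposal. */
class PooledBytes implements Disposable {
  constructor(
    readonly data: Uint8Array,
    private readonly release: () => void
  ) {}

  [Symbol.dispose](): void {
    this.release();
  }
}

const events: string[] = [];

function persist(): void {
  using bytes = new PooledBytes(new Uint8Array(4), () => events.push("freed"));
  events.push(`wrote ${bytes.data.length} bytes`);
  // `bytes` is disposed here, when the scope ends.
}

function persistAndFail(): void {
  using bytes = new PooledBytes(new Uint8Array(4), () => events.push("freed after error"));
  throw new Error(`write failed for ${bytes.data.length} bytes`);
}

persist();
try {
  persistAndFail();
} catch {
  events.push("caught");
}
```

Disposal runs before the error reaches the caller, which is the same guarantee the explicit `try`/`finally` version provides, with less bookkeeping.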

/**
* Marks the buffer as free.
*/
free(key: number): void {
Member

I was wondering why we need to track a currentKey at all? But I guess we need to handle cases where two calls to getOrReload can happen at the same time?

Contributor Author

Yes, at least for now we call it in two flows: to serialize the state at every epoch, and to serialize validators when reloading a state. This is mainly to avoid memory allocation; we may also end up with an optimized way to serialize validators, as in serializeState.test.ts
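
To make the role of the key concrete, here is a minimal sketch of a pool that serves one caller at a time and ignores `free()` calls carrying a stale key. It is an illustration under assumptions: `SingleBufferPool` and the `{data, key}` return shape are hypothetical, and the real class in this PR may differ.

```typescript
// Hypothetical sketch; class name and return shape are illustrative only.
class SingleBufferPool {
  private readonly buffer: Uint8Array;
  private inUse = false;
  private currentKey = 0;

  constructor(size: number) {
    this.buffer = new Uint8Array(size);
  }

  /** Hand out the buffer with a fresh key, or null if another flow still holds it. */
  alloc(size: number): {data: Uint8Array; key: number} | null {
    if (this.inUse || size > this.buffer.length) {
      return null;
    }
    this.inUse = true;
    this.currentKey += 1;
    return {data: this.buffer.subarray(0, size), key: this.currentKey};
  }

  /** Marks the buffer as free, but only for the holder of the current key. */
  free(key: number): void {
    if (key === this.currentKey) {
      this.inUse = false;
    }
  }
}

const keyedPool = new SingleBufferPool(64);
const epochFlow = keyedPool.alloc(32); // first flow gets the buffer, key 1
const reloadFlow = keyedPool.alloc(16); // second flow arrives while it is held
keyedPool.free(epochFlow!.key); // first flow releases
const nextFlow = keyedPool.alloc(16); // buffer is handed out again, key 2
keyedPool.free(epochFlow!.key); // stale key: ignored, buffer stays in use
const blockedFlow = keyedPool.alloc(8); // still held by nextFlow
```

A caller that receives `null` would fall back to a regular allocation, so overlapping flows stay correct and only lose the reuse benefit.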

Comment on lines 19 to 20
// for test only
processLateBlock?: boolean;
Member

Suggested change
- // for test only
- processLateBlock?: boolean;
+ /** for testing only */
+ processLateBlock?: boolean;

import {CheckpointHex, CacheItemType, CheckpointStateCache} from "./types.js";

export type PersistentCheckpointStateCacheOpts = {
// Keep max n states in memory, persist the rest to disk
Member

Suggested change
- // Keep max n states in memory, persist the rest to disk
+ /** Keep max n states in memory, persist the rest to disk */

Member

wemeetagain left a comment

lgtm

@wemeetagain wemeetagain merged commit 8cc5f04 into unstable Jan 18, 2024
13 of 15 checks passed
@wemeetagain wemeetagain deleted the tuyen/persistent_cp_state_cache_buffer_pool branch January 18, 2024 16:21
Contributor Author

twoeths commented Jan 19, 2024

thanks @nazarhussain for the vitest fix ❤️

ensi321 pushed a commit to ensi321/lodestar that referenced this pull request Jan 22, 2024
* feat: implement BufferPool for PersistentCPStateCache

* fix: alloc vs allocUnsafe for BufferPool

* chore: conform to style guide

* feat: use using with Disposable object

* Add custom build target for beacon-node unit tests

* chore: address PR comments

---------

Co-authored-by: Nazar Hussain <nazarhussain@gmail.com>
@wemeetagain
Member

🎉 This PR is included in v1.15.0 🎉

3 participants