fix: improve digest64 for as-sha256 #347

Closed
twoeths wants to merge 9 commits

Conversation

twoeths (Contributor) commented Feb 28, 2024

Motivation

Improve digest64

Description

  • Use the latest AssemblyScript, which yields better performance
  • Chain the w computation into the main loop in hashBlocks()
  • Return the result using Uint8Array.slice()
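
To illustrate the second and third bullets, here is a minimal sketch in plain TypeScript, not the PR's AssemblyScript code. It assumes "chaining the w computation" means deriving each message-schedule word inside the 64-round loop instead of in a separate expansion pass. The names `hashBlock`, `digest`, and `scratchOut` are illustrative and do not come from the repository.

```typescript
// SHA-256 round constants and initial state (FIPS 180-4).
const K = new Uint32Array([
  0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
  0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
  0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
  0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
  0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
  0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
  0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
  0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
]);
const H0 = [
  0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
];

const rotr = (x: number, n: number): number => (x >>> n) | (x << (32 - n));

// Compress one 64-byte block; w[i] is produced inside the round loop itself.
function hashBlock(h: Uint32Array, block: Uint8Array, w: Uint32Array): void {
  let a = h[0], b = h[1], c = h[2], d = h[3], e = h[4], f = h[5], g = h[6], hh = h[7];
  for (let i = 0; i < 64; i++) {
    if (i < 16) {
      // The first 16 words are the block itself, read big-endian.
      const j = i * 4;
      w[i] = (block[j] << 24) | (block[j + 1] << 16) | (block[j + 2] << 8) | block[j + 3];
    } else {
      // The remaining words are derived on the fly from earlier ones.
      const s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >>> 3);
      const s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >>> 10);
      w[i] = w[i - 16] + s0 + w[i - 7] + s1; // Uint32Array wraps mod 2^32
    }
    const t1 = (hh + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)) + ((e & f) ^ (~e & g)) + K[i] + w[i]) | 0;
    const t2 = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)) + ((a & b) ^ (a & c) ^ (b & c))) | 0;
    hh = g; g = f; f = e; e = (d + t1) | 0;
    d = c; c = b; b = a; a = (t1 + t2) | 0;
  }
  h[0] += a; h[1] += b; h[2] += c; h[3] += d;
  h[4] += e; h[5] += f; h[6] += g; h[7] += hh;
}

// Output buffer reused across calls (stands in for shared wasm memory).
const scratchOut = new Uint8Array(32);

function digest(data: Uint8Array): Uint8Array {
  const h = new Uint32Array(H0);
  const w = new Uint32Array(64);
  // Pad: 0x80, zeros, then the 64-bit big-endian bit length.
  const paddedLen = Math.ceil((data.length + 9) / 64) * 64;
  const padded = new Uint8Array(paddedLen);
  padded.set(data);
  padded[data.length] = 0x80;
  const bitLen = data.length * 8;
  const view = new DataView(padded.buffer);
  view.setUint32(paddedLen - 8, Math.floor(bitLen / 0x100000000));
  view.setUint32(paddedLen - 4, bitLen >>> 0);
  for (let off = 0; off < paddedLen; off += 64) {
    hashBlock(h, padded.subarray(off, off + 64), w);
  }
  const outView = new DataView(scratchOut.buffer);
  for (let i = 0; i < 8; i++) outView.setUint32(i * 4, h[i]);
  // slice() copies, so the caller's result survives the next call.
  return scratchOut.slice(0, 32);
}
```

Because `slice()` returns a copy rather than a view, a digest returned by one call is not overwritten when `scratchOut` is reused by the next call; `subarray()` would instead return a view into the shared buffer.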

Before:

digestTwoHashObjects vs digest64 vs digest
    ✓ digestTwoHashObjects 50023 times                                    27.71683 ops/s    36.07916 ms/op   x1.042        264 runs   10.1 s
    ✓ digest64 50023 times                                                26.85761 ops/s    37.23339 ms/op   x1.020        256 runs   10.1 s
    ✓ digest 50023 times                                                  27.02430 ops/s    37.00373 ms/op   x1.010        259 runs   10.1 s

After:

digestTwoHashObjects vs digest64 vs digest
    ✓ digestTwoHashObjects 50023 times                                    30.54306 ops/s    32.74066 ms/op   x0.946        293 runs   10.1 s
    ✓ digest64 50023 times                                                28.28273 ops/s    35.35727 ms/op   x0.968        271 runs   10.1 s
    ✓ digest 50023 times                                                  27.93469 ops/s    35.79778 ms/op   x0.977        268 runs   10.1 s
  

digestTwoHashObjects is ~10% faster and digest64 is ~5% faster
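
The percentages above can be reproduced from the ops/s columns of the two local runs. The helper name `speedupPercent` is made up for this illustration.

```typescript
// Relative throughput gain, in percent, from a before/after ops/s pair.
function speedupPercent(beforeOpsPerSec: number, afterOpsPerSec: number): number {
  return (afterOpsPerSec / beforeOpsPerSec - 1) * 100;
}

// ops/s values copied from the "Before" and "After" runs above.
const gains = {
  digestTwoHashObjects: speedupPercent(27.71683, 30.54306), // ≈ 10.2
  digest64: speedupPercent(26.85761, 28.28273), // ≈ 5.3
  digest: speedupPercent(27.0243, 27.93469), // ≈ 3.4
};

console.log(gains);
```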

github-actions bot commented Feb 28, 2024

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: c98ffc3 Previous: 6220d32 Ratio
digestTwoHashObjects 50023 times 50.422 ms/op 47.451 ms/op 1.06
digest64 50023 times 52.434 ms/op 48.513 ms/op 1.08
digest 50023 times 56.875 ms/op 49.148 ms/op 1.16
input length 32 1.4860 us/op 1.1400 us/op 1.30
input length 64 1.6190 us/op 1.2700 us/op 1.27
input length 128 2.6960 us/op 2.2140 us/op 1.22
input length 256 3.9120 us/op 3.3030 us/op 1.18
input length 512 6.3690 us/op 5.4920 us/op 1.16
input length 1024 12.181 us/op 10.835 us/op 1.12
digest 1000000 times 915.35 ms/op 777.08 ms/op 1.18
hashObjectToByteArray 50023 times 1.4298 ms/op 1.4645 ms/op 0.98
byteArrayToHashObject 50023 times 3.5234 ms/op 1.6627 ms/op 2.12
getGindicesAtDepth 4.9400 us/op 3.9520 us/op 1.25
iterateAtDepth 10.341 us/op 8.4810 us/op 1.22
getGindexBits 550.00 ns/op 423.00 ns/op 1.30
gindexIterator 1.2260 us/op 952.00 ns/op 1.29
hash 2 Uint8Array 2250026 times - as-sha256 2.4520 s/op 2.2238 s/op 1.10
hashTwoObjects 2250026 times - as-sha256 2.3468 s/op 2.1564 s/op 1.09
hash 2 Uint8Array 2250026 times - noble 5.7821 s/op 4.6188 s/op 1.25
hashTwoObjects 2250026 times - noble 6.7199 s/op 6.7975 s/op 0.99
getNodeH() x7812.5 avg hindex 15.143 us/op 14.548 us/op 1.04
getNodeH() x7812.5 index 0 5.1000 us/op 5.1490 us/op 0.99
getNodeH() x7812.5 index 7 5.0850 us/op 5.1510 us/op 0.99
getNodeH() x7812.5 index 7 with key array 5.1250 us/op 5.0550 us/op 1.01
new LeafNode() x7812.5 185.96 us/op 113.37 us/op 1.64
multiproof - depth 15, 1 requested leaves 10.461 us/op 9.4680 us/op 1.10
tree offset multiproof - depth 15, 1 requested leaves 21.004 us/op 20.621 us/op 1.02
compact multiproof - depth 15, 1 requested leaves 5.4010 us/op 5.5070 us/op 0.98
multiproof - depth 15, 2 requested leaves 13.529 us/op 12.922 us/op 1.05
tree offset multiproof - depth 15, 2 requested leaves 24.425 us/op 23.147 us/op 1.06
compact multiproof - depth 15, 2 requested leaves 3.3670 us/op 3.3580 us/op 1.00
multiproof - depth 15, 3 requested leaves 18.887 us/op 17.866 us/op 1.06
tree offset multiproof - depth 15, 3 requested leaves 32.142 us/op 30.400 us/op 1.06
compact multiproof - depth 15, 3 requested leaves 5.5470 us/op 4.6320 us/op 1.20
multiproof - depth 15, 4 requested leaves 23.653 us/op 23.828 us/op 0.99
tree offset multiproof - depth 15, 4 requested leaves 36.790 us/op 38.089 us/op 0.97
compact multiproof - depth 15, 4 requested leaves 5.3340 us/op 5.2770 us/op 1.01
packedRootsBytesToLeafNodes bytes 4000 offset 0 1.9700 us/op 1.9990 us/op 0.99
packedRootsBytesToLeafNodes bytes 4000 offset 1 1.9630 us/op 2.0010 us/op 0.98
packedRootsBytesToLeafNodes bytes 4000 offset 2 1.9750 us/op 2.0090 us/op 0.98
packedRootsBytesToLeafNodes bytes 4000 offset 3 1.9600 us/op 1.9920 us/op 0.98
subtreeFillToContents depth 40 count 250000 44.369 ms/op 43.686 ms/op 1.02
setRoot - gindexBitstring 7.7733 ms/op 8.7390 ms/op 0.89
setRoot - gindex 7.9447 ms/op 9.4315 ms/op 0.84
getRoot - gindexBitstring 2.4620 ms/op 2.4095 ms/op 1.02
getRoot - gindex 3.0494 ms/op 3.2480 ms/op 0.94
getHashObject then setHashObject 8.9442 ms/op 10.367 ms/op 0.86
setNodeWithFn 7.7888 ms/op 9.1894 ms/op 0.85
getNodeAtDepth depth 0 x100000 1.1449 ms/op 1.1471 ms/op 1.00
setNodeAtDepth depth 0 x100000 2.3224 ms/op 2.7974 ms/op 0.83
getNodesAtDepth depth 0 x100000 1.0851 ms/op 1.0834 ms/op 1.00
setNodesAtDepth depth 0 x100000 1.4850 ms/op 1.4935 ms/op 0.99
getNodeAtDepth depth 1 x100000 1.2058 ms/op 1.2074 ms/op 1.00
setNodeAtDepth depth 1 x100000 5.0174 ms/op 5.7980 ms/op 0.87
getNodesAtDepth depth 1 x100000 1.2103 ms/op 1.2088 ms/op 1.00
setNodesAtDepth depth 1 x100000 4.2492 ms/op 4.6802 ms/op 0.91
getNodeAtDepth depth 2 x100000 1.4841 ms/op 1.4852 ms/op 1.00
setNodeAtDepth depth 2 x100000 8.7215 ms/op 10.474 ms/op 0.83
getNodesAtDepth depth 2 x100000 17.382 ms/op 20.510 ms/op 0.85
setNodesAtDepth depth 2 x100000 12.650 ms/op 13.700 ms/op 0.92
tree.getNodesAtDepth - gindexes 5.2689 ms/op 6.0009 ms/op 0.88
tree.getNodesAtDepth - push all nodes 1.7729 ms/op 2.2066 ms/op 0.80
tree.getNodesAtDepth - navigation 155.36 us/op 157.91 us/op 0.98
tree.setNodesAtDepth - indexes 301.87 us/op 374.22 us/op 0.81
set at depth 8 470.00 ns/op 514.00 ns/op 0.91
set at depth 16 611.00 ns/op 664.00 ns/op 0.92
set at depth 32 929.00 ns/op 1.0370 us/op 0.90
iterateNodesAtDepth 8 256 13.531 us/op 13.906 us/op 0.97
getNodesAtDepth 8 256 3.3180 us/op 3.3930 us/op 0.98
iterateNodesAtDepth 16 65536 4.1928 ms/op 4.2306 ms/op 0.99
getNodesAtDepth 16 65536 1.5660 ms/op 2.0146 ms/op 0.78
iterateNodesAtDepth 32 250000 15.631 ms/op 16.975 ms/op 0.92
getNodesAtDepth 32 250000 4.2045 ms/op 4.2259 ms/op 0.99
iterateNodesAtDepth 40 250000 15.293 ms/op 14.758 ms/op 1.04
getNodesAtDepth 40 250000 4.2270 ms/op 4.2287 ms/op 1.00
250k validators 6.8209 s/op 7.0207 s/op 0.97
bitlist bytes to struct (120,90) 603.00 ns/op 576.00 ns/op 1.05
bitlist bytes to tree (120,90) 2.3540 us/op 2.2300 us/op 1.06
bitlist bytes to struct (2048,2048) 1.0210 us/op 998.00 ns/op 1.02
bitlist bytes to tree (2048,2048) 3.6340 us/op 3.4800 us/op 1.04
ByteListType - deserialize 8.1662 ms/op 8.1587 ms/op 1.00
BasicListType - deserialize 7.6215 ms/op 7.8204 ms/op 0.97
ByteListType - serialize 8.2848 ms/op 7.4434 ms/op 1.11
BasicListType - serialize 9.8041 ms/op 9.9380 ms/op 0.99
BasicListType - tree_convertToStruct 21.497 ms/op 21.339 ms/op 1.01
List[uint8, 68719476736] len 300000 ViewDU.getAll() + iterate 4.0829 ms/op 4.2130 ms/op 0.97
List[uint8, 68719476736] len 300000 ViewDU.get(i) 4.1790 ms/op 4.3100 ms/op 0.97
Array.push len 300000 empty Array - number 6.0675 ms/op 6.2098 ms/op 0.98
Array.set len 300000 from new Array - number 1.6174 ms/op 1.6359 ms/op 0.99
Array.set len 300000 - number 5.1398 ms/op 5.0942 ms/op 1.01
Uint8Array.set len 300000 203.55 us/op 208.02 us/op 0.98
Uint32Array.set len 300000 273.95 us/op 291.01 us/op 0.94
Container({a: uint8, b: uint8}) getViewDU x300000 19.551 ms/op 19.737 ms/op 0.99
ContainerNodeStruct({a: uint8, b: uint8}) getViewDU x300000 9.3055 ms/op 9.3814 ms/op 0.99
List(Container) len 300000 ViewDU.getAllReadonly() + iterate 198.82 ms/op 246.18 ms/op 0.81
List(Container) len 300000 ViewDU.getAllReadonlyValues() + iterate 280.86 ms/op 293.67 ms/op 0.96
List(Container) len 300000 ViewDU.get(i) 6.3686 ms/op 6.6744 ms/op 0.95
List(Container) len 300000 ViewDU.getReadonly(i) 6.1368 ms/op 6.5862 ms/op 0.93
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonly() + iterate 36.704 ms/op 38.008 ms/op 0.97
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonlyValues() + iterate 5.1006 ms/op 5.1821 ms/op 0.98
List(ContainerNodeStruct) len 300000 ViewDU.get(i) 6.0304 ms/op 6.1416 ms/op 0.98
List(ContainerNodeStruct) len 300000 ViewDU.getReadonly(i) 5.9073 ms/op 5.9534 ms/op 0.99
Array.push len 300000 empty Array - object 5.7186 ms/op 5.8513 ms/op 0.98
Array.set len 300000 from new Array - object 1.9168 ms/op 1.9138 ms/op 1.00
Array.set len 300000 - object 5.4213 ms/op 5.7212 ms/op 0.95
cachePermanentRootStruct no cache 9.1910 us/op 8.9470 us/op 1.03
cachePermanentRootStruct with cache 218.00 ns/op 217.00 ns/op 1.00
epochParticipation len 250000 rws 7813 2.1996 ms/op 2.3121 ms/op 0.95
deserialize Attestation - tree 2.9210 us/op 2.8750 us/op 1.02
deserialize Attestation - struct 1.9580 us/op 1.9240 us/op 1.02
deserialize SignedAggregateAndProof - tree 3.6560 us/op 3.6870 us/op 0.99
deserialize SignedAggregateAndProof - struct 2.9860 us/op 2.9650 us/op 1.01
deserialize SyncCommitteeMessage - tree 1.1020 us/op 1.1820 us/op 0.93
deserialize SyncCommitteeMessage - struct 1.1780 us/op 1.2800 us/op 0.92
deserialize SignedContributionAndProof - tree 1.9370 us/op 1.9150 us/op 1.01
deserialize SignedContributionAndProof - struct 2.4380 us/op 2.4950 us/op 0.98
deserialize SignedBeaconBlock - tree 208.59 us/op 218.40 us/op 0.96
deserialize SignedBeaconBlock - struct 124.47 us/op 125.55 us/op 0.99
BeaconState vc 300000 - deserialize tree 541.71 ms/op 647.97 ms/op 0.84
BeaconState vc 300000 - serialize tree 116.93 ms/op 142.22 ms/op 0.82
BeaconState.historicalRoots vc 300000 - deserialize tree 861.00 ns/op 826.00 ns/op 1.04
BeaconState.historicalRoots vc 300000 - serialize tree 828.00 ns/op 800.00 ns/op 1.03
BeaconState.validators vc 300000 - deserialize tree 494.51 ms/op 646.73 ms/op 0.76
BeaconState.validators vc 300000 - serialize tree 120.82 ms/op 133.13 ms/op 0.91
BeaconState.balances vc 300000 - deserialize tree 19.744 ms/op 23.300 ms/op 0.85
BeaconState.balances vc 300000 - serialize tree 3.1007 ms/op 3.2543 ms/op 0.95
BeaconState.previousEpochParticipation vc 300000 - deserialize tree 375.95 us/op 398.81 us/op 0.94
BeaconState.previousEpochParticipation vc 300000 - serialize tree 264.03 us/op 265.41 us/op 0.99
BeaconState.currentEpochParticipation vc 300000 - deserialize tree 369.74 us/op 403.44 us/op 0.92
BeaconState.currentEpochParticipation vc 300000 - serialize tree 261.99 us/op 267.45 us/op 0.98
BeaconState.inactivityScores vc 300000 - deserialize tree 20.539 ms/op 26.134 ms/op 0.79
BeaconState.inactivityScores vc 300000 - serialize tree 3.2488 ms/op 2.7084 ms/op 1.20
hashTreeRoot Attestation - struct 32.321 us/op 28.230 us/op 1.14
hashTreeRoot Attestation - tree 21.854 us/op 18.223 us/op 1.20
hashTreeRoot SignedAggregateAndProof - struct 43.954 us/op 38.544 us/op 1.14
hashTreeRoot SignedAggregateAndProof - tree 29.687 us/op 27.823 us/op 1.07
hashTreeRoot SyncCommitteeMessage - struct 10.439 us/op 9.1740 us/op 1.14
hashTreeRoot SyncCommitteeMessage - tree 6.5740 us/op 6.2360 us/op 1.05
hashTreeRoot SignedContributionAndProof - struct 29.414 us/op 26.322 us/op 1.12
hashTreeRoot SignedContributionAndProof - tree 21.208 us/op 19.913 us/op 1.07
hashTreeRoot SignedBeaconBlock - struct 2.4260 ms/op 2.4007 ms/op 1.01
hashTreeRoot SignedBeaconBlock - tree 1.7714 ms/op 1.6938 ms/op 1.05
hashTreeRoot Validator - struct 13.179 us/op 13.052 us/op 1.01
hashTreeRoot Validator - tree 11.492 us/op 11.057 us/op 1.04
BeaconState vc 300000 - hashTreeRoot tree 3.7534 s/op 3.6480 s/op 1.03
BeaconState.historicalRoots vc 300000 - hashTreeRoot tree 1.5020 us/op 1.5200 us/op 0.99
BeaconState.validators vc 300000 - hashTreeRoot tree 3.5881 s/op 3.5204 s/op 1.02
BeaconState.balances vc 300000 - hashTreeRoot tree 90.885 ms/op 90.631 ms/op 1.00
BeaconState.previousEpochParticipation vc 300000 - hashTreeRoot tree 9.5870 ms/op 9.3016 ms/op 1.03
BeaconState.currentEpochParticipation vc 300000 - hashTreeRoot tree 9.6064 ms/op 9.0244 ms/op 1.06
BeaconState.inactivityScores vc 300000 - hashTreeRoot tree 85.335 ms/op 86.049 ms/op 0.99
hash64 x18 20.566 us/op 19.179 us/op 1.07
hashTwoObjects x18 18.860 us/op 17.936 us/op 1.05
hash64 x1740 1.9647 ms/op 1.8306 ms/op 1.07
hashTwoObjects x1740 1.8020 ms/op 1.7038 ms/op 1.06
hash64 x2700000 3.0170 s/op 2.8221 s/op 1.07
hashTwoObjects x2700000 2.7907 s/op 2.6427 s/op 1.06
get_exitEpoch - ContainerType 224.00 ns/op 212.00 ns/op 1.06
get_exitEpoch - ContainerNodeStructType 212.00 ns/op 207.00 ns/op 1.02
set_exitEpoch - ContainerType 246.00 ns/op 250.00 ns/op 0.98
set_exitEpoch - ContainerNodeStructType 219.00 ns/op 206.00 ns/op 1.06
get_pubkey - ContainerType 1.0230 us/op 959.00 ns/op 1.07
get_pubkey - ContainerNodeStructType 222.00 ns/op 212.00 ns/op 1.05
hashTreeRoot - ContainerType 391.00 ns/op 355.00 ns/op 1.10
hashTreeRoot - ContainerNodeStructType 428.00 ns/op 402.00 ns/op 1.06
createProof - ContainerType 3.8900 us/op 3.8640 us/op 1.01
createProof - ContainerNodeStructType 21.044 us/op 20.811 us/op 1.01
serialize - ContainerType 1.9690 us/op 1.9330 us/op 1.02
serialize - ContainerNodeStructType 1.5290 us/op 1.4700 us/op 1.04
set_exitEpoch_and_hashTreeRoot - ContainerType 4.2110 us/op 4.2160 us/op 1.00
set_exitEpoch_and_hashTreeRoot - ContainerNodeStructType 11.669 us/op 11.253 us/op 1.04
Array - for of 5.0560 us/op 5.0570 us/op 1.00
Array - for(;;) 4.3090 us/op 4.3590 us/op 0.99
basicListValue.readonlyValuesArray() 3.7785 ms/op 3.6135 ms/op 1.05
basicListValue.readonlyValuesArray() + loop all 3.8469 ms/op 3.6827 ms/op 1.04
compositeListValue.readonlyValuesArray() 28.718 ms/op 27.018 ms/op 1.06
compositeListValue.readonlyValuesArray() + loop all 24.882 ms/op 24.661 ms/op 1.01
Number64UintType - get balances list 4.0450 ms/op 4.2458 ms/op 0.95
Number64UintType - set balances list 10.964 ms/op 10.992 ms/op 1.00
Number64UintType - get and increase 10 then set 39.201 ms/op 36.256 ms/op 1.08
Number64UintType - increase 10 using applyDelta 16.489 ms/op 15.966 ms/op 1.03
Number64UintType - increase 10 using applyDeltaInBatch 16.189 ms/op 16.781 ms/op 0.96
tree_newTreeFromUint64Deltas 15.819 ms/op 17.078 ms/op 0.93
unsafeUint8ArrayToTree 28.508 ms/op 32.144 ms/op 0.89
bitLength(50) 232.00 ns/op 225.00 ns/op 1.03
bitLengthStr(50) 241.00 ns/op 238.00 ns/op 1.01
bitLength(8000) 225.00 ns/op 229.00 ns/op 0.98
bitLengthStr(8000) 281.00 ns/op 284.00 ns/op 0.99
bitLength(250000) 222.00 ns/op 219.00 ns/op 1.01
bitLengthStr(250000) 315.00 ns/op 311.00 ns/op 1.01
floor - Math.floor (53) 0.46426 ns/op 0.46415 ns/op 1.00
floor - << 0 (53) 0.47021 ns/op 0.46692 ns/op 1.01
floor - Math.floor (512) 0.46411 ns/op 0.46665 ns/op 0.99
floor - << 0 (512) 0.47051 ns/op 0.47120 ns/op 1.00
fnIf(0) 1.5460 ns/op 1.5463 ns/op 1.00
fnSwitch(0) 2.5146 ns/op 2.5137 ns/op 1.00
fnObj(0) 0.46456 ns/op 0.46444 ns/op 1.00
fnArr(0) 0.47123 ns/op 0.46435 ns/op 1.01
fnIf(4) 2.1634 ns/op 2.1653 ns/op 1.00
fnSwitch(4) 2.4749 ns/op 2.4724 ns/op 1.00
fnObj(4) 0.46528 ns/op 0.46471 ns/op 1.00
fnArr(4) 0.46381 ns/op 0.46520 ns/op 1.00
fnIf(9) 3.0936 ns/op 3.0949 ns/op 1.00
fnSwitch(9) 2.4735 ns/op 2.4881 ns/op 0.99
fnObj(9) 0.46458 ns/op 0.46454 ns/op 1.00
fnArr(9) 0.46460 ns/op 0.46414 ns/op 1.00
Container {a,b,vec} - as struct x100000 46.589 us/op 47.184 us/op 0.99
Container {a,b,vec} - as tree x100000 372.67 us/op 371.42 us/op 1.00
Container {a,vec,b} - as struct x100000 77.566 us/op 77.488 us/op 1.00
Container {a,vec,b} - as tree x100000 402.12 us/op 402.42 us/op 1.00
get 2 props x1000000 - rawObject 309.45 us/op 310.21 us/op 1.00
get 2 props x1000000 - proxy 72.112 ms/op 74.359 ms/op 0.97
get 2 props x1000000 - customObj 309.62 us/op 309.78 us/op 1.00
Simple object binary -> struct 738.00 ns/op 613.00 ns/op 1.20
Simple object binary -> tree_backed 1.9920 us/op 1.6940 us/op 1.18
Simple object struct -> tree_backed 2.6050 us/op 2.1200 us/op 1.23
Simple object tree_backed -> struct 2.1770 us/op 1.8700 us/op 1.16
Simple object struct -> binary 1.0740 us/op 913.00 ns/op 1.18
Simple object tree_backed -> binary 1.8160 us/op 1.6560 us/op 1.10
aggregationBits binary -> struct 706.00 ns/op 550.00 ns/op 1.28
aggregationBits binary -> tree_backed 2.6600 us/op 2.0700 us/op 1.29
aggregationBits struct -> tree_backed 3.0450 us/op 2.5230 us/op 1.21
aggregationBits tree_backed -> struct 1.3160 us/op 1.0490 us/op 1.25
aggregationBits struct -> binary 917.00 ns/op 754.00 ns/op 1.22
aggregationBits tree_backed -> binary 1.1560 us/op 973.00 ns/op 1.19
List(uint8) 100000 binary -> struct 1.3401 ms/op 1.2988 ms/op 1.03
List(uint8) 100000 binary -> tree_backed 86.537 us/op 86.831 us/op 1.00
List(uint8) 100000 struct -> tree_backed 1.3134 ms/op 1.3055 ms/op 1.01
List(uint8) 100000 tree_backed -> struct 914.10 us/op 953.73 us/op 0.96
List(uint8) 100000 struct -> binary 1.2212 ms/op 1.2184 ms/op 1.00
List(uint8) 100000 tree_backed -> binary 79.928 us/op 80.666 us/op 0.99
List(uint64Number) 100000 binary -> struct 1.1579 ms/op 1.2325 ms/op 0.94
List(uint64Number) 100000 binary -> tree_backed 2.9832 ms/op 3.7674 ms/op 0.79
List(uint64Number) 100000 struct -> tree_backed 4.5223 ms/op 5.0615 ms/op 0.89
List(uint64Number) 100000 tree_backed -> struct 2.0292 ms/op 2.1579 ms/op 0.94
List(uint64Number) 100000 struct -> binary 1.4061 ms/op 1.3691 ms/op 1.03
List(uint64Number) 100000 tree_backed -> binary 764.18 us/op 764.34 us/op 1.00
List(Uint64Bigint) 100000 binary -> struct 3.3298 ms/op 3.2282 ms/op 1.03
List(Uint64Bigint) 100000 binary -> tree_backed 3.0010 ms/op 3.8412 ms/op 0.78
List(Uint64Bigint) 100000 struct -> tree_backed 5.2933 ms/op 6.0559 ms/op 0.87
List(Uint64Bigint) 100000 tree_backed -> struct 4.1589 ms/op 4.0536 ms/op 1.03
List(Uint64Bigint) 100000 struct -> binary 2.0807 ms/op 2.0217 ms/op 1.03
List(Uint64Bigint) 100000 tree_backed -> binary 832.63 us/op 738.20 us/op 1.13
Vector(Root) 100000 binary -> struct 28.447 ms/op 33.636 ms/op 0.85
Vector(Root) 100000 binary -> tree_backed 23.861 ms/op 31.353 ms/op 0.76
Vector(Root) 100000 struct -> tree_backed 33.251 ms/op 41.874 ms/op 0.79
Vector(Root) 100000 tree_backed -> struct 41.762 ms/op 49.350 ms/op 0.85
Vector(Root) 100000 struct -> binary 1.8166 ms/op 1.9152 ms/op 0.95
Vector(Root) 100000 tree_backed -> binary 8.3545 ms/op 9.5106 ms/op 0.88
List(Validator) 100000 binary -> struct 100.14 ms/op 135.38 ms/op 0.74
List(Validator) 100000 binary -> tree_backed 249.34 ms/op 348.85 ms/op 0.71
List(Validator) 100000 struct -> tree_backed 288.63 ms/op 359.65 ms/op 0.80
List(Validator) 100000 tree_backed -> struct 189.17 ms/op 219.65 ms/op 0.86
List(Validator) 100000 struct -> binary 33.099 ms/op 30.378 ms/op 1.09
List(Validator) 100000 tree_backed -> binary 92.889 ms/op 104.90 ms/op 0.89
List(Validator-NS) 100000 binary -> struct 91.057 ms/op 119.72 ms/op 0.76
List(Validator-NS) 100000 binary -> tree_backed 145.08 ms/op 180.69 ms/op 0.80
List(Validator-NS) 100000 struct -> tree_backed 184.59 ms/op 211.23 ms/op 0.87
List(Validator-NS) 100000 tree_backed -> struct 148.78 ms/op 174.34 ms/op 0.85
List(Validator-NS) 100000 struct -> binary 34.233 ms/op 30.620 ms/op 1.12
List(Validator-NS) 100000 tree_backed -> binary 36.547 ms/op 35.677 ms/op 1.02
get epochStatuses - MutableVector 91.102 us/op 90.831 us/op 1.00
get epochStatuses - ViewDU 198.17 us/op 196.06 us/op 1.01
set epochStatuses - ListTreeView 1.3816 ms/op 1.4870 ms/op 0.93
set epochStatuses - ListTreeView - set() 445.26 us/op 428.31 us/op 1.04
set epochStatuses - ListTreeView - commit() 400.52 us/op 399.22 us/op 1.00
bitstring 638.90 ns/op 645.20 ns/op 0.99
bit mask 13.575 ns/op 13.595 ns/op 1.00
struct - increase slot to 1000000 927.68 us/op 928.22 us/op 1.00
UintNumberType - increase slot to 1000000 28.494 ms/op 28.482 ms/op 1.00
UintBigintType - increase slot to 1000000 396.51 ms/op 436.52 ms/op 0.91
UintBigint8 x 100000 tree_deserialize 5.0856 ms/op 4.7991 ms/op 1.06
UintBigint8 x 100000 tree_serialize 1.1865 ms/op 1.1908 ms/op 1.00
UintBigint16 x 100000 tree_deserialize 4.6953 ms/op 4.8965 ms/op 0.96
UintBigint16 x 100000 tree_serialize 1.1271 ms/op 1.2277 ms/op 0.92
UintBigint32 x 100000 tree_deserialize 4.7340 ms/op 4.9104 ms/op 0.96
UintBigint32 x 100000 tree_serialize 1.1904 ms/op 1.3207 ms/op 0.90
UintBigint64 x 100000 tree_deserialize 5.4564 ms/op 5.5974 ms/op 0.97
UintBigint64 x 100000 tree_serialize 1.5585 ms/op 1.6834 ms/op 0.93
UintBigint8 x 100000 value_deserialize 433.67 us/op 433.14 us/op 1.00
UintBigint8 x 100000 value_serialize 566.32 us/op 624.50 us/op 0.91
UintBigint16 x 100000 value_deserialize 464.10 us/op 469.52 us/op 0.99
UintBigint16 x 100000 value_serialize 602.98 us/op 665.25 us/op 0.91
UintBigint32 x 100000 value_deserialize 433.55 us/op 436.70 us/op 0.99
UintBigint32 x 100000 value_serialize 602.85 us/op 679.50 us/op 0.89
UintBigint64 x 100000 value_deserialize 468.03 us/op 466.75 us/op 1.00
UintBigint64 x 100000 value_serialize 780.83 us/op 873.46 us/op 0.89
UintBigint8 x 100000 deserialize 4.6647 ms/op 5.0061 ms/op 0.93
UintBigint8 x 100000 serialize 1.3983 ms/op 1.5161 ms/op 0.92
UintBigint16 x 100000 deserialize 4.5283 ms/op 4.9129 ms/op 0.92
UintBigint16 x 100000 serialize 1.4281 ms/op 1.5124 ms/op 0.94
UintBigint32 x 100000 deserialize 5.2401 ms/op 5.6090 ms/op 0.93
UintBigint32 x 100000 serialize 2.7495 ms/op 2.9267 ms/op 0.94
UintBigint64 x 100000 deserialize 3.6058 ms/op 3.9061 ms/op 0.92
UintBigint64 x 100000 serialize 1.5021 ms/op 1.4994 ms/op 1.00
UintBigint128 x 100000 deserialize 5.6700 ms/op 5.9063 ms/op 0.96
UintBigint128 x 100000 serialize 17.095 ms/op 18.068 ms/op 0.95
UintBigint256 x 100000 deserialize 10.691 ms/op 11.348 ms/op 0.94
UintBigint256 x 100000 serialize 49.981 ms/op 52.134 ms/op 0.96
Slice from Uint8Array x25000 996.58 us/op 1.0121 ms/op 0.98
Slice from ArrayBuffer x25000 16.574 ms/op 16.961 ms/op 0.98
Slice from ArrayBuffer x25000 + new Uint8Array 16.708 ms/op 18.343 ms/op 0.91
Copy Uint8Array 100000 iterate 788.05 us/op 805.99 us/op 0.98
Copy Uint8Array 100000 slice 85.438 us/op 90.656 us/op 0.94
Copy Uint8Array 100000 Uint8Array.prototype.slice.call 85.216 us/op 90.636 us/op 0.94
Copy Buffer 100000 Uint8Array.prototype.slice.call 85.889 us/op 90.549 us/op 0.95
Copy Uint8Array 100000 slice + set 139.15 us/op 155.83 us/op 0.89
Copy Uint8Array 100000 subarray + set 85.598 us/op 90.799 us/op 0.94
Copy Uint8Array 100000 slice arrayBuffer 85.959 us/op 90.931 us/op 0.95
Uint64 deserialize 100000 - iterate Uint8Array 1.7066 ms/op 1.7466 ms/op 0.98
Uint64 deserialize 100000 - by Uint32A 1.6871 ms/op 1.7571 ms/op 0.96
Uint64 deserialize 100000 - by DataView.getUint32 x2 1.6879 ms/op 1.7582 ms/op 0.96
Uint64 deserialize 100000 - by DataView.getBigUint64 4.8040 ms/op 5.2691 ms/op 0.91
Uint64 deserialize 100000 - by byte 65.006 ms/op 42.976 ms/op 1.51

by benchmarkbot/action

@github-actions github-actions bot added the CI label Feb 28, 2024
@twoeths twoeths marked this pull request as ready for review February 28, 2024 07:58
@twoeths twoeths requested a review from a team as a code owner February 28, 2024 07:58
twoeths (Contributor, Author) commented Feb 28, 2024

The benchmark result in CI is not precise because each benchmark runs for only a very short period of time. I updated it to run for at least 10 s:

setBenchOpts({
  minMs: 10000,
});
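
As a rough illustration of why a minimum duration helps, the sketch below keeps sampling until a time budget is spent, so the average is taken over many runs instead of a handful. It is not the benchmark runner's implementation; `runForAtLeast` and `BenchResult` are hypothetical names, and only the `minMs` idea comes from the option shown above.

```typescript
interface BenchResult {
  runs: number; // how many times fn was executed
  avgMs: number; // mean wall-clock time per run
}

// Repeat fn until at least minMs of wall-clock time has elapsed.
function runForAtLeast(fn: () => void, minMs: number): BenchResult {
  const start = Date.now();
  let runs = 0;
  let elapsed = 0;
  do {
    fn();
    runs++;
    elapsed = Date.now() - start;
  } while (elapsed < minMs);
  return { runs, avgMs: elapsed / runs };
}

// Example: a tiny workload sampled for at least 20 ms.
const result = runForAtLeast(() => {
  let x = 0;
  for (let i = 0; i < 1000; i++) x += i;
}, 20);
console.log(result.runs, result.avgMs);
```

A longer `minMs` yields more runs per benchmark, which tends to reduce the influence of warm-up and scheduling noise on the reported average.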

jeluard left a comment

Do we need to store build artifacts in the repo?

packages/as-sha256/package.json (resolved)
packages/as-sha256/test/perf/index.test.ts (resolved)
packages/as-sha256/assembly/index.ts (outdated, resolved)
packages/as-sha256/assembly/index.ts (resolved)
twoeths (Contributor, Author) commented Feb 28, 2024

Do we need to store build artifacts in the repo?

yes, they are actually input for our typescript function every time we run, that's why they are tracked in git

},
"devDependencies": {
"@as-pect/assembly": "2.8.1",

Can the file as-pect.config.js be removed too?

jeluard commented Feb 28, 2024

Do we need to store build artifacts in the repo?

yes, they are actually input for our typescript function every time we run, that's why they are tracked in git

Is this what you are referring to? Looks like we only care about codegen being called to update https://github.com/ChainSafe/ssz/blob/master/packages/as-sha256/src/wasmCode.ts (as correctly done by this PR), but wasm/wat don't need to be in git?

Not part of this PR anyway, maybe something to be considered to improve the build process.
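
For context, a codegen step of this kind could look roughly like the sketch below: it turns compiled wasm bytes into TypeScript source that embeds them, so only the generated module needs to be tracked. This is an assumption about the general approach, not the repository's actual script; `generateWasmCodeModule` and the exact shape of the emitted file are hypothetical.

```typescript
// Produce TypeScript source that embeds the given wasm bytes as a Uint8Array.
function generateWasmCodeModule(wasmBytes: Uint8Array): string {
  const literal = Array.from(wasmBytes).join(",");
  return (
    "// This file is generated from the compiled wasm binary; do not edit by hand.\n" +
    `export const wasmCode = Uint8Array.from([${literal}]);\n`
  );
}

// Example input: the 8-byte wasm header ("\0asm" magic plus version 1).
const wasmHeader = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
const generatedSource = generateWasmCodeModule(wasmHeader);
console.log(generatedSource);
```

With a step like this in the build, the .wasm and .wat outputs become intermediate files that could be regenerated on demand.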

twoeths (Contributor, Author) commented Mar 1, 2024

I checked the performance on a separate server and got the same result as in my local environment.

master feat4 mainnet

digestTwoHashObjects vs digest64 vs digest
    ✓ digestTwoHashObjects 50023 times                                    19.25844 ops/s    51.92528 ms/op        -         67 runs   3.98 s
    ✓ digest64 50023 times                                                20.23583 ops/s    49.41730 ms/op        -         12 runs   1.11 s
    ✓ digest 50023 times                                                  20.31116 ops/s    49.23402 ms/op        -         13 runs   1.19 s

this branch

digestTwoHashObjects vs digest64 vs digest
    ✓ digestTwoHashObjects 50023 times                                    21.91978 ops/s    45.62089 ms/op        -       1304 runs   60.0 s
    ✓ digest64 50023 times                                                21.11148 ops/s    47.36759 ms/op        -       1256 runs   60.0 s
    ✓ digest 50023 times                                                  20.90464 ops/s    47.83628 ms/op        -       1244 runs   60.0 s

However, the benchmark result in CI consistently shows worse numbers, so this needs investigation. I may need to split this into smaller PRs to pin down the issue.

@twoeths twoeths marked this pull request as draft March 1, 2024 09:13
@twoeths twoeths closed this Mar 1, 2024
@twoeths twoeths deleted the tuyen/chain_w_computation branch May 25, 2024 01:30
2 participants