feat: implement IndexedGossipQueue #5803

Merged
merged 6 commits into from
Aug 14, 2023


twoeths
Contributor

@twoeths twoeths commented Jul 26, 2023

Motivation

Description

  • Extract a GossipQueue interface; the current implementation becomes LinearGossipQueue. This does not affect the node's current performance since no one is using the new gossip queue for now
  • Implement new IndexedGossipQueue
    • index items by indexFn using a map; in the case of attestations the key is the attestation data in base64
    • track keys that hold at least minChunkSize items
    • on next, pick the last key with at least minChunkSize items and pop up to maxChunkSize items
    • on delete, pick the 1st key in the map and delete the 1st item in the list
  • Implement some utility collections:
    • OrderedSet that's backed by LinkedList in order to get the first and last item
    • OrderedMap that's backed by OrderedSet in order to get first and last key/value
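
As an illustration of the utility collections listed above, an insertion-ordered set backed by a doubly linked list could be sketched as follows. This is not the PR's implementation: the names `OrderedSetSketch` and `ListNode`, and the extra `Map` used to find a node in O(1) on delete, are assumptions made for this example.

```typescript
// Hypothetical sketch of an insertion-ordered set with cheap first()/last() access.
type ListNode<T> = {value: T; prev: ListNode<T> | null; next: ListNode<T> | null};

class OrderedSetSketch<T> {
  private head: ListNode<T> | null = null;
  private tail: ListNode<T> | null = null;
  // Assumption: a side Map from value to node, so delete() does not walk the list
  private readonly nodes = new Map<T, ListNode<T>>();

  get size(): number {
    return this.nodes.size;
  }

  add(value: T): void {
    // Re-adding an existing value keeps its original position
    if (this.nodes.has(value)) return;
    const node: ListNode<T> = {value, prev: this.tail, next: null};
    if (this.tail) this.tail.next = node;
    else this.head = node;
    this.tail = node;
    this.nodes.set(value, node);
  }

  delete(value: T): boolean {
    const node = this.nodes.get(value);
    if (!node) return false;
    // Unlink the node, fixing head/tail when it sits at either end
    if (node.prev) node.prev.next = node.next;
    else this.head = node.next;
    if (node.next) node.next.prev = node.prev;
    else this.tail = node.prev;
    this.nodes.delete(value);
    return true;
  }

  first(): T | null {
    return this.head ? this.head.value : null;
  }

  last(): T | null {
    return this.tail ? this.tail.value : null;
  }
}
```

An OrderedMap along the lines described above could then keep its keys in such a set and its values in a plain `Map`, so that the first and last key/value are both reachable without iterating.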

part of #5416

@github-actions
Copy link
Contributor

github-actions bot commented Jul 26, 2023

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 4900853 Previous: 201dfc8 Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 509.36 us/op 736.46 us/op 0.69
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 80.874 us/op 76.952 us/op 1.05
BLS verify - blst-native 1.2169 ms/op 1.2060 ms/op 1.01
BLS verifyMultipleSignatures 3 - blst-native 2.4706 ms/op 2.4125 ms/op 1.02
BLS verifyMultipleSignatures 8 - blst-native 5.3045 ms/op 5.2197 ms/op 1.02
BLS verifyMultipleSignatures 32 - blst-native 19.124 ms/op 18.944 ms/op 1.01
BLS aggregatePubkeys 32 - blst-native 25.720 us/op 24.972 us/op 1.03
BLS aggregatePubkeys 128 - blst-native 101.35 us/op 97.634 us/op 1.04
getAttestationsForBlock 53.142 ms/op 50.705 ms/op 1.05
isKnown best case - 1 super set check 306.00 ns/op 281.00 ns/op 1.09
isKnown normal case - 2 super set checks 288.00 ns/op 279.00 ns/op 1.03
isKnown worse case - 16 super set checks 270.00 ns/op 273.00 ns/op 0.99
CheckpointStateCache - add get delete 5.5920 us/op 4.8780 us/op 1.15
validate api signedAggregateAndProof - struct 2.7885 ms/op 2.7777 ms/op 1.00
validate gossip signedAggregateAndProof - struct 2.8785 ms/op 2.7793 ms/op 1.04
validate api attestation - struct 1.3437 ms/op 1.2721 ms/op 1.06
validate gossip attestation - struct 1.3668 ms/op 1.3023 ms/op 1.05
pickEth1Vote - no votes 1.2384 ms/op 1.1581 ms/op 1.07
pickEth1Vote - max votes 11.443 ms/op 9.2487 ms/op 1.24
pickEth1Vote - Eth1Data hashTreeRoot value x2048 9.3019 ms/op 8.4711 ms/op 1.10
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 16.321 ms/op 13.962 ms/op 1.17
pickEth1Vote - Eth1Data fastSerialize value x2048 625.22 us/op 681.61 us/op 0.92
pickEth1Vote - Eth1Data fastSerialize tree x2048 5.2332 ms/op 6.8071 ms/op 0.77
bytes32 toHexString 565.00 ns/op 474.00 ns/op 1.19
bytes32 Buffer.toString(hex) 304.00 ns/op 296.00 ns/op 1.03
bytes32 Buffer.toString(hex) from Uint8Array 509.00 ns/op 420.00 ns/op 1.21
bytes32 Buffer.toString(hex) + 0x 311.00 ns/op 304.00 ns/op 1.02
Object access 1 prop 0.16200 ns/op 0.15700 ns/op 1.03
Map access 1 prop 0.15400 ns/op 0.15600 ns/op 0.99
Object get x1000 7.7490 ns/op 7.3120 ns/op 1.06
Map get x1000 0.64800 ns/op 0.59200 ns/op 1.09
Object set x1000 52.372 ns/op 46.762 ns/op 1.12
Map set x1000 40.008 ns/op 36.795 ns/op 1.09
Return object 10000 times 0.24070 ns/op 0.22370 ns/op 1.08
Throw Error 10000 times 3.8630 us/op 3.6596 us/op 1.06
fastMsgIdFn sha256 / 200 bytes 3.3410 us/op 3.1150 us/op 1.07
fastMsgIdFn h32 xxhash / 200 bytes 311.00 ns/op 267.00 ns/op 1.16
fastMsgIdFn h64 xxhash / 200 bytes 358.00 ns/op 334.00 ns/op 1.07
fastMsgIdFn sha256 / 1000 bytes 11.417 us/op 10.813 us/op 1.06
fastMsgIdFn h32 xxhash / 1000 bytes 431.00 ns/op 393.00 ns/op 1.10
fastMsgIdFn h64 xxhash / 1000 bytes 420.00 ns/op 408.00 ns/op 1.03
fastMsgIdFn sha256 / 10000 bytes 106.56 us/op 99.007 us/op 1.08
fastMsgIdFn h32 xxhash / 10000 bytes 1.9760 us/op 1.7950 us/op 1.10
fastMsgIdFn h64 xxhash / 10000 bytes 1.3230 us/op 1.2430 us/op 1.06
enrSubnets - fastDeserialize 64 bits 1.3290 us/op 1.1700 us/op 1.14
enrSubnets - ssz BitVector 64 bits 514.00 ns/op 407.00 ns/op 1.26
enrSubnets - fastDeserialize 4 bits 223.00 ns/op 166.00 ns/op 1.34
enrSubnets - ssz BitVector 4 bits 483.00 ns/op 407.00 ns/op 1.19
prioritizePeers score -10:0 att 32-0.1 sync 2-0 109.69 us/op 98.573 us/op 1.11
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 135.28 us/op 124.61 us/op 1.09
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 193.96 us/op 154.11 us/op 1.26
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 337.64 us/op 274.80 us/op 1.23
prioritizePeers score 0:0 att 64-1 sync 4-1 377.48 us/op 328.72 us/op 1.15
array of 16000 items push then shift 1.5951 us/op 1.5440 us/op 1.03
LinkedList of 16000 items push then shift 9.0590 ns/op 8.7310 ns/op 1.04
array of 16000 items push then pop 60.044 ns/op 57.392 ns/op 1.05
LinkedList of 16000 items push then pop 9.0210 ns/op 8.5090 ns/op 1.06
array of 24000 items push then shift 2.4445 us/op 2.3634 us/op 1.03
LinkedList of 24000 items push then shift 9.3800 ns/op 8.8740 ns/op 1.06
array of 24000 items push then pop 117.17 ns/op 106.45 ns/op 1.10
LinkedList of 24000 items push then pop 9.2120 ns/op 8.6820 ns/op 1.06
intersect bitArray bitLen 8 6.9770 ns/op 6.8440 ns/op 1.02
intersect array and set length 8 81.148 ns/op 61.103 ns/op 1.33
intersect bitArray bitLen 128 32.197 ns/op 32.333 ns/op 1.00
intersect array and set length 128 986.16 ns/op 804.44 ns/op 1.23
bitArray.getTrueBitIndexes() bitLen 128 1.5560 us/op 1.6500 us/op 0.94
bitArray.getTrueBitIndexes() bitLen 248 2.8000 us/op 2.6290 us/op 1.07
bitArray.getTrueBitIndexes() bitLen 512 5.7600 us/op 5.0300 us/op 1.15
Buffer.concat 32 items 1.1150 us/op 977.00 ns/op 1.14
Uint8Array.set 32 items 1.8140 us/op 1.7740 us/op 1.02
Set add up to 64 items then delete first 4.4101 us/op
OrderedSet add up to 64 items then delete first 5.5805 us/op
Set add up to 64 items then delete last 4.7735 us/op
OrderedSet add up to 64 items then delete last 6.2522 us/op
Set add up to 64 items then delete middle 4.8031 us/op
OrderedSet add up to 64 items then delete middle 7.5334 us/op
Set add up to 128 items then delete first 9.4072 us/op
OrderedSet add up to 128 items then delete first 12.477 us/op
Set add up to 128 items then delete last 9.7780 us/op
OrderedSet add up to 128 items then delete last 13.393 us/op
Set add up to 128 items then delete middle 9.6546 us/op
OrderedSet add up to 128 items then delete middle 17.914 us/op
Set add up to 256 items then delete first 19.113 us/op
OrderedSet add up to 256 items then delete first 24.016 us/op
Set add up to 256 items then delete last 19.604 us/op
OrderedSet add up to 256 items then delete last 26.339 us/op
Set add up to 256 items then delete middle 19.561 us/op
OrderedSet add up to 256 items then delete middle 47.043 us/op
transfer serialized Status (84 B) 1.8920 us/op 1.8300 us/op 1.03
copy serialized Status (84 B) 1.6210 us/op 1.5190 us/op 1.07
transfer serialized SignedVoluntaryExit (112 B) 2.0100 us/op 1.9490 us/op 1.03
copy serialized SignedVoluntaryExit (112 B) 1.6560 us/op 1.5780 us/op 1.05
transfer serialized ProposerSlashing (416 B) 2.5200 us/op 2.2210 us/op 1.13
copy serialized ProposerSlashing (416 B) 2.2390 us/op 1.7850 us/op 1.25
transfer serialized Attestation (485 B) 2.9380 us/op 1.9960 us/op 1.47
copy serialized Attestation (485 B) 2.1610 us/op 1.7470 us/op 1.24
transfer serialized AttesterSlashing (33232 B) 2.2670 us/op 2.0960 us/op 1.08
copy serialized AttesterSlashing (33232 B) 6.1290 us/op 4.8500 us/op 1.26
transfer serialized Small SignedBeaconBlock (128000 B) 2.9970 us/op 2.3500 us/op 1.28
copy serialized Small SignedBeaconBlock (128000 B) 17.305 us/op 13.614 us/op 1.27
transfer serialized Avg SignedBeaconBlock (200000 B) 3.4340 us/op 2.7240 us/op 1.26
copy serialized Avg SignedBeaconBlock (200000 B) 26.508 us/op 22.651 us/op 1.17
transfer serialized BlobsSidecar (524380 B) 3.4530 us/op 2.7100 us/op 1.27
copy serialized BlobsSidecar (524380 B) 158.30 us/op 123.40 us/op 1.28
transfer serialized Big SignedBeaconBlock (1000000 B) 4.3460 us/op 3.4040 us/op 1.28
copy serialized Big SignedBeaconBlock (1000000 B) 179.84 us/op 163.88 us/op 1.10
pass gossip attestations to forkchoice per slot 2.1714 ms/op 2.1666 ms/op 1.00
forkChoice updateHead vc 100000 bc 64 eq 0 2.1509 ms/op 2.1244 ms/op 1.01
forkChoice updateHead vc 600000 bc 64 eq 0 13.914 ms/op 14.343 ms/op 0.97
forkChoice updateHead vc 1000000 bc 64 eq 0 18.405 ms/op 23.352 ms/op 0.79
forkChoice updateHead vc 600000 bc 320 eq 0 18.512 ms/op 17.349 ms/op 1.07
forkChoice updateHead vc 600000 bc 1200 eq 0 87.638 ms/op 88.877 ms/op 0.99
forkChoice updateHead vc 600000 bc 64 eq 1000 19.131 ms/op 22.864 ms/op 0.84
forkChoice updateHead vc 600000 bc 64 eq 10000 21.093 ms/op 24.458 ms/op 0.86
forkChoice updateHead vc 600000 bc 64 eq 300000 28.443 ms/op 30.592 ms/op 0.93
computeDeltas 3.0130 ms/op 3.2318 ms/op 0.93
computeProposerBoostScoreFromBalances 385.89 us/op 392.91 us/op 0.98
altair processAttestation - 250000 vs - 7PWei normalcase 2.0808 ms/op 2.6320 ms/op 0.79
altair processAttestation - 250000 vs - 7PWei worstcase 3.1435 ms/op 4.2532 ms/op 0.74
altair processAttestation - setStatus - 1/6 committees join 234.40 us/op 196.44 us/op 1.19
altair processAttestation - setStatus - 1/3 committees join 411.94 us/op 355.23 us/op 1.16
altair processAttestation - setStatus - 1/2 committees join 612.17 us/op 475.73 us/op 1.29
altair processAttestation - setStatus - 2/3 committees join 716.88 us/op 605.79 us/op 1.18
altair processAttestation - setStatus - 4/5 committees join 1.0364 ms/op 853.38 us/op 1.21
altair processAttestation - setStatus - 100% committees join 1.1965 ms/op 997.52 us/op 1.20
altair processBlock - 250000 vs - 7PWei normalcase 10.417 ms/op 11.037 ms/op 0.94
altair processBlock - 250000 vs - 7PWei normalcase hashState 17.506 ms/op 18.173 ms/op 0.96
altair processBlock - 250000 vs - 7PWei worstcase 39.935 ms/op 39.992 ms/op 1.00
altair processBlock - 250000 vs - 7PWei worstcase hashState 59.509 ms/op 63.599 ms/op 0.94
phase0 processBlock - 250000 vs - 7PWei normalcase 2.0989 ms/op 2.6827 ms/op 0.78
phase0 processBlock - 250000 vs - 7PWei worstcase 29.824 ms/op 31.686 ms/op 0.94
altair processEth1Data - 250000 vs - 7PWei normalcase 601.58 us/op 536.23 us/op 1.12
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 14.197 us/op 10.017 us/op 1.42
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 72.257 us/op 84.141 us/op 0.86
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 24.308 us/op 26.863 us/op 0.90
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 20.897 us/op 15.207 us/op 1.37
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 209.60 us/op 166.59 us/op 1.26
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 1.5207 ms/op 1.5911 ms/op 0.96
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 1.9564 ms/op 1.8075 ms/op 1.08
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 1.6187 ms/op 2.1619 ms/op 0.75
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 3.4572 ms/op 4.5320 ms/op 0.76
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 3.0295 ms/op 2.7610 ms/op 1.10
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 6.4581 ms/op 6.7336 ms/op 0.96
Tree 40 250000 create 385.25 ms/op 412.15 ms/op 0.93
Tree 40 250000 get(125000) 211.63 ns/op 214.60 ns/op 0.99
Tree 40 250000 set(125000) 1.0028 us/op 1.0545 us/op 0.95
Tree 40 250000 toArray() 22.571 ms/op 21.457 ms/op 1.05
Tree 40 250000 iterate all - toArray() + loop 23.405 ms/op 23.558 ms/op 0.99
Tree 40 250000 iterate all - get(i) 75.678 ms/op 74.195 ms/op 1.02
MutableVector 250000 create 10.327 ms/op 10.088 ms/op 1.02
MutableVector 250000 get(125000) 6.6400 ns/op 6.5350 ns/op 1.02
MutableVector 250000 set(125000) 302.82 ns/op 288.00 ns/op 1.05
MutableVector 250000 toArray() 2.8129 ms/op 3.9755 ms/op 0.71
MutableVector 250000 iterate all - toArray() + loop 2.9255 ms/op 4.1078 ms/op 0.71
MutableVector 250000 iterate all - get(i) 1.5354 ms/op 1.6852 ms/op 0.91
Array 250000 create 2.5901 ms/op 3.2554 ms/op 0.80
Array 250000 clone - spread 1.2039 ms/op 1.0297 ms/op 1.17
Array 250000 get(125000) 0.60100 ns/op 0.50300 ns/op 1.19
Array 250000 set(125000) 0.66500 ns/op 0.58500 ns/op 1.14
Array 250000 iterate all - loop 83.265 us/op 113.17 us/op 0.74
effectiveBalanceIncrements clone Uint8Array 300000 33.135 us/op 23.943 us/op 1.38
effectiveBalanceIncrements clone MutableVector 300000 362.00 ns/op 268.00 ns/op 1.35
effectiveBalanceIncrements rw all Uint8Array 300000 177.09 us/op 182.05 us/op 0.97
effectiveBalanceIncrements rw all MutableVector 300000 86.490 ms/op 76.216 ms/op 1.13
phase0 afterProcessEpoch - 250000 vs - 7PWei 112.31 ms/op 116.38 ms/op 0.96
phase0 beforeProcessEpoch - 250000 vs - 7PWei 38.754 ms/op 30.847 ms/op 1.26
altair processEpoch - mainnet_e81889 327.48 ms/op 313.97 ms/op 1.04
mainnet_e81889 - altair beforeProcessEpoch 67.454 ms/op 58.640 ms/op 1.15
mainnet_e81889 - altair processJustificationAndFinalization 13.680 us/op 15.578 us/op 0.88
mainnet_e81889 - altair processInactivityUpdates 5.9567 ms/op 5.0689 ms/op 1.18
mainnet_e81889 - altair processRewardsAndPenalties 52.787 ms/op 66.662 ms/op 0.79
mainnet_e81889 - altair processRegistryUpdates 2.8950 us/op 2.5930 us/op 1.12
mainnet_e81889 - altair processSlashings 605.00 ns/op 443.00 ns/op 1.37
mainnet_e81889 - altair processEth1DataReset 650.00 ns/op 539.00 ns/op 1.21
mainnet_e81889 - altair processEffectiveBalanceUpdates 1.2449 ms/op 1.2514 ms/op 0.99
mainnet_e81889 - altair processSlashingsReset 3.6820 us/op 2.6530 us/op 1.39
mainnet_e81889 - altair processRandaoMixesReset 5.3490 us/op 4.4260 us/op 1.21
mainnet_e81889 - altair processHistoricalRootsUpdate 635.00 ns/op 710.00 ns/op 0.89
mainnet_e81889 - altair processParticipationFlagUpdates 1.9070 us/op 2.9580 us/op 0.64
mainnet_e81889 - altair processSyncCommitteeUpdates 1.1560 us/op 569.00 ns/op 2.03
mainnet_e81889 - altair afterProcessEpoch 124.22 ms/op 127.27 ms/op 0.98
phase0 processEpoch - mainnet_e58758 359.55 ms/op 359.90 ms/op 1.00
mainnet_e58758 - phase0 beforeProcessEpoch 144.58 ms/op 129.91 ms/op 1.11
mainnet_e58758 - phase0 processJustificationAndFinalization 15.285 us/op 13.882 us/op 1.10
mainnet_e58758 - phase0 processRewardsAndPenalties 66.987 ms/op 66.066 ms/op 1.01
mainnet_e58758 - phase0 processRegistryUpdates 10.370 us/op 11.076 us/op 0.94
mainnet_e58758 - phase0 processSlashings 534.00 ns/op 466.00 ns/op 1.15
mainnet_e58758 - phase0 processEth1DataReset 1.0510 us/op 458.00 ns/op 2.29
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 1.9823 ms/op 1.3583 ms/op 1.46
mainnet_e58758 - phase0 processSlashingsReset 3.1010 us/op 2.0530 us/op 1.51
mainnet_e58758 - phase0 processRandaoMixesReset 4.0110 us/op 4.6770 us/op 0.86
mainnet_e58758 - phase0 processHistoricalRootsUpdate 531.00 ns/op 539.00 ns/op 0.99
mainnet_e58758 - phase0 processParticipationRecordUpdates 3.7220 us/op 2.9920 us/op 1.24
mainnet_e58758 - phase0 afterProcessEpoch 98.838 ms/op 97.915 ms/op 1.01
phase0 processEffectiveBalanceUpdates - 250000 normalcase 1.2364 ms/op 1.2475 ms/op 0.99
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.4387 ms/op 1.4011 ms/op 1.03
altair processInactivityUpdates - 250000 normalcase 18.932 ms/op 24.568 ms/op 0.77
altair processInactivityUpdates - 250000 worstcase 26.551 ms/op 23.642 ms/op 1.12
phase0 processRegistryUpdates - 250000 normalcase 10.800 us/op 10.271 us/op 1.05
phase0 processRegistryUpdates - 250000 badcase_full_deposits 353.67 us/op 380.47 us/op 0.93
phase0 processRegistryUpdates - 250000 worstcase 0.5 119.31 ms/op 133.40 ms/op 0.89
altair processRewardsAndPenalties - 250000 normalcase 70.228 ms/op 69.278 ms/op 1.01
altair processRewardsAndPenalties - 250000 worstcase 72.315 ms/op 71.947 ms/op 1.01
phase0 getAttestationDeltas - 250000 normalcase 7.8342 ms/op 8.2133 ms/op 0.95
phase0 getAttestationDeltas - 250000 worstcase 7.6326 ms/op 8.9916 ms/op 0.85
phase0 processSlashings - 250000 worstcase 2.2641 ms/op 2.4953 ms/op 0.91
altair processSyncCommitteeUpdates - 250000 148.92 ms/op 173.22 ms/op 0.86
BeaconState.hashTreeRoot - No change 279.00 ns/op 327.00 ns/op 0.85
BeaconState.hashTreeRoot - 1 full validator 49.467 us/op 54.734 us/op 0.90
BeaconState.hashTreeRoot - 32 full validator 486.64 us/op 617.30 us/op 0.79
BeaconState.hashTreeRoot - 512 full validator 5.0513 ms/op 5.6991 ms/op 0.89
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 60.863 us/op 68.706 us/op 0.89
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 823.37 us/op 1.0228 ms/op 0.80
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 10.231 ms/op 13.660 ms/op 0.75
BeaconState.hashTreeRoot - 1 balances 46.781 us/op 56.762 us/op 0.82
BeaconState.hashTreeRoot - 32 balances 427.20 us/op 473.04 us/op 0.90
BeaconState.hashTreeRoot - 512 balances 3.9184 ms/op 5.1815 ms/op 0.76
BeaconState.hashTreeRoot - 250000 balances 69.718 ms/op 80.260 ms/op 0.87
aggregationBits - 2048 els - zipIndexesInBitList 17.621 us/op 20.062 us/op 0.88
regular array get 100000 times 33.022 us/op 45.652 us/op 0.72
wrappedArray get 100000 times 33.297 us/op 34.819 us/op 0.96
arrayWithProxy get 100000 times 15.967 ms/op 15.525 ms/op 1.03
ssz.Root.equals 238.00 ns/op 255.00 ns/op 0.93
byteArrayEquals 243.00 ns/op 262.00 ns/op 0.93
shuffle list - 16384 els 7.2120 ms/op 7.3421 ms/op 0.98
shuffle list - 250000 els 102.86 ms/op 107.97 ms/op 0.95
processSlot - 1 slots 8.0300 us/op 9.5040 us/op 0.84
processSlot - 32 slots 1.3888 ms/op 1.4373 ms/op 0.97
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 55.992 ms/op 59.367 ms/op 0.94
getCommitteeAssignments - req 1 vs - 250000 vc 2.5298 ms/op 2.5965 ms/op 0.97
getCommitteeAssignments - req 100 vs - 250000 vc 3.7406 ms/op 3.8568 ms/op 0.97
getCommitteeAssignments - req 1000 vs - 250000 vc 4.0965 ms/op 4.2408 ms/op 0.97
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 5.2200 ns/op 5.2800 ns/op 0.99
state getBlockRootAtSlot - 250000 vs - 7PWei 565.03 ns/op 636.85 ns/op 0.89
computeProposers - vc 250000 10.013 ms/op 9.3891 ms/op 1.07
computeEpochShuffling - vc 250000 107.63 ms/op 107.68 ms/op 1.00
getNextSyncCommittee - vc 250000 162.35 ms/op 156.90 ms/op 1.03
computeSigningRoot for AttestationData 14.000 us/op 13.680 us/op 1.02
hash AttestationData serialized data then Buffer.toString(base64) 2.3614 us/op 2.4098 us/op 0.98
toHexString serialized data 1.0773 us/op 1.0786 us/op 1.00
Buffer.toString(base64) 228.20 ns/op 215.32 ns/op 1.06

by benchmarkbot/action

@twoeths twoeths marked this pull request as ready for review July 26, 2023 07:03
@twoeths twoeths requested a review from a team as a code owner July 26, 2023 07:03
* If not, pick the last key
*/
next(): T[] | null {
let key: string | null = this.minChunkSizeKeys.last();
Contributor


I don't think this approach is best, consider this case:

  • a key receives minChunkSize-1 items at time T1 and then 1 extra item at time T2
  • It is appended to minChunkSizeKeys at time T2 and will be picked up first, but most of its items are at time T1

Similarly, if a key never receives minChunkSize items it will always be de-prioritized. This approach risks not respecting FIFO for some load patterns.

The goals for a FIFO queue with indexing that intends to batch should be:

  1. Prioritize latest items
  2. Buffer items to allow batching, but limit buffering by max wait or max length

For 1. in batches, what decides the timeliness of a batch?

  • First arrival
  • Latest arrival
  • Avg of items
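
To make the three options concrete, the sketch below computes each candidate timeliness value for a batch from its receive timestamps. `batchTimeliness` and its field names are hypothetical and exist only to illustrate the trade-off; nothing here is taken from the PR.

```typescript
// Hypothetical helper: three ways to assign a single "time" to a batch of items.
type BatchTimeliness = {firstMs: number; latestMs: number; avgMs: number};

function batchTimeliness(recvTimestampsMs: number[]): BatchTimeliness | null {
  if (recvTimestampsMs.length === 0) return null;
  let firstMs = Infinity;
  let latestMs = -Infinity;
  let sumMs = 0;
  for (const t of recvTimestampsMs) {
    firstMs = Math.min(firstMs, t);
    latestMs = Math.max(latestMs, t);
    sumMs += t;
  }
  return {firstMs, latestMs, avgMs: sumMs / recvTimestampsMs.length};
}

// The case described above, assuming minChunkSize = 32:
// 31 items arrive at T1 = 1000 ms and 1 extra item arrives at T2 = 5000 ms.
const timestamps: number[] = [...new Array<number>(31).fill(1000), 5000];
const timeliness = batchTimeliness(timestamps);
console.log(timeliness);
```

For this input the latest arrival is 5000 ms while the average is 1125 ms, which shows how ranking by latest arrival makes a mostly old batch look fresh.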

Contributor Author


yes, this is not ideal. Initially I did it like this:

  • always pick the last item (like LIFO)
  • process it along with min(64, number of messages with same data/key)

the performance is far below the current approach since it can't batch many messages. I think it's not easy to pick a strategy; we need to:

  • define some criteria to pick a strategy, one is: job wait time + job time + avg(forwarded messages)
  • do some dynamic configurations here, configure it at runtime and compare

@dapplion
Contributor

dapplion commented Jul 26, 2023

It's not easy to enforce FIFO and fairness rules with multiple lists of objects that arrive at different times. We do FIFO for attestations because the most recent attestations have the most value to us.

Given a list of single-bit gossip attestations (assuming all valid), their value is equal to the sum of 1 / incl_delay. Prioritizing each list by first or last arrival time ignores the count of attestations a list has and could then ignore big lists of less recent attestations.
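
As a small worked example of that valuation, the sketch below sums 1 / incl_delay over a list. The function name `attestationListValue` and the assumption that delays are whole slots of at least 1 are introduced here for illustration only.

```typescript
// Hypothetical helper: value of a list of single-bit attestations as sum(1 / incl_delay).
// Assumption: inclusion delays are measured in slots and are always >= 1.
function attestationListValue(inclusionDelays: number[]): number {
  let value = 0;
  for (const delay of inclusionDelays) {
    if (delay < 1) throw new Error("inclusion delay must be >= 1");
    value += 1 / delay;
  }
  return value;
}

// A big list of slightly older attestations versus a single fresh one
const bigOlderList = attestationListValue(new Array<number>(64).fill(2));
const singleFreshItem = attestationListValue([1]);
console.log(bigOlderList, singleFreshItem);
```

Here 64 attestations with delay 2 are worth 32 in total while one attestation with delay 1 is worth 1, so ordering lists purely by arrival time would process the less valuable list first.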

One approach that would be fair but maybe expensive is to compute this "score" every time

type MsgList = {
  key: GossipMsgId;
  avgRecvTimestampMs: number;
  items: GossipMsg[];
};
type GossipMsgId = string;
type GossipMsg = any;

// Score grows with the number of items and shrinks with their average age (age floored at 1000 ms)
const msgListScore = (a: MsgList): number => a.items.length / Math.max(1000, Date.now() - a.avgRecvTimestampMs);
const getKey = (_: GossipMsg): GossipMsgId => "TODO";

class Queue {
  queuesMap = new Map<GossipMsgId, MsgList>();
  queuesList: MsgList[] = [];

  add(item: GossipMsg): void {
    const key = getKey(item);
    let queue = this.queuesMap.get(key);

    if (!queue) {
      queue = {key, avgRecvTimestampMs: Date.now(), items: [item]};
      this.queuesMap.set(key, queue);
      // Track the new list so getNext() can rank it
      this.queuesList.push(queue);
    } else {
      // Compute average adding 1 value: avg = (avg * N + x) / (N + 1), with N the length before the push
      const n = queue.items.length;
      queue.avgRecvTimestampMs = (n * queue.avgRecvTimestampMs + Date.now()) / (n + 1);
      queue.items.push(item);
    }
  }

  getNext(): GossipMsg[] | undefined {
    // Sort in place by descending score so the best list comes first; can be optimized by running this only sometimes
    this.queuesList.sort((a, b) => msgListScore(b) - msgListScore(a));

    const next = this.queuesList.shift();
    // Drop the map entry too, so later items with the same key start a fresh list
    if (next) this.queuesMap.delete(next.key);
    return next?.items;
  }
}

@twoeths
Contributor Author

twoeths commented Aug 2, 2023

One approach that would be fair but maybe expensive is to compute this "score" everytime

@dapplion yes that's expensive, tested the IndexedGossipQueueAvgTime in the last 2h

[Image: Screenshot 2023-08-02 at 16 14 50]

@dapplion
Contributor

dapplion commented Aug 3, 2023

One approach that would be fair but maybe expensive is to compute this "score" everytime

@dapplion yes that's expensive, tested the IndexedGossipQueueAvgTime in the last 2h
Screenshot 2023-08-02 at 16 14 50

please @tuyennhv do not post spiky graphs, post either histograms or % of values above a certain threshold, or an avg_over_time

@dapplion
Contributor

dapplion commented Aug 9, 2023

The queue must have some mechanism to pick up groups of attestation datas that do not reach the minimum count.

@twoeths
Contributor Author

twoeths commented Aug 11, 2023

the current mechanism works as below:

attDataHex0: 30 items
attDataHex1: 35 items
attDataHex2: 1 item
attDataHex3: 2 items
attDataHex4: 2 items

with minChunkSize=32 it'll pick 35 items of attDataHex1, then 2 items of attDataHex4, then 2 items of attDataHex3, then 1 item of attDataHex2, then 30 items of attDataHex0. If, after we process attDataHex1, attDataHex${n} has more than 32 items, then we'd process it next; the delay is worth it because we batch more
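
The ordering in this example can be reproduced with a small sketch. It is a simplification written for this discussion, not the merged IndexedGossipQueue: `IndexedQueueSketch`, the `Att` type, and maxChunkSize=128 are assumptions, and "last key with at least minChunkSize items" is approximated by key insertion order in a plain `Map`.

```typescript
// Hypothetical stand-in for an attestation; only the index key matters here.
type Att = {dataKey: string};

class IndexedQueueSketch<T> {
  // Map iteration follows key insertion order
  private readonly itemsByKey = new Map<string, T[]>();
  private readonly indexFn: (item: T) => string;
  private readonly minChunkSize: number;
  private readonly maxChunkSize: number;

  constructor(indexFn: (item: T) => string, minChunkSize: number, maxChunkSize: number) {
    this.indexFn = indexFn;
    this.minChunkSize = minChunkSize;
    this.maxChunkSize = maxChunkSize;
  }

  add(item: T): void {
    const key = this.indexFn(item);
    const list = this.itemsByKey.get(key);
    if (list) list.push(item);
    else this.itemsByKey.set(key, [item]);
  }

  next(): T[] | null {
    let lastKey: string | null = null;
    let lastBigKey: string | null = null;
    for (const [key, list] of this.itemsByKey) {
      lastKey = key;
      if (list.length >= this.minChunkSize) lastBigKey = key;
    }
    // Prefer the last key that reached minChunkSize, otherwise fall back to the last key
    const key = lastBigKey ?? lastKey;
    if (key === null) return null;
    const list = this.itemsByKey.get(key);
    if (!list) return null;
    // Take up to maxChunkSize of the most recently added items for that key
    const chunk = list.splice(Math.max(0, list.length - this.maxChunkSize));
    if (list.length === 0) this.itemsByKey.delete(key);
    return chunk;
  }
}

const counts: [string, number][] = [
  ["attDataHex0", 30],
  ["attDataHex1", 35],
  ["attDataHex2", 1],
  ["attDataHex3", 2],
  ["attDataHex4", 2],
];
const queue = new IndexedQueueSketch<Att>((att) => att.dataKey, 32, 128);
for (const [dataKey, count] of counts) {
  for (let i = 0; i < count; i++) queue.add({dataKey});
}

const order: string[] = [];
for (let chunk = queue.next(); chunk !== null; chunk = queue.next()) {
  order.push(`${chunk[0].dataKey}:${chunk.length}`);
}
console.log(order.join(", "));
```

Draining the sketch yields attDataHex1, attDataHex4, attDataHex3, attDataHex2, attDataHex0, matching the order described above; attDataHex0 waits the longest even though it arrived first.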

The queue must have some mechanism to pick up groups of attestation datas that do not reach the minimum count.

@dapplion what do you expect, or want to improve with the above example? I can give it a try in a real node

@dapplion
Contributor

Let's go with this and have histogram metrics to measure delay in the queue since first item
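
One way such a metric could be collected is sketched below: remember when the first item of each key arrived, and observe the elapsed time into histogram buckets when that key is picked. The classes `DelayHistogramSketch` and `FirstSeenTrackerSketch`, the bucket bounds, and the explicit `nowMs` parameter are all assumptions for illustration; a real node would more likely feed a metrics library histogram.

```typescript
// Hypothetical minimal histogram: counts[i] holds observations <= upperBoundsSec[i],
// and the final slot holds everything above the largest bound.
class DelayHistogramSketch {
  readonly upperBoundsSec: number[];
  readonly counts: number[];

  constructor(upperBoundsSec: number[]) {
    this.upperBoundsSec = upperBoundsSec;
    this.counts = new Array<number>(upperBoundsSec.length + 1).fill(0);
  }

  observe(valueSec: number): void {
    let i = this.upperBoundsSec.findIndex((bound) => valueSec <= bound);
    if (i === -1) i = this.upperBoundsSec.length;
    this.counts[i] += 1;
  }
}

// Tracks the arrival time of the first item per key and reports the wait when the key is processed.
class FirstSeenTrackerSketch {
  private readonly firstSeenMs = new Map<string, number>();

  onAdd(key: string, nowMs: number): void {
    // Only the first item of a key sets the timestamp
    if (!this.firstSeenMs.has(key)) this.firstSeenMs.set(key, nowMs);
  }

  onNext(key: string, nowMs: number, histogram: DelayHistogramSketch): void {
    const firstMs = this.firstSeenMs.get(key);
    if (firstMs === undefined) return;
    this.firstSeenMs.delete(key);
    histogram.observe((nowMs - firstMs) / 1000);
  }
}

const histogram = new DelayHistogramSketch([0.1, 0.5, 1]);
const tracker = new FirstSeenTrackerSketch();
tracker.onAdd("attDataHex0", 0);
tracker.onAdd("attDataHex0", 300); // later items do not reset the first-seen time
tracker.onNext("attDataHex0", 400, histogram); // 0.4 s since the first item
console.log(histogram.counts);
```

Passing the clock in as `nowMs` keeps the sketch deterministic; in a running queue the same calls would use `Date.now()`.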

@twoeths twoeths merged commit 5edda2b into unstable Aug 14, 2023
11 checks passed
@twoeths twoeths deleted the tuyen/indexed_gossip_queue branch August 14, 2023 01:09
@wemeetagain
Member

🎉 This PR is included in v1.11.0 🎉

3 participants