feat: implement IndexedGossipQueue #5803

twoeths · 2023-07-26T06:15:40Z

Motivation

We want to process attestations that share the same attestation data in a single batch, so the queue needs to group them by their attestation data base64 string.
Extracted from feat: verify attestation gossip messages in batch #5729.

Description

Extract a GossipQueue interface; the current implementation becomes LinearGossipQueue. This does not affect the node's current performance, since nothing uses the new gossip queue yet.
Implement the new IndexedGossipQueue:
- index items by indexFn using a map; in the case of attestations the key is the attestation data base64
- keep track of the keys that hold at least minChunkSize items
- on next, pick the last key with at least minChunkSize items and pop up to maxChunkSize items
- on delete, pick the first key in the map and delete the first item in its list
Implement some utility collections:
- OrderedSet, backed by LinkedList, in order to get the first and last item
- OrderedMap, backed by OrderedSet, in order to get the first and last key/value
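
The behaviour listed above can be illustrated with a minimal sketch. This is not the code from the PR: the class name `IndexedQueueSketch`, the method name `dropFirst`, and the helper `lastOf` are hypothetical, and a plain `Map` and `Set` (which iterate in insertion order) stand in for OrderedMap and OrderedSet. It also assumes that items are taken from the end of a key's list and that the queue falls back to the last key overall when no key has reached minChunkSize.

```typescript
// Illustrative sketch only, not the PR's implementation.
type IndexedQueueOpts<T> = {
  indexFn: (item: T) => string;
  minChunkSize: number;
  maxChunkSize: number;
};

// Hypothetical helper: last element of an iterable, or null if it is empty.
// This is O(n); an ordered collection with a tail pointer avoids the scan.
function lastOf<K>(iterable: Iterable<K>): K | null {
  let last: K | null = null;
  for (const value of iterable) last = value;
  return last;
}

class IndexedQueueSketch<T> {
  private readonly opts: IndexedQueueOpts<T>;
  private readonly itemsByKey = new Map<string, T[]>();
  private readonly minChunkSizeKeys = new Set<string>();

  constructor(opts: IndexedQueueOpts<T>) {
    this.opts = opts;
  }

  add(item: T): void {
    const key = this.opts.indexFn(item);
    let items = this.itemsByKey.get(key);
    if (items === undefined) {
      items = [];
      this.itemsByKey.set(key, items);
    }
    items.push(item);
    // Set.add is a no-op for existing keys, so a key keeps the position at which it first qualified
    if (items.length >= this.opts.minChunkSize) this.minChunkSizeKeys.add(key);
  }

  next(): T[] | null {
    // Prefer the last key that reached minChunkSize; otherwise fall back to the last key overall
    const key = lastOf(this.minChunkSizeKeys) ?? lastOf(this.itemsByKey.keys());
    if (key === null) return null;
    const items = this.itemsByKey.get(key) ?? [];
    // Take up to maxChunkSize items from the end of the list (assumption of this sketch)
    const chunk = items.splice(Math.max(0, items.length - this.opts.maxChunkSize));
    if (items.length === 0) this.itemsByKey.delete(key);
    if (items.length < this.opts.minChunkSize) this.minChunkSizeKeys.delete(key);
    return chunk;
  }

  dropFirst(): T | null {
    // First key in insertion order, first item in its list
    for (const [key, items] of this.itemsByKey) {
      const dropped = items.shift() ?? null;
      if (items.length === 0) this.itemsByKey.delete(key);
      if (items.length < this.opts.minChunkSize) this.minChunkSizeKeys.delete(key);
      return dropped;
    }
    return null;
  }
}

const queue = new IndexedQueueSketch<{data: string}>({
  indexFn: (m) => m.data,
  minChunkSize: 2,
  maxChunkSize: 3,
});
for (const data of ["a", "b", "b", "c"]) queue.add({data});
const firstChunk = queue.next(); // both "b" items: "b" is the only key that reached minChunkSize
const secondChunk = queue.next(); // no key qualifies any more, so the last key "c" is picked
```

Scanning for the last key is the part that an ordered collection with direct access to its tail makes cheap, which is presumably why the PR adds OrderedSet and OrderedMap.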

part of #5416

github-actions · 2023-07-26T06:40:28Z

Performance Report

✔️ no performance regression detected

Full benchmark results

Benchmark suite	Current: `4900853`	Previous: `201dfc8`	Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc	509.36 us/op	736.46 us/op	0.69
getPubkeys - validatorsArr - req 1000 vs - 250000 vc	80.874 us/op	76.952 us/op	1.05
BLS verify - blst-native	1.2169 ms/op	1.2060 ms/op	1.01
BLS verifyMultipleSignatures 3 - blst-native	2.4706 ms/op	2.4125 ms/op	1.02
BLS verifyMultipleSignatures 8 - blst-native	5.3045 ms/op	5.2197 ms/op	1.02
BLS verifyMultipleSignatures 32 - blst-native	19.124 ms/op	18.944 ms/op	1.01
BLS aggregatePubkeys 32 - blst-native	25.720 us/op	24.972 us/op	1.03
BLS aggregatePubkeys 128 - blst-native	101.35 us/op	97.634 us/op	1.04
getAttestationsForBlock	53.142 ms/op	50.705 ms/op	1.05
isKnown best case - 1 super set check	306.00 ns/op	281.00 ns/op	1.09
isKnown normal case - 2 super set checks	288.00 ns/op	279.00 ns/op	1.03
isKnown worse case - 16 super set checks	270.00 ns/op	273.00 ns/op	0.99
CheckpointStateCache - add get delete	5.5920 us/op	4.8780 us/op	1.15
validate api signedAggregateAndProof - struct	2.7885 ms/op	2.7777 ms/op	1.00
validate gossip signedAggregateAndProof - struct	2.8785 ms/op	2.7793 ms/op	1.04
validate api attestation - struct	1.3437 ms/op	1.2721 ms/op	1.06
validate gossip attestation - struct	1.3668 ms/op	1.3023 ms/op	1.05
pickEth1Vote - no votes	1.2384 ms/op	1.1581 ms/op	1.07
pickEth1Vote - max votes	11.443 ms/op	9.2487 ms/op	1.24
pickEth1Vote - Eth1Data hashTreeRoot value x2048	9.3019 ms/op	8.4711 ms/op	1.10
pickEth1Vote - Eth1Data hashTreeRoot tree x2048	16.321 ms/op	13.962 ms/op	1.17
pickEth1Vote - Eth1Data fastSerialize value x2048	625.22 us/op	681.61 us/op	0.92
pickEth1Vote - Eth1Data fastSerialize tree x2048	5.2332 ms/op	6.8071 ms/op	0.77
bytes32 toHexString	565.00 ns/op	474.00 ns/op	1.19
bytes32 Buffer.toString(hex)	304.00 ns/op	296.00 ns/op	1.03
bytes32 Buffer.toString(hex) from Uint8Array	509.00 ns/op	420.00 ns/op	1.21
bytes32 Buffer.toString(hex) + 0x	311.00 ns/op	304.00 ns/op	1.02
Object access 1 prop	0.16200 ns/op	0.15700 ns/op	1.03
Map access 1 prop	0.15400 ns/op	0.15600 ns/op	0.99
Object get x1000	7.7490 ns/op	7.3120 ns/op	1.06
Map get x1000	0.64800 ns/op	0.59200 ns/op	1.09
Object set x1000	52.372 ns/op	46.762 ns/op	1.12
Map set x1000	40.008 ns/op	36.795 ns/op	1.09
Return object 10000 times	0.24070 ns/op	0.22370 ns/op	1.08
Throw Error 10000 times	3.8630 us/op	3.6596 us/op	1.06
fastMsgIdFn sha256 / 200 bytes	3.3410 us/op	3.1150 us/op	1.07
fastMsgIdFn h32 xxhash / 200 bytes	311.00 ns/op	267.00 ns/op	1.16
fastMsgIdFn h64 xxhash / 200 bytes	358.00 ns/op	334.00 ns/op	1.07
fastMsgIdFn sha256 / 1000 bytes	11.417 us/op	10.813 us/op	1.06
fastMsgIdFn h32 xxhash / 1000 bytes	431.00 ns/op	393.00 ns/op	1.10
fastMsgIdFn h64 xxhash / 1000 bytes	420.00 ns/op	408.00 ns/op	1.03
fastMsgIdFn sha256 / 10000 bytes	106.56 us/op	99.007 us/op	1.08
fastMsgIdFn h32 xxhash / 10000 bytes	1.9760 us/op	1.7950 us/op	1.10
fastMsgIdFn h64 xxhash / 10000 bytes	1.3230 us/op	1.2430 us/op	1.06
enrSubnets - fastDeserialize 64 bits	1.3290 us/op	1.1700 us/op	1.14
enrSubnets - ssz BitVector 64 bits	514.00 ns/op	407.00 ns/op	1.26
enrSubnets - fastDeserialize 4 bits	223.00 ns/op	166.00 ns/op	1.34
enrSubnets - ssz BitVector 4 bits	483.00 ns/op	407.00 ns/op	1.19
prioritizePeers score -10:0 att 32-0.1 sync 2-0	109.69 us/op	98.573 us/op	1.11
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25	135.28 us/op	124.61 us/op	1.09
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5	193.96 us/op	154.11 us/op	1.26
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75	337.64 us/op	274.80 us/op	1.23
prioritizePeers score 0:0 att 64-1 sync 4-1	377.48 us/op	328.72 us/op	1.15
array of 16000 items push then shift	1.5951 us/op	1.5440 us/op	1.03
LinkedList of 16000 items push then shift	9.0590 ns/op	8.7310 ns/op	1.04
array of 16000 items push then pop	60.044 ns/op	57.392 ns/op	1.05
LinkedList of 16000 items push then pop	9.0210 ns/op	8.5090 ns/op	1.06
array of 24000 items push then shift	2.4445 us/op	2.3634 us/op	1.03
LinkedList of 24000 items push then shift	9.3800 ns/op	8.8740 ns/op	1.06
array of 24000 items push then pop	117.17 ns/op	106.45 ns/op	1.10
LinkedList of 24000 items push then pop	9.2120 ns/op	8.6820 ns/op	1.06
intersect bitArray bitLen 8	6.9770 ns/op	6.8440 ns/op	1.02
intersect array and set length 8	81.148 ns/op	61.103 ns/op	1.33
intersect bitArray bitLen 128	32.197 ns/op	32.333 ns/op	1.00
intersect array and set length 128	986.16 ns/op	804.44 ns/op	1.23
bitArray.getTrueBitIndexes() bitLen 128	1.5560 us/op	1.6500 us/op	0.94
bitArray.getTrueBitIndexes() bitLen 248	2.8000 us/op	2.6290 us/op	1.07
bitArray.getTrueBitIndexes() bitLen 512	5.7600 us/op	5.0300 us/op	1.15
Buffer.concat 32 items	1.1150 us/op	977.00 ns/op	1.14
Uint8Array.set 32 items	1.8140 us/op	1.7740 us/op	1.02
Set add up to 64 items then delete first	4.4101 us/op
OrderedSet add up to 64 items then delete first	5.5805 us/op
Set add up to 64 items then delete last	4.7735 us/op
OrderedSet add up to 64 items then delete last	6.2522 us/op
Set add up to 64 items then delete middle	4.8031 us/op
OrderedSet add up to 64 items then delete middle	7.5334 us/op
Set add up to 128 items then delete first	9.4072 us/op
OrderedSet add up to 128 items then delete first	12.477 us/op
Set add up to 128 items then delete last	9.7780 us/op
OrderedSet add up to 128 items then delete last	13.393 us/op
Set add up to 128 items then delete middle	9.6546 us/op
OrderedSet add up to 128 items then delete middle	17.914 us/op
Set add up to 256 items then delete first	19.113 us/op
OrderedSet add up to 256 items then delete first	24.016 us/op
Set add up to 256 items then delete last	19.604 us/op
OrderedSet add up to 256 items then delete last	26.339 us/op
Set add up to 256 items then delete middle	19.561 us/op
OrderedSet add up to 256 items then delete middle	47.043 us/op
transfer serialized Status (84 B)	1.8920 us/op	1.8300 us/op	1.03
copy serialized Status (84 B)	1.6210 us/op	1.5190 us/op	1.07
transfer serialized SignedVoluntaryExit (112 B)	2.0100 us/op	1.9490 us/op	1.03
copy serialized SignedVoluntaryExit (112 B)	1.6560 us/op	1.5780 us/op	1.05
transfer serialized ProposerSlashing (416 B)	2.5200 us/op	2.2210 us/op	1.13
copy serialized ProposerSlashing (416 B)	2.2390 us/op	1.7850 us/op	1.25
transfer serialized Attestation (485 B)	2.9380 us/op	1.9960 us/op	1.47
copy serialized Attestation (485 B)	2.1610 us/op	1.7470 us/op	1.24
transfer serialized AttesterSlashing (33232 B)	2.2670 us/op	2.0960 us/op	1.08
copy serialized AttesterSlashing (33232 B)	6.1290 us/op	4.8500 us/op	1.26
transfer serialized Small SignedBeaconBlock (128000 B)	2.9970 us/op	2.3500 us/op	1.28
copy serialized Small SignedBeaconBlock (128000 B)	17.305 us/op	13.614 us/op	1.27
transfer serialized Avg SignedBeaconBlock (200000 B)	3.4340 us/op	2.7240 us/op	1.26
copy serialized Avg SignedBeaconBlock (200000 B)	26.508 us/op	22.651 us/op	1.17
transfer serialized BlobsSidecar (524380 B)	3.4530 us/op	2.7100 us/op	1.27
copy serialized BlobsSidecar (524380 B)	158.30 us/op	123.40 us/op	1.28
transfer serialized Big SignedBeaconBlock (1000000 B)	4.3460 us/op	3.4040 us/op	1.28
copy serialized Big SignedBeaconBlock (1000000 B)	179.84 us/op	163.88 us/op	1.10
pass gossip attestations to forkchoice per slot	2.1714 ms/op	2.1666 ms/op	1.00
forkChoice updateHead vc 100000 bc 64 eq 0	2.1509 ms/op	2.1244 ms/op	1.01
forkChoice updateHead vc 600000 bc 64 eq 0	13.914 ms/op	14.343 ms/op	0.97
forkChoice updateHead vc 1000000 bc 64 eq 0	18.405 ms/op	23.352 ms/op	0.79
forkChoice updateHead vc 600000 bc 320 eq 0	18.512 ms/op	17.349 ms/op	1.07
forkChoice updateHead vc 600000 bc 1200 eq 0	87.638 ms/op	88.877 ms/op	0.99
forkChoice updateHead vc 600000 bc 64 eq 1000	19.131 ms/op	22.864 ms/op	0.84
forkChoice updateHead vc 600000 bc 64 eq 10000	21.093 ms/op	24.458 ms/op	0.86
forkChoice updateHead vc 600000 bc 64 eq 300000	28.443 ms/op	30.592 ms/op	0.93
computeDeltas	3.0130 ms/op	3.2318 ms/op	0.93
computeProposerBoostScoreFromBalances	385.89 us/op	392.91 us/op	0.98
altair processAttestation - 250000 vs - 7PWei normalcase	2.0808 ms/op	2.6320 ms/op	0.79
altair processAttestation - 250000 vs - 7PWei worstcase	3.1435 ms/op	4.2532 ms/op	0.74
altair processAttestation - setStatus - 1/6 committees join	234.40 us/op	196.44 us/op	1.19
altair processAttestation - setStatus - 1/3 committees join	411.94 us/op	355.23 us/op	1.16
altair processAttestation - setStatus - 1/2 committees join	612.17 us/op	475.73 us/op	1.29
altair processAttestation - setStatus - 2/3 committees join	716.88 us/op	605.79 us/op	1.18
altair processAttestation - setStatus - 4/5 committees join	1.0364 ms/op	853.38 us/op	1.21
altair processAttestation - setStatus - 100% committees join	1.1965 ms/op	997.52 us/op	1.20
altair processBlock - 250000 vs - 7PWei normalcase	10.417 ms/op	11.037 ms/op	0.94
altair processBlock - 250000 vs - 7PWei normalcase hashState	17.506 ms/op	18.173 ms/op	0.96
altair processBlock - 250000 vs - 7PWei worstcase	39.935 ms/op	39.992 ms/op	1.00
altair processBlock - 250000 vs - 7PWei worstcase hashState	59.509 ms/op	63.599 ms/op	0.94
phase0 processBlock - 250000 vs - 7PWei normalcase	2.0989 ms/op	2.6827 ms/op	0.78
phase0 processBlock - 250000 vs - 7PWei worstcase	29.824 ms/op	31.686 ms/op	0.94
altair processEth1Data - 250000 vs - 7PWei normalcase	601.58 us/op	536.23 us/op	1.12
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15	14.197 us/op	10.017 us/op	1.42
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219	72.257 us/op	84.141 us/op	0.86
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42	24.308 us/op	26.863 us/op	0.90
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18	20.897 us/op	15.207 us/op	1.37
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020	209.60 us/op	166.59 us/op	1.26
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777	1.5207 ms/op	1.5911 ms/op	0.96
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384	1.9564 ms/op	1.8075 ms/op	1.08
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384	1.6187 ms/op	2.1619 ms/op	0.75
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384	3.4572 ms/op	4.5320 ms/op	0.76
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384	3.0295 ms/op	2.7610 ms/op	1.10
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384	6.4581 ms/op	6.7336 ms/op	0.96
Tree 40 250000 create	385.25 ms/op	412.15 ms/op	0.93
Tree 40 250000 get(125000)	211.63 ns/op	214.60 ns/op	0.99
Tree 40 250000 set(125000)	1.0028 us/op	1.0545 us/op	0.95
Tree 40 250000 toArray()	22.571 ms/op	21.457 ms/op	1.05
Tree 40 250000 iterate all - toArray() + loop	23.405 ms/op	23.558 ms/op	0.99
Tree 40 250000 iterate all - get(i)	75.678 ms/op	74.195 ms/op	1.02
MutableVector 250000 create	10.327 ms/op	10.088 ms/op	1.02
MutableVector 250000 get(125000)	6.6400 ns/op	6.5350 ns/op	1.02
MutableVector 250000 set(125000)	302.82 ns/op	288.00 ns/op	1.05
MutableVector 250000 toArray()	2.8129 ms/op	3.9755 ms/op	0.71
MutableVector 250000 iterate all - toArray() + loop	2.9255 ms/op	4.1078 ms/op	0.71
MutableVector 250000 iterate all - get(i)	1.5354 ms/op	1.6852 ms/op	0.91
Array 250000 create	2.5901 ms/op	3.2554 ms/op	0.80
Array 250000 clone - spread	1.2039 ms/op	1.0297 ms/op	1.17
Array 250000 get(125000)	0.60100 ns/op	0.50300 ns/op	1.19
Array 250000 set(125000)	0.66500 ns/op	0.58500 ns/op	1.14
Array 250000 iterate all - loop	83.265 us/op	113.17 us/op	0.74
effectiveBalanceIncrements clone Uint8Array 300000	33.135 us/op	23.943 us/op	1.38
effectiveBalanceIncrements clone MutableVector 300000	362.00 ns/op	268.00 ns/op	1.35
effectiveBalanceIncrements rw all Uint8Array 300000	177.09 us/op	182.05 us/op	0.97
effectiveBalanceIncrements rw all MutableVector 300000	86.490 ms/op	76.216 ms/op	1.13
phase0 afterProcessEpoch - 250000 vs - 7PWei	112.31 ms/op	116.38 ms/op	0.96
phase0 beforeProcessEpoch - 250000 vs - 7PWei	38.754 ms/op	30.847 ms/op	1.26
altair processEpoch - mainnet_e81889	327.48 ms/op	313.97 ms/op	1.04
mainnet_e81889 - altair beforeProcessEpoch	67.454 ms/op	58.640 ms/op	1.15
mainnet_e81889 - altair processJustificationAndFinalization	13.680 us/op	15.578 us/op	0.88
mainnet_e81889 - altair processInactivityUpdates	5.9567 ms/op	5.0689 ms/op	1.18
mainnet_e81889 - altair processRewardsAndPenalties	52.787 ms/op	66.662 ms/op	0.79
mainnet_e81889 - altair processRegistryUpdates	2.8950 us/op	2.5930 us/op	1.12
mainnet_e81889 - altair processSlashings	605.00 ns/op	443.00 ns/op	1.37
mainnet_e81889 - altair processEth1DataReset	650.00 ns/op	539.00 ns/op	1.21
mainnet_e81889 - altair processEffectiveBalanceUpdates	1.2449 ms/op	1.2514 ms/op	0.99
mainnet_e81889 - altair processSlashingsReset	3.6820 us/op	2.6530 us/op	1.39
mainnet_e81889 - altair processRandaoMixesReset	5.3490 us/op	4.4260 us/op	1.21
mainnet_e81889 - altair processHistoricalRootsUpdate	635.00 ns/op	710.00 ns/op	0.89
mainnet_e81889 - altair processParticipationFlagUpdates	1.9070 us/op	2.9580 us/op	0.64
mainnet_e81889 - altair processSyncCommitteeUpdates	1.1560 us/op	569.00 ns/op	2.03
mainnet_e81889 - altair afterProcessEpoch	124.22 ms/op	127.27 ms/op	0.98
phase0 processEpoch - mainnet_e58758	359.55 ms/op	359.90 ms/op	1.00
mainnet_e58758 - phase0 beforeProcessEpoch	144.58 ms/op	129.91 ms/op	1.11
mainnet_e58758 - phase0 processJustificationAndFinalization	15.285 us/op	13.882 us/op	1.10
mainnet_e58758 - phase0 processRewardsAndPenalties	66.987 ms/op	66.066 ms/op	1.01
mainnet_e58758 - phase0 processRegistryUpdates	10.370 us/op	11.076 us/op	0.94
mainnet_e58758 - phase0 processSlashings	534.00 ns/op	466.00 ns/op	1.15
mainnet_e58758 - phase0 processEth1DataReset	1.0510 us/op	458.00 ns/op	2.29
mainnet_e58758 - phase0 processEffectiveBalanceUpdates	1.9823 ms/op	1.3583 ms/op	1.46
mainnet_e58758 - phase0 processSlashingsReset	3.1010 us/op	2.0530 us/op	1.51
mainnet_e58758 - phase0 processRandaoMixesReset	4.0110 us/op	4.6770 us/op	0.86
mainnet_e58758 - phase0 processHistoricalRootsUpdate	531.00 ns/op	539.00 ns/op	0.99
mainnet_e58758 - phase0 processParticipationRecordUpdates	3.7220 us/op	2.9920 us/op	1.24
mainnet_e58758 - phase0 afterProcessEpoch	98.838 ms/op	97.915 ms/op	1.01
phase0 processEffectiveBalanceUpdates - 250000 normalcase	1.2364 ms/op	1.2475 ms/op	0.99
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5	1.4387 ms/op	1.4011 ms/op	1.03
altair processInactivityUpdates - 250000 normalcase	18.932 ms/op	24.568 ms/op	0.77
altair processInactivityUpdates - 250000 worstcase	26.551 ms/op	23.642 ms/op	1.12
phase0 processRegistryUpdates - 250000 normalcase	10.800 us/op	10.271 us/op	1.05
phase0 processRegistryUpdates - 250000 badcase_full_deposits	353.67 us/op	380.47 us/op	0.93
phase0 processRegistryUpdates - 250000 worstcase 0.5	119.31 ms/op	133.40 ms/op	0.89
altair processRewardsAndPenalties - 250000 normalcase	70.228 ms/op	69.278 ms/op	1.01
altair processRewardsAndPenalties - 250000 worstcase	72.315 ms/op	71.947 ms/op	1.01
phase0 getAttestationDeltas - 250000 normalcase	7.8342 ms/op	8.2133 ms/op	0.95
phase0 getAttestationDeltas - 250000 worstcase	7.6326 ms/op	8.9916 ms/op	0.85
phase0 processSlashings - 250000 worstcase	2.2641 ms/op	2.4953 ms/op	0.91
altair processSyncCommitteeUpdates - 250000	148.92 ms/op	173.22 ms/op	0.86
BeaconState.hashTreeRoot - No change	279.00 ns/op	327.00 ns/op	0.85
BeaconState.hashTreeRoot - 1 full validator	49.467 us/op	54.734 us/op	0.90
BeaconState.hashTreeRoot - 32 full validator	486.64 us/op	617.30 us/op	0.79
BeaconState.hashTreeRoot - 512 full validator	5.0513 ms/op	5.6991 ms/op	0.89
BeaconState.hashTreeRoot - 1 validator.effectiveBalance	60.863 us/op	68.706 us/op	0.89
BeaconState.hashTreeRoot - 32 validator.effectiveBalance	823.37 us/op	1.0228 ms/op	0.80
BeaconState.hashTreeRoot - 512 validator.effectiveBalance	10.231 ms/op	13.660 ms/op	0.75
BeaconState.hashTreeRoot - 1 balances	46.781 us/op	56.762 us/op	0.82
BeaconState.hashTreeRoot - 32 balances	427.20 us/op	473.04 us/op	0.90
BeaconState.hashTreeRoot - 512 balances	3.9184 ms/op	5.1815 ms/op	0.76
BeaconState.hashTreeRoot - 250000 balances	69.718 ms/op	80.260 ms/op	0.87
aggregationBits - 2048 els - zipIndexesInBitList	17.621 us/op	20.062 us/op	0.88
regular array get 100000 times	33.022 us/op	45.652 us/op	0.72
wrappedArray get 100000 times	33.297 us/op	34.819 us/op	0.96
arrayWithProxy get 100000 times	15.967 ms/op	15.525 ms/op	1.03
ssz.Root.equals	238.00 ns/op	255.00 ns/op	0.93
byteArrayEquals	243.00 ns/op	262.00 ns/op	0.93
shuffle list - 16384 els	7.2120 ms/op	7.3421 ms/op	0.98
shuffle list - 250000 els	102.86 ms/op	107.97 ms/op	0.95
processSlot - 1 slots	8.0300 us/op	9.5040 us/op	0.84
processSlot - 32 slots	1.3888 ms/op	1.4373 ms/op	0.97
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei	55.992 ms/op	59.367 ms/op	0.94
getCommitteeAssignments - req 1 vs - 250000 vc	2.5298 ms/op	2.5965 ms/op	0.97
getCommitteeAssignments - req 100 vs - 250000 vc	3.7406 ms/op	3.8568 ms/op	0.97
getCommitteeAssignments - req 1000 vs - 250000 vc	4.0965 ms/op	4.2408 ms/op	0.97
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei	5.2200 ns/op	5.2800 ns/op	0.99
state getBlockRootAtSlot - 250000 vs - 7PWei	565.03 ns/op	636.85 ns/op	0.89
computeProposers - vc 250000	10.013 ms/op	9.3891 ms/op	1.07
computeEpochShuffling - vc 250000	107.63 ms/op	107.68 ms/op	1.00
getNextSyncCommittee - vc 250000	162.35 ms/op	156.90 ms/op	1.03
computeSigningRoot for AttestationData	14.000 us/op	13.680 us/op	1.02
hash AttestationData serialized data then Buffer.toString(base64)	2.3614 us/op	2.4098 us/op	0.98
toHexString serialized data	1.0773 us/op	1.0786 us/op	1.00
Buffer.toString(base64)	228.20 ns/op	215.32 ns/op	1.06

by benchmarkbot/action

packages/beacon-node/test/perf/util/set.test.ts

dapplion · 2023-07-26T10:13:30Z

packages/beacon-node/src/network/processor/gossipQueues/indexed.ts

+   * If not, pick the last key
+   */
+  next(): T[] | null {
+    let key: string | null = this.minChunkSizeKeys.last();


I don't think this approach is best. Consider this case:

A key receives minChunkSize-1 items at time T1 and then 1 extra item at time T2.

It is appended to minChunkSizeKeys at time T2 and will be picked up first, even though most of its items arrived at time T1.

Similarly, a key that never receives minChunkSize items will always be de-prioritized. This approach risks not respecting LIFO for some load patterns.

The goals for a LIFO queue with indexing that intends to batch should be:

1. Prioritize the latest items
2. Buffer items to allow batching, but limit buffering by max wait or max length

For 1., with batches, what decides the timeliness of a batch?

- First arrival
- Latest arrival
- Average over its items

twoeths · 2023-07-26

Yes, this is not ideal. Initially I did it like this:

- always pick the last item (like LIFO)
- process it along with min(64, number of messages with the same data/key)

The performance was far from that of the current approach, since it could not batch many messages. I think it's not easy to pick a strategy; we need to:

- define some criteria to pick a strategy, for example: job wait time + job time + avg(forwarded messages)
- make the strategy dynamically configurable, so we can switch it at runtime and compare

dapplion · 2023-07-26T13:04:27Z

It's not easy to enforce LIFO and fairness rules across multiple lists of objects that arrive at different times. We do LIFO for attestations because the most recent attestations have the most value to us.

Given a list of single-bit gossip attestations (assuming all are valid), their value equals the sum of 1 / incl_delay. Prioritizing each list by first or last arrival time ignores the number of attestations a list has, and could therefore ignore big lists of less recent attestations.

One approach that would be fair, but maybe expensive, is to compute this "score" every time:

type GossipMsgId = string;
type GossipMsg = unknown;
type MsgList = {
  key: GossipMsgId;
  avgRecvTimestampMs: number;
  items: GossipMsg[];
};

// Score grows with the item count and shrinks with the average age (floored at 1 second)
const msgListScore = (a: MsgList, nowMs: number): number =>
  a.items.length / Math.max(1000, nowMs - a.avgRecvTimestampMs);
const getKey = (_: GossipMsg): GossipMsgId => "TODO";

class Queue {
  queuesMap = new Map<GossipMsgId, MsgList>();
  queuesList: MsgList[] = [];

  add(item: GossipMsg): void {
    const key = getKey(item);
    const queue = this.queuesMap.get(key);

    if (!queue) {
      const newQueue: MsgList = {key, avgRecvTimestampMs: Date.now(), items: [item]};
      this.queuesMap.set(key, newQueue);
      this.queuesList.push(newQueue);
    } else {
      // Running average over N existing items plus 1 new value: avg = (avg * N + x) / (N + 1)
      const n = queue.items.length;
      queue.avgRecvTimestampMs = (n * queue.avgRecvTimestampMs + Date.now()) / (n + 1);
      queue.items.push(item);
    }
  }

  getNext(): GossipMsg[] | undefined {
    // Sort in place by descending score; can be optimized by running this only sometimes
    const nowMs = Date.now();
    this.queuesList.sort((a, b) => msgListScore(b, nowMs) - msgListScore(a, nowMs));

    // Remove the best list from both structures so later items with this key start a new list
    const next = this.queuesList.shift();
    if (next) this.queuesMap.delete(next.key);
    return next?.items;
  }
}
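
To make the trade-off in the score concrete, here is a small worked example. The helpers `score` and `addToAverage` are hypothetical and only mirror the formulas in the snippet above; the clock is passed in as a parameter so the numbers are reproducible. The timestamps and item counts are made up for illustration.

```typescript
// Hypothetical helper mirroring the score formula above, with an injectable clock.
function score(itemCount: number, avgRecvTimestampMs: number, nowMs: number): number {
  // The age is floored at 1000 ms so very fresh lists do not get unbounded scores
  return itemCount / Math.max(1000, nowMs - avgRecvTimestampMs);
}

// Running average when one value x is added to n existing values: (avg * n + x) / (n + 1)
function addToAverage(avg: number, n: number, x: number): number {
  return (avg * n + x) / (n + 1);
}

const nowMs = 10000;

// 64 items received 4 s ago on average: 64 / 4000 = 0.016
const bigOldListScore = score(64, nowMs - 4000, nowMs);

// 2 items received 0.5 s ago on average: the age is floored to 1 s, so 2 / 1000 = 0.002
const smallFreshListScore = score(2, nowMs - 500, nowMs);

// Three items averaging t = 1000 ms plus one new item at t = 2000 ms: (3000 + 2000) / 4 = 1250
const updatedAvg = addToAverage(1000, 3, 2000);
```

With these made-up numbers the larger but older list scores higher than the small fresh one, which is the case that ordering purely by first or last arrival would miss.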

twoeths · 2023-08-02T09:15:11Z

> One approach that would be fair but maybe expensive is to compute this "score" every time

@dapplion yes, that's expensive. I tested IndexedGossipQueueAvgTime over the last 2h.

dapplion · 2023-08-03T13:30:20Z

> > One approach that would be fair but maybe expensive is to compute this "score" every time
>
> @dapplion yes, that's expensive. I tested IndexedGossipQueueAvgTime over the last 2h.

Please, @tuyennhv, do not post spiky graphs; post histograms, the % of values above a certain threshold, or an avg_over_time instead.

dapplion · 2023-08-09T13:55:31Z

The queue must have some mechanism to pick up groups of attestation datas that do not reach the minimum count.

twoeths · 2023-08-11T04:22:56Z

The current mechanism works as follows:

attDataHex0: 30 items
attDataHex1: 35 items
attDataHex2: 1 item
attDataHex3: 2 items
attDataHex4: 2 items

With minChunkSize=32 it picks the 35 items of attDataHex1, then the 2 items of attDataHex4, then the 2 items of attDataHex3, then the 1 item of attDataHex2, then the 30 items of attDataHex0. If, after we process attDataHex1, some attDataHex${n} has more than 32 items, we process it next; the delay is worth it because we batch more.

> The queue must have some mechanism to pick up groups of attestation datas that do not reach the minimum count.

@dapplion what do you expect, or what would you like to improve, in the above example? I can give it a try on a real node.
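
The pick order in the example above can be reproduced with a short simulation. `pickOrder` is a hypothetical helper, not part of the PR. It assumes that no new items arrive while the queue is drained and that maxChunkSize is large enough for each key to be emptied in a single pick.

```typescript
// Hypothetical helper: order in which keys are picked, given item counts per key in arrival order.
function pickOrder(counts: Array<[string, number]>, minChunkSize: number): string[] {
  const remaining = new Map<string, number>(counts);
  const order: string[] = [];
  while (remaining.size > 0) {
    const keys = [...remaining.keys()];
    // Keys holding at least minChunkSize items, in insertion order
    const qualifying = keys.filter((k) => (remaining.get(k) ?? 0) >= minChunkSize);
    // Last qualifying key if there is one, otherwise the last key overall
    const candidates = qualifying.length > 0 ? qualifying : keys;
    const key = candidates[candidates.length - 1];
    order.push(key);
    remaining.delete(key);
  }
  return order;
}

const order = pickOrder(
  [
    ["attDataHex0", 30],
    ["attDataHex1", 35],
    ["attDataHex2", 1],
    ["attDataHex3", 2],
    ["attDataHex4", 2],
  ],
  32
);
// order: attDataHex1, attDataHex4, attDataHex3, attDataHex2, attDataHex0
```

In this static scenario every group is eventually picked, including those below minChunkSize; under sustained load, keys that keep reaching minChunkSize are served first, so smaller groups wait longer.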

dapplion · 2023-08-13T09:44:25Z

Let's go with this and have histogram metrics to measure delay in the queue since first item

wemeetagain · 2023-08-29T16:44:45Z

🎉 This PR is included in v1.11.0 🎉

twoeths added 4 commits July 25, 2023 15:41

feat: implement IndexedGossipQueue, LinearGossipQueue

2ed33cd

feat: add deleteFirst() and deleteLast() to LinkedList

f6b7257

feat: use LinkedList inside OrderedSet

ebda1bb

chore: fix lint in beacon-node

a6e910a

twoeths marked this pull request as ready for review July 26, 2023 07:03

twoeths requested a review from a team as a code owner July 26, 2023 07:03

dapplion reviewed Jul 26, 2023

dapplion reviewed Jul 26, 2023

twoeths added 2 commits July 27, 2023 10:53

chore: dedup runsFactor

ed65f64

feat: implement IndexedGossipQueueAvgTime

b0952ed

dapplion approved these changes Aug 13, 2023

twoeths merged commit 5edda2b into unstable Aug 14, 2023
11 checks passed

twoeths deleted the tuyen/indexed_gossip_queue branch August 14, 2023 01:09

twoeths mentioned this pull request Aug 20, 2023

feat: verify gossip attestation messages in batch #5896

Merged

twoeths mentioned this pull request Oct 4, 2024

chore: remove IndexedGossipQueueAvgTime #7125

Merged
