solBanknStgTx00 and solBanknStgTx01 threads peg the CPU #636

Open
bji opened this issue Apr 7, 2024 · 6 comments

bji commented Apr 7, 2024

While investigating performance issues during high packet ingestion rates, I noticed that on every validator I have access to (two on mainnet and one on testnet; versions 1.17.28, 1.17.28, and 1.18.9 respectively), the threads solBanknStgTx00 and solBanknStgTx01 each appear to use very nearly 100% of a CPU core continuously. Here's a simple process listing:

$ ps -eL | grep BanknS
2363290 2364914 ?        11-23:49:41 solBanknStgTx00
2363290 2364915 ?        11-22:12:47 solBanknStgTx01
2363290 2364916 ?        07:15:09 solBanknStgTx02
2363290 2364917 ?        07:26:36 solBanknStgTx03
2363290 2364918 ?        07:15:41 solBanknStgTx04
2363290 2364919 ?        07:25:15 solBanknStgTx05

By my calculation, the other banking threads use approximately 2.5% of the CPU time of those first two threads.
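(Rough arithmetic from the listing above: solBanknStgTx00 has accumulated 11-23:49:41, roughly 288 hours of CPU time, while solBanknStgTx02 has accumulated 07:15:09, roughly 7.25 hours; 7.25 / 288 is about 2.5%.)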

solBanknStgTx00 is the gossip vote ingestion thread, and solBanknStgTx01 is the TPU vote ingestion thread. It is not entirely clear what causes this high CPU usage, although code analysis suggests that nothing prevents these threads from looping continuously as long as there are unprocessed packets in their packet storage queues.

In addition, on reading the code, it looks to me like any packet that arrives in the 20 slots leading up to the validator's leader slots will invoke the ForwardAndHold logic, and at that point I cannot see any way for the packet to leave the queue until the Consume phase is reached up to 20 slots later. In the meantime, the TPU vote thread appears to continuously re-evaluate the packets it has Forwarded and Held, repeating the forwarding process over and over. I cannot see how this avoids a spew of duplicate forwards out of the upcoming leader for every packet received in the 20 slots leading up to its leader slots.

I do not fully understand the code and may be wrong, but the high CPU usage of these threads, combined with the observations above, suggests that some deeper analysis of the code involved is warranted.
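To make the suspected control flow concrete, here is a minimal Rust sketch of the loop shape described above. The names and structure are illustrative assumptions, not the actual banking stage code; the point is only that nothing in this shape forces the thread to block while held packets remain buffered.

// Illustrative sketch only; names and structure are assumptions.
use std::time::Duration;

enum Decision {
    Consume,        // we are the leader: process the buffered packets
    Forward,        // far from our leader slots: forward and drop
    ForwardAndHold, // close to our leader slots: forward but keep buffered
    Hold,           // keep buffered, do nothing else
}

struct VoteThread {
    buffered_packets: Vec<Vec<u8>>, // stand-in for buffered vote packets
}

impl VoteThread {
    fn run(&mut self) {
        loop {
            // Receive with a short timeout, then re-evaluate the whole buffer.
            // If the buffer is non-empty, nothing below blocks, so the loop
            // spins and the thread sits near 100% CPU.
            self.receive_packets(Duration::from_millis(10));

            match self.make_decision() {
                Decision::Consume => self.consume_buffered_packets(),
                Decision::Forward => {
                    self.forward_packets();
                    self.buffered_packets.clear();
                }
                Decision::ForwardAndHold => {
                    // Held packets stay buffered until Consume is reached, so
                    // they are re-walked on every iteration in the meantime.
                    self.forward_packets();
                }
                Decision::Hold => {}
            }
        }
    }

    fn receive_packets(&mut self, _timeout: Duration) { /* elided */ }
    fn make_decision(&self) -> Decision { Decision::Hold }
    fn consume_buffered_packets(&mut self) { /* elided */ }
    fn forward_packets(&mut self) { /* elided */ }
}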


bji commented Apr 7, 2024

I see now where the re-forwarding of packets is prevented. In LatestUnprocessedVotes::get_and_insert_forwardable_packets, only votes for which is_forwarded() returns false are included.

However, the flow of the loop is definitely wasteful; it does a lot of work re-evaluating already-forwarded packets in a tight loop. I do not know to what degree this contributes to the high CPU usage.
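For illustration, here is a stripped-down sketch of the filtering described above, assuming a simple Vec-backed container; the real LatestUnprocessedVotes is more elaborate, and only the is_forwarded() check mirrors the behavior being referenced.

// Minimal sketch of "only unforwarded votes are handed out"; names other
// than is_forwarded() are illustrative assumptions.
struct VotePacket {
    data: Vec<u8>,
    forwarded: bool,
}

impl VotePacket {
    fn is_forwarded(&self) -> bool {
        self.forwarded
    }
}

struct LatestUnprocessedVotesSketch {
    votes: Vec<VotePacket>,
}

impl LatestUnprocessedVotesSketch {
    // Hand out only votes that have not been forwarded yet, and mark them as
    // forwarded so the next pass skips them. This prevents duplicate forwards,
    // but every pass still walks the full container.
    fn get_and_insert_forwardable_packets(&mut self) -> Vec<Vec<u8>> {
        self.votes
            .iter_mut()
            .filter(|vote| !vote.is_forwarded())
            .map(|vote| {
                vote.forwarded = true;
                vote.data.clone()
            })
            .collect()
    }
}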


apfitzge commented Apr 8, 2024

Yeah, looking at the vote threads, they do seem wasteful in their usage. I think we could probably make a few simple changes here to reduce CPU usage in the short term:

  1. The gossip thread never does any actual processing work; it only receives/buffers packets. However, its receive timeout is still chosen the same way as for the other threads, in PacketReceiver::get_receive_timeout. This should be adjusted so the gossip thread sleeps for a much longer period.
  2. The TpuVote thread could keep track of the unforwarded packet count and only loop/process again if there are actually unforwarded packets. If there are none, it should also try receiving for a longer period of time.

Long-term, I think we may be able to consolidate the two threads into a single thread, and also remove the holding period before gossip votes are released to the thread.
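A rough sketch of what points 1 and 2 above could look like, assuming a get_receive_timeout-style helper; the signature, the timeout values, and the VoteSource enum here are placeholders for illustration, not a proposed patch.

// Illustrative sketch of the short-term ideas above; names and values are
// assumptions, not the actual implementation.
use std::time::Duration;

enum VoteSource {
    Gossip,
    Tpu,
}

fn get_receive_timeout(source: &VoteSource, unforwarded_packet_count: usize) -> Duration {
    match source {
        // Idea 1: the gossip vote thread only receives/buffers, so it has no
        // reason to wake up quickly just because packets sit in its buffer;
        // let it block on the receiver for longer.
        VoteSource::Gossip => Duration::from_millis(100),
        // Idea 2: the TPU vote thread only needs a short timeout when there
        // are actually unforwarded packets pending; otherwise it can also
        // wait longer instead of spinning.
        VoteSource::Tpu => {
            if unforwarded_packet_count > 0 {
                Duration::from_millis(0)
            } else {
                Duration::from_millis(100)
            }
        }
    }
}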


steviez commented Apr 9, 2024

FWIW - I think this issue is tied to the old scheduler (default in v1.17 and v1.18); the new scheduler (default in master) looks to have much lower usage when the node is not near its leader slots.

The above statement is incorrect - see #636 (comment)

apfitzge commented

New scheduler doesn't change the vote threads though, so I'd not expect new scheduler to have any effect here.


steviez commented Apr 10, 2024

New scheduler doesn't change the vote threads though, so I'd not expect new scheduler to have any effect here.

I went back and revisited the data that led me to make this claim, and it looks like the time window I sampled was "abnormal" in that it had lower utilization on these threads. The output below shows thread name (comm) and cumulative scheduled time divided by cumulative process time (pcpu).

Tip-of-master with the default (new) scheduler:

$ ps -p 3753527 -T -o comm,pcpu | grep BanknStg
solBanknStgTx00 99.6
solBanknStgTx01 99.3

Tip-of-master with old scheduler (--block-production-method thread-local-multi-iterator):

$ ps -p 953856 -T -o comm,pcpu | grep Bankn
solBanknStgTx00 99.6
solBanknStgTx01 99.4
solBanknStgTx02  0.4
solBanknStgTx03  0.4
solBanknStgTx04  0.4
solBanknStgTx05  0.5

So yeah, my mistake; agreed with you there.


apfitzge commented Apr 11, 2024

#743 is related to this; it kept skipping on a length check.
