Benchmarking end-to-end transaction throughput performance
It would be nice for Substrate to have a setup/demonstration/documentation for E2E benchmarks.
There are plenty of benchmarks in the codebase and a runtime benchmarking framework,
but we were not able to find E2E benchmarks demonstrating the throughput of the network.
Furthermore, our benchmarks show peak throughput at around 800 transactions per second,
which is a bit lower than the claimed 1000 TPS.
We would like to know where the discrepancy comes from, how to achieve higher throughput,
and how to analyze Substrate's performance.
Setup
Since we haven't been able to find an existing setup for E2E benchmarks, we've implemented the following:
- 4x AWS t3.xlarge* instances, each running a Substrate node in the same network
- 1x custom client node that submits transactions over HTTP RPC, distributing them evenly between the nodes (a sketch of its submission loop follows after this list)
- 1x Prometheus server collecting stats from the Substrate nodes
* t3.xlarge instances were used because they supposedly meet the server requirements
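For concreteness, here is a minimal sketch of the client's submission loop. The node addresses, target rate, and `load_presigned_extrinsics` helper are placeholders for illustration; the real client pre-signs transfer extrinsics offline (with distinct nonces) and uses many concurrent connections rather than one blocking loop:

```rust
// Assumed dependencies: reqwest = { version = "0.11", features = ["blocking", "json"] },
// serde_json = "1".
use std::time::{Duration, Instant};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // HTTP RPC endpoints of the four validator nodes (hypothetical addresses).
    let nodes = [
        "http://10.0.0.1:9933",
        "http://10.0.0.2:9933",
        "http://10.0.0.3:9933",
        "http://10.0.0.4:9933",
    ];
    let target_tps = 800.0;
    let interval = Duration::from_secs_f64(1.0 / target_tps);
    let client = reqwest::blocking::Client::new();

    for (i, xt) in load_presigned_extrinsics().iter().enumerate() {
        let started = Instant::now();
        // Round-robin across nodes so no single RPC endpoint becomes the bottleneck.
        let url = nodes[i % nodes.len()];
        let body = serde_json::json!({
            "jsonrpc": "2.0",
            "id": i,
            "method": "author_submitExtrinsic", // standard Substrate RPC method
            "params": [xt],
        });
        let resp: serde_json::Value = client.post(url).json(&body).send()?.json()?;
        if resp.get("error").is_some() {
            eprintln!("node {url} rejected extrinsic {i}: {resp}");
        }
        // Crude pacing toward the target rate; a single blocking loop cannot
        // actually sustain 800 TPS, hence the concurrent connections in practice.
        if let Some(rest) = interval.checked_sub(started.elapsed()) {
            std::thread::sleep(rest);
        }
    }
    Ok(())
}

// Placeholder: in the real client these are SCALE-encoded, pre-signed transfers.
fn load_presigned_extrinsics() -> Vec<String> {
    vec!["0x...".to_string()] // hypothetical hex-encoded extrinsic
}
```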
Possible limits
Substrate has built-in limits that, we assume, ensure smooth operation of the nodes.
However, to find the network's actual throughput ceiling, we would like to lift those limits.
We have found the following limits:
- Maximum block weight
- Maximum block byte length
- Transaction pool queue limit
They were increased with this change to substrate-node-template; a sketch of where the first two limits live in the runtime follows.
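For reference, the block weight and block length limits are runtime constants declared in the node-template's `runtime/src/lib.rs`. The sketch below follows the node-template of the time; names and exact values are illustrative, not a verbatim copy of our change:

```rust
// runtime/src/lib.rs (fragment): the two block-level limits we raised.
use frame_support::{parameter_types, weights::constants::WEIGHT_PER_SECOND};
use frame_system::limits::{BlockLength, BlockWeights};
use sp_runtime::Perbill;

/// Share of block resources reserved for normal (non-operational) extrinsics.
const NORMAL_DISPATCH_RATIO: Perbill = Perbill::from_percent(75);

parameter_types! {
    // Maximum block weight: roughly two seconds of reference-hardware compute.
    pub RuntimeBlockWeights: BlockWeights =
        BlockWeights::with_sensible_defaults(2 * WEIGHT_PER_SECOND, NORMAL_DISPATCH_RATIO);
    // Maximum block byte length: 5 MiB, split by the same normal/operational ratio.
    pub RuntimeBlockLength: BlockLength =
        BlockLength::max_with_normal_ratio(5 * 1024 * 1024, NORMAL_DISPATCH_RATIO);
}
```

The transaction pool queue limit, by contrast, is a node-side setting rather than a runtime constant; it is raised via the `--pool-limit` and `--pool-kbytes` CLI flags.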
Observations
- The peak transaction throughput was measured at around 800 TPS.
- Increasing the client-side TPS above that point decreased the number of transactions per block, meaning more transactions were dropped as the load increased.
- CPU utilization during this benchmark was ~50% on the Substrate nodes and ~10% on the client node, so there is still compute to spare.
- The client was capable of generating up to 3000 TPS when used against a 12-node network, although most of those transactions were dropped. This demonstrates that the client was not the bottleneck.
- We were not able to find where a Substrate node spends its CPU time. This is complicated by the use of async and the lack of debug info in release builds by default (see the profile sketch after this list). Any suggestions on how to collect performance info (e.g. flamegraphs) would be greatly appreciated.
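On the debug-info point, the only knob we are aware of (an assumption on our side, not something we have verified against Substrate's build setup) is re-enabling symbols in the release profile of the workspace `Cargo.toml`, so that `perf`-style flamegraphs can resolve frames in an optimized binary:

```toml
# Workspace Cargo.toml: keep release optimizations, but emit debug info
# so profilers such as perf can resolve symbols in the node binary.
[profile.release]
debug = true
```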
We would like to know whether we are missing anything in this approach, whether these results are reasonable, and whether you have any suggestions on how to evaluate end-to-end Substrate performance.