Description
The reported publisher confirm latency differs significantly depending on whether I run a single perf-test instance with many publishers or split those publishers across multiple perf-test instances. The total load is the same in both cases, so I see two possible explanations: either perf-test struggles to report the latency precisely as the number of publishers grows (although the total is not very high and I have plenty of CPU cores available), or the reported latency is correct but the messages are batched together despite producer-random-startup-delay being set.
Note: below I run the single instance with -prsd 15 but the smaller instances without it, to reduce the odds of this being the problem (if batching were indeed the problem, starting five instances without -prsd should create five batches).
Here's how I start a single instance:
perf-test -x 5000 -y 0 -P 1 -c 1 -u test -ad false -f persistent -s 1000 -qa x-queue-version=2 -ms -mpr -prsd 15
And here's how I start five instances:
perf-test -x 1000 -y 0 -P 1 -c 1 -u test -ad false -f persistent -s 1000 -qa x-queue-version=2 -ms -mpr --metrics-prometheus-port 8081
perf-test -x 1000 -y 0 -P 1 -c 1 -u test -ad false -f persistent -s 1000 -qa x-queue-version=2 -ms -mpr --metrics-prometheus-port 8082
perf-test -x 1000 -y 0 -P 1 -c 1 -u test -ad false -f persistent -s 1000 -qa x-queue-version=2 -ms -mpr --metrics-prometheus-port 8083
perf-test -x 1000 -y 0 -P 1 -c 1 -u test -ad false -f persistent -s 1000 -qa x-queue-version=2 -ms -mpr --metrics-prometheus-port 8084
perf-test -x 1000 -y 0 -P 1 -c 1 -u test -ad false -f persistent -s 1000 -qa x-queue-version=2 -ms -mpr --metrics-prometheus-port 8085
Sample output from a -x 1000 instance:
id: test-084423-847, time: 401.342s, sent: 1001 msg/s, confirmed: 998 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/22/26/29/30 ms
id: test-084423-847, time: 402.342s, sent: 999 msg/s, confirmed: 1001 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/40/43/45/46 ms
id: test-084423-847, time: 403.342s, sent: 1000 msg/s, confirmed: 999 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/40/42/44/45 ms
Sample output from a -x 5000 instance:
id: test-085129-774, time: 271.503s, sent: 5000 msg/s, confirmed: 5000 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/195/204/215/219 ms
id: test-085129-774, time: 272.503s, sent: 5000 msg/s, confirmed: 5000 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/215/227/240/250 ms
id: test-085129-774, time: 273.503s, sent: 5001 msg/s, confirmed: 4999 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/228/238/248/252 ms
The single instance reports roughly 5 times higher confirm latency than each of the five smaller instances.
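To put a number on the gap, here is a minimal sketch that averages the median confirm latency over perf-test output lines and prints the ratio between the two runs. The helper name median_avg is hypothetical (it is not part of perf-test), and it assumes the output format shown above, where the median is the second value after "confirm latency: ". The input is only the three-line excerpts from this report, so the result is noisy; feeding it longer logs would give a more representative figure.

```shell
#!/bin/sh
# median_avg (hypothetical helper): reads perf-test output on stdin and prints
# the average of the median confirm latency (ms) across all matching lines.
median_avg() {
  awk -F'confirm latency: ' '
    NF > 1 { split($2, q, "/"); sum += q[2]; n++ }   # q[2] is the median
    END    { if (n) printf "%.1f\n", sum / n }
  '
}

# Excerpt from one of the -x 1000 instances (copied from the output above).
small=$(median_avg <<'EOF'
id: test-084423-847, time: 401.342s, sent: 1001 msg/s, confirmed: 998 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/22/26/29/30 ms
id: test-084423-847, time: 402.342s, sent: 999 msg/s, confirmed: 1001 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/40/43/45/46 ms
id: test-084423-847, time: 403.342s, sent: 1000 msg/s, confirmed: 999 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/40/42/44/45 ms
EOF
)

# Excerpt from the single -x 5000 instance (copied from the output above).
large=$(median_avg <<'EOF'
id: test-085129-774, time: 271.503s, sent: 5000 msg/s, confirmed: 5000 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/195/204/215/219 ms
id: test-085129-774, time: 272.503s, sent: 5000 msg/s, confirmed: 5000 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/215/227/240/250 ms
id: test-085129-774, time: 273.503s, sent: 5001 msg/s, confirmed: 4999 msg/s, nacked: 0 msg/s, min/median/75th/95th/99th confirm latency: 0/228/238/248/252 ms
EOF
)

echo "avg median, -x 1000 instance: $small ms"
echo "avg median, -x 5000 instance: $large ms"
awk -v s="$small" -v l="$large" 'BEGIN { printf "ratio: %.1f\n", l / s }'
```

In practice the same function can read a saved log instead of a heredoc, for example median_avg < single-instance.log (the file name is only an illustration).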