benches: move sender to a spawned task in `watch` benchmark #6034

satakuma · 2023-09-26T13:46:17Z

Motivation

The PR #5464 introduced a new bucket notifier to reduce contention in the watch channel. That change was motivated by the benchmark added in #5472. However, this benchmark does not present favorable results for the newer notifier if we compare it with the older solution.

Results with the new notifier:
contention_resubscribe time: [81.507 ms 82.098 ms 82.740 ms]
Results without it:
contention_resubscribe time: [38.254 ms 38.938 ms 39.618 ms]

This is most likely due to sender task not running on a worker thread.

Solution

Change the existing watch channel benchmark by moving the benchmarked loop to a spawned task, which will be executed by one of the worker threads. With this change the results are more comparable:

new notifier:

contention_resubscribe/10    time:   [930.74 µs 937.37 µs 942.98 µs]
contention_resubscribe/100   time:   [3.8498 ms 3.8565 ms 3.8634 ms]
contention_resubscribe/500   time:   [16.106 ms 16.111 ms 16.116 ms]
contention_resubscribe/1000  time:   [33.519 ms 33.881 ms 34.250 ms]

old notifier:

contention_resubscribe/10    time:   [901.93 µs 914.50 µs 925.82 µs]
contention_resubscribe/100   time:   [4.0990 ms 4.1044 ms 4.1106 ms]
contention_resubscribe/500   time:   [16.800 ms 16.810 ms 16.823 ms]
contention_resubscribe/1000  time:   [32.479 ms 32.497 ms 32.516 ms]

Note that since the addition of the new notifier, some optimizations (#5503) have been made to the underlying Notify primitive. Those also reduced lock contention so the new bucket solution no longer presents as significant improvements (or any improvement at all) in this benchmark as it might at the time of its addition. I still think it's worth keeping around as it improves performance with more worker threads. Here are the results using 24 worker threads (all of the above results were obtained using 6):

new notifier:

contention_resubscribe/10    time:   [881.98 µs 893.00 µs 902.81 µs]
contention_resubscribe/100   time:   [3.6447 ms 3.6536 ms 3.6621 ms]
contention_resubscribe/500   time:   [21.162 ms 21.317 ms 21.486 ms]
contention_resubscribe/1000  time:   [43.518 ms 43.927 ms 44.346 ms]

old notifier:

contention_resubscribe/10    time:   [930.67 µs 932.05 µs 933.49 µs]
contention_resubscribe/100   time:   [4.5549 ms 4.7225 ms 4.8852 ms]
contention_resubscribe/500   time:   [41.734 ms 41.789 ms 41.845 ms]
contention_resubscribe/1000  time:   [86.575 ms 86.634 ms 86.694 ms]

For benchmarking, I used a 12-core x86_64 machine running Linux 6.5.4.

Darksonn

Interesting observation. Thanks. What do you think we should do about the less-contention change?

Darksonn · 2023-09-28T09:28:33Z

After all, there's also a disadvantage to having it, namely that it makes the allocation a lot larger.

satakuma · 2023-10-01T21:20:21Z

I ran the original benchmark from the thread in #5403 (tijsvd/tokio-watch-benchmark) and the results don't seem to favor BigNotify either, although they vary quite a bit from run to run.

new notifier (BigNotify)

TOKIO WATCH
running single thread with spawned sender
start
avg=515.938µs  <490.603µs [ 492.752µs 495.533µs 545.552µs ] 582.422µs>
running multi thread with spawned sender
start
avg=485.695µs  <296.226µs [ 444.833µs 525.262µs 544.252µs ] 562.281µs>
CUSTOM WATCH
running multi thread with spawned sender
start
avg=236.832µs  <203.757µs [ 225.777µs 238.187µs 249.106µs ] 264.636µs>

old notifier (just Notify)

TOKIO WATCH
running single thread with spawned sender
start
avg=482.403µs  <461.733µs [ 462.903µs 464.243µs 509.432µs ] 513.813µs>
running multi thread with spawned sender
start
avg=383.811µs  <347.524µs [ 368.795µs 383.414µs 397.754µs ] 419.423µs>
CUSTOM WATCH
running multi thread with spawned sender
start
avg=218.534µs  <198.347µs [ 210.887µs 218.927µs 226.486µs ] 237.147µs>

The author of this benchmark mentioned that he used an 8-vcore machine, so I used 8 worker threads as well.

I think that if I can't observe any benefit from BigNotify using the benchmarking method which led to its addition and BigNotify makes the allocation larger, then it might be better to simply remove it. Alternatively, we can think about a heuristic restricting BigNotify to only runtimes with, for example, more than 12 worker threads, although I'm not sure if that's possible. Anyway, the custom implementation from the benchmark still runs much faster on my machine, so I think there is room for improvement when it comes to reducing contention.

Darksonn · 2023-10-01T21:42:38Z

If you think it doesn't make a difference, then go ahead and undo the BigNotify change.

satakuma · 2023-10-27T14:49:26Z

Sorry for the late response. I ran the benchmarks mentioned above on other machines with significantly weaker CPUs and BigNotify was actually more than 2x faster there, so I am no longer inclined to remove it. I guess its performance depends heavily on the hardware. Unfortunately, I don't have time right now to investigate it further. Let's leave it as is for now.

Darksonn · 2023-10-27T16:30:30Z

Sounds good to me.

benches: move sender to a spawned task in watch benchmark

67519a8

Darksonn added A-tokio Area: The main tokio crate M-sync Module: tokio/sync A-benches Area: Benchmarks labels Sep 26, 2023

Darksonn approved these changes Sep 28, 2023

View reviewed changes

Darksonn merged commit ca89c5b into tokio-rs:master Sep 28, 2023
73 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

benches: move sender to a spawned task in `watch` benchmark #6034

benches: move sender to a spawned task in `watch` benchmark #6034

satakuma commented Sep 26, 2023

Darksonn left a comment

Darksonn commented Sep 28, 2023

satakuma commented Oct 1, 2023

Darksonn commented Oct 1, 2023

satakuma commented Oct 27, 2023

Darksonn commented Oct 27, 2023

benches: move sender to a spawned task in watch benchmark #6034

benches: move sender to a spawned task in watch benchmark #6034

Conversation

satakuma commented Sep 26, 2023

Motivation

Solution

Darksonn left a comment

Choose a reason for hiding this comment

Darksonn commented Sep 28, 2023

satakuma commented Oct 1, 2023

Darksonn commented Oct 1, 2023

satakuma commented Oct 27, 2023

Darksonn commented Oct 27, 2023

benches: move sender to a spawned task in `watch` benchmark #6034

benches: move sender to a spawned task in `watch` benchmark #6034