gh-122881: Reduce asyncio heapq scheduling overhead #122882

Closed
wants to merge 14 commits

Conversation

@bdraco bdraco (Contributor) commented Aug 10, 2024

Wrap the `TimerHandle` in a tuple with the `when` value at the front to avoid having to call `TimerHandle.__lt__` for heapq operations.

Wrap the TimerHandle in tuples with the when value at the front
to avoid having to call `TimerHandle.__lt__` when working with
the `heapq`
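The idea can be sketched as follows. This is a simplified illustration, not the PR's actual code: `FakeTimerHandle` is a hypothetical stand-in for `asyncio.TimerHandle`, and the monotonic counter used to break ties between equal `when` values is an assumption of this sketch.

```python
import heapq
import itertools

# Assumption of this sketch: a counter breaks ties between equal deadlines,
# so tuple comparison never falls through to comparing the handles.
_tie_breaker = itertools.count()

class FakeTimerHandle:
    # Hypothetical minimal handle; the real asyncio.TimerHandle carries more
    # state, and its __lt__ is what gets called when bare handles sit in the heap.
    def __init__(self, when, callback):
        self.when = when
        self.callback = callback

    def __lt__(self, other):
        return self.when < other.when

def schedule(heap, handle):
    # With the float `when` first, heapq compares floats directly;
    # FakeTimerHandle.__lt__ is never invoked.
    heapq.heappush(heap, (handle.when, next(_tie_breaker), handle))

def pop_next(heap):
    _when, _tie, handle = heapq.heappop(heap)
    return handle

heap = []
for when in (3.0, 1.0, 2.0):
    schedule(heap, FakeTimerHandle(when, callback=None))

order = [pop_next(heap).when for _ in range(len(heap))]
# handles come back in deadline order: [1.0, 2.0, 3.0]
```

Comparing two small tuples headed by floats stays entirely in C, whereas comparing two handle objects dispatches into a Python-level `__lt__` on every heap sift step.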
@bdraco bdraco marked this pull request as ready for review August 10, 2024 14:09
…_n6W-.rst

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
@picnixz picnixz (Contributor) left a comment:

Some nitpicks.

@picnixz picnixz self-requested a review August 12, 2024 09:20
bdraco and others added 2 commits August 12, 2024 07:33
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
@picnixz picnixz (Contributor) left a comment:

If you have numbers, you could also add them to the NEWS entry, saying that you improved performance by a factor of X for instance; otherwise I'm good (still requires a core dev for the final acceptance).

@bdraco bdraco (Contributor, Author) commented Aug 12, 2024

> If you have numbers, you could also add them to the NEWS entry, saying that you improved performance by a factor of X for instance; otherwise I'm good (still requires a core dev for the final acceptance).

It works out to:

  • ~10.2% speed-up in `_run_once` on my production Home Assistant instance, but that workload is already well optimized to reduce scheduling as much as possible, so I'd expect other use cases to see more substantial speed-ups
  • ~9.5% speed-up scheduling and running timer handles

@bdraco bdraco (Contributor, Author) commented Aug 12, 2024

Added the number for the `TimerHandle` improvement. `_run_once` is going to vary greatly based on the use case, since some use cases will schedule far more than others. My production use case previously saw an 18% speed-up in `_run_once`, before aiohttp was optimized to remove a lot of unneeded `TimerHandle` creation.

@bdraco bdraco (Contributor, Author) commented Aug 12, 2024

Thanks!

@itamaro itamaro (Contributor) commented Aug 12, 2024

I have a few questions

  1. could you please split out all the attribute caching optimizations to a separate PR (and benchmark them separately)? it's not clear that they improve things in a measurable way, and it hurts readability.
  2. speaking of benchmarking - the microbenchmarks look good, but is there a measurable impact on the asyncio pyperformance benchmarks?
  3. what is the memory impact? iiuc, this would use strictly more memory (extra tuple and when per heapq item), but I have no idea whether the increase is meaningful or negligible.

@bdraco bdraco (Contributor, Author) commented Aug 12, 2024

> I have a few questions
>
> 1. could you please split out all the attribute caching optimizations to a separate PR (and benchmark them separately)? it's not clear that they improve things in a measurable way, and it hurts readability.

This has already been discussed above. Is this strictly required?

> 2. speaking of benchmarking - the microbenchmarks look good, but is there a measurable impact on the asyncio pyperformance benchmarks?

I'm not sure pyperformance would be a better benchmark here, as I think the real-world use case impact is more important.

> 3. what is the memory impact? iiuc, this would use strictly more memory (extra tuple and when per heapq item), but I have no idea whether the increase is meaningful or negligible.

```python
>>> (1.12, object()).__sizeof__()
40
```

The memory impact should be negligible compared to the cost of `TimerHandle`.

@bdraco bdraco (Contributor, Author) commented Aug 24, 2024

> In recent versions of Python, attribute caching of bound methods usually hurts performance. So yes, let's remove it, unless we can demonstrate meaningful improvement in a benchmark.

@hauntsaninja I pushed the commit I had originally held off on (a5b6647), but then I realized you deleted your comment.

@hauntsaninja (Contributor) commented:

Ah, I deleted my comment because only the `ready_popleft` change was a pessimisation and I didn't really want to get into it :-) That said, I do like having `self` if it's cheap; I find it easier to reason about state. Thanks for the change!

@bdraco bdraco (Contributor, Author) commented Aug 24, 2024

> Ah, I deleted my comment because only the `ready_popleft` change was a pessimisation and I didn't really want to get into it :-) That said, I do like having `self` if it's cheap; I find it easier to reason about state. Thanks for the change!

The `self` changes made a tiny difference compared to avoiding the `__lt__` overhead, so it's not worth getting hung up on anyway.

@hauntsaninja hauntsaninja (Contributor) left a comment:

Thanks again!

@mdboom mdboom (Contributor) commented Aug 29, 2024

I ran this PR against the pyperformance suite on our benchmarking hardware. The results overall are basically "no change", within the noise. The vast majority of the benchmarks there don't use async, however, we do have the async benchmarks broken out separately, and even there the results are kind of inconclusive. The async benchmarks in pyperformance are actually known to have a great deal of inherent variability, so, honestly it's hard to conclude too much from them.

Results on async benchmarks only:

| Platform | Speedup |
| --- | --- |
| Linux aarch64 | 1% slower |
| Linux x86_64 | 0% slower |
| Windows x86_64 | 1% faster |
| Windows x86 | 2% faster |
| macOS arm | 1% slower |

From these results, I'd say there's no obvious win or obvious red flag to merging this. The microbenchmark in #122881 shows a significant improvement, and if the additional code complexity is acceptable to others, this is probably fine to merge.

@bdraco bdraco (Contributor, Author) commented Aug 29, 2024

I took a look at https://github.com/python/pyperformance/tree/main/pyperformance and I don't see any asyncio benchmarks that would generate a significant number of `TimerHandle`s, so I wouldn't expect to see any significant change in those benchmarks.

@bdraco bdraco (Contributor, Author) commented Aug 29, 2024

For context, I've been working on addressing some performance complaints from aiohttp users. What finally pushed me to submit this upstream was how much more time is spent in `aiohttp.web_ws.WebSocketResponse.receive()` and `aiohttp.client.ClientSession.get()` with a timeout.

@1st1 1st1 (Member) left a comment:

In my opinion these sorts of micro-optimizations aren't worth it, as they mostly just harm readability. The effect of this optimization will be barely detectable. If you want performance, use uvloop.

-1.

@willingc willingc (Contributor) commented:

> For context, I've been working on addressing some performance complaints from aiohttp users. What finally pushed me to submit this upstream was how much more time is spent in `aiohttp.web_ws.WebSocketResponse.receive()` and `aiohttp.client.ClientSession.get()` with a timeout.

Can you link to the issues re: aiohttp?

@willingc willingc (Contributor) left a comment:

I would like to see the aiohttp performance issues before I weigh in on the need for this change.

@Dreamsorcerer Dreamsorcerer (Contributor) commented Sep 10, 2024

> I would like to see the aiohttp performance issues before I weigh in on the need for this change.

bdraco was analysing performance issues, but some of the discussion around that is in aio-libs/aiohttp#8608 and the thread: aio-libs/aiohttp#8608 (comment)

I think one concern here is that `.reschedule()` is a public, high-level API: https://docs.python.org/3/library/asyncio-task.html#asyncio.Timeout.reschedule
Each call can trigger a heapq reschedule, so a user who calls it heavily could see a significant performance impact.

@willingc willingc (Contributor) commented:

> I think one concern here is that `.reschedule()` is a public, high-level API: https://docs.python.org/3/library/asyncio-task.html#asyncio.Timeout.reschedule
> Each call can trigger a heapq reschedule, so a user who calls it heavily could see a significant performance impact.

Thanks @Dreamsorcerer for the additional information.

I'm at -0 on this change at present. Personally, I would need to see more evidence of real-world impact to support this optimization. I will remove the "Do Not Merge" label in case another core team member feels strongly about merging this.

Thanks @bdraco for the PR too and the detailed explanations.

@vstinner vstinner (Member) commented Sep 18, 2024

The #122881 (comment) microbenchmark doesn't exercise the modified code: it doesn't use the asyncio event loop :-( It's a benchmark of `heapq.heappush()` and `heapq.heappop()`.

Can you write a benchmark using the asyncio event loop? For example, schedule 1000 callbacks with `call_at()`.
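Such a benchmark might look like this rough sketch (the callback count `N` and the near-immediate staggered deadlines are arbitrary choices for illustration): schedule many callbacks with `loop.call_at()` and time how long the loop takes to pop them all off the timer heap and run them.

```python
import asyncio
import time

N = 1000  # number of timer callbacks to schedule (arbitrary)

async def bench() -> float:
    loop = asyncio.get_running_loop()
    now = loop.time()
    done = asyncio.Event()
    remaining = N

    def callback() -> None:
        nonlocal remaining
        remaining -= 1
        if remaining == 0:
            done.set()

    start = time.perf_counter()
    for i in range(N):
        # Staggered absolute deadlines, all effectively immediate,
        # so every callback goes through the heapq scheduling path.
        loop.call_at(now + i / 1_000_000, callback)
    await done.wait()
    return time.perf_counter() - start

elapsed = asyncio.run(bench())
print(f"scheduled and ran {N} timer callbacks in {elapsed:.4f}s")
```

For stable numbers one would run this under pyperf with repetitions rather than a single wall-clock measurement.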

@bdraco bdraco (Contributor, Author) commented Sep 18, 2024

> The #122881 (comment) microbenchmark doesn't exercise the modified code: it doesn't use the asyncio event loop :-( It's a benchmark of `heapq.heappush()` and `heapq.heappop()`.
>
> Can you write a benchmark using the asyncio event loop? For example, schedule 1000 callbacks with `call_at()`.

#122881 (comment)

Is this what you are looking for?

@itamaro itamaro (Contributor) commented Sep 19, 2024

As a benchmarking data point: I applied a minimally modified version of this PR (ported to Cinder 3.10) to Instagram Server, and there was no measurable perf impact.

@1st1 1st1 (Member) commented Sep 23, 2024

> As a benchmarking data point: I applied a minimally modified version of this PR (ported to Cinder 3.10) to Instagram Server, and there was no measurable perf impact.

Yeah, basically what I expected. I think let's close it, unless the author comes up with a more significant and provable optimization.

@1st1 1st1 closed this Sep 23, 2024
@bdraco bdraco (Contributor, Author) commented Sep 23, 2024

I did link to a benchmark with `call_at` above: #122881 (comment)

If the Instagram server isn't doing a lot of `call_at` calls, I wouldn't expect any performance difference.
