gh-132380: Use unicode hash/compare for the mcache. #133669

Open · wants to merge 5 commits into main

Conversation

@nascheme (Member) commented May 8, 2025

This allows the type lookup cache to work with non-interned strings.

pyperformance results

This allows the cache to work with non-interned strings.
@picnixz (Member) commented May 8, 2025

(just adding the skip news label so that you don't get pinged by the bot every time you push, saying "failed checks")

nascheme added 3 commits May 8, 2025 11:23
Ensure we don't read the cache when the 'name' argument is a non-exact
unicode string.  A str subclass can have an overridden `__eq__` method,
and using `_PyUnicode_Equal()` in that case would be wrong.
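For illustration, a minimal sketch of the kind of guard that commit describes (not the PR's actual diff; it assumes a fall-back helper along the lines of `find_name_in_mro` in Objects/typeobject.c):

```c
/* Illustrative sketch, not the PR's actual diff: only exact unicode keys
 * may consult the cache.  A str subclass can override __eq__ and __hash__,
 * so _PyUnicode_Equal() would not be a valid comparison for it; skip the
 * cache and do the normal (uncached) MRO lookup instead. */
if (!PyUnicode_CheckExact(name)) {
    return find_name_in_mro(type, name, &error);
}
```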
@nascheme nascheme removed the skip news label May 8, 2025
@ngoldbaum (Contributor) commented May 8, 2025

I cherry-picked this PR onto the 3.14 branch and built CPython and LibCST. I had to combine Instagram/LibCST#1324 and Instagram/LibCST#1295 and also back out the Python-level caching I added in Instagram/LibCST#1295, since that's unnecessary with this PR.

I see substantially improved multithreaded scaling, although there's still some contention. Looking at the profiles, it seems like the contention is coming from GC pauses?

Here are the profiles I recorded on 3.14b1 and on the 3.14 branch with this PR applied, respectively:

https://share.firefox.dev/43bHvhH

https://share.firefox.dev/3GM0Xdu

Here's the profile using multiprocessing:

https://share.firefox.dev/42QR32I

@ngoldbaum (Contributor):

(this is on a Mac, so I can't easily get Python-level profiles and line numbers in LibCST's Python code)

@nascheme (Member, Author) commented May 8, 2025

Thanks for the testing. I tested LibCST on Linux and also see a performance improvement. Running the following command in the numpy source folder:

python3 -m libcst.tool codemod --no-format strip_strings_from_types.StripStringsCommand numpy/_core

I get the elapsed run times:

base 3.13, getattr(): 30.02 sec
base 3.13, type_lookup(): 20.94 sec
3.13 + this PR, getattr(): 18.48 sec

@nascheme nascheme marked this pull request as ready for review May 9, 2025 19:11
@nascheme nascheme requested a review from markshannon as a code owner May 9, 2025 19:11
@nascheme nascheme requested a review from colesbury May 9, 2025 19:12
@nascheme (Member, Author) commented May 9, 2025

@colesbury This uses the unicode string hash/compare instead of the string pointer value. Unlike your suggestion, it doesn't use a separate lookup loop for the non-interned case; it just always uses the string hash/compare. The pyperformance results show a small slowdown (~0.4%); I was expecting worse, since the non-interned case is so uncommon.

I can try a separate loop if you think that's worth pursuing. The advantage of this approach is that it's fairly simple code-wise, and I think it would be a candidate to backport to 3.13 and 3.14. Perhaps for 3.15 we should try a per-thread cache.
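For context, a simplified sketch of the lookup being described, in the style of Objects/typeobject.c (illustrative only; `lookup_mcache`, `type_cache_entry`, `MCACHE_HASH`, and `cache` are stand-ins for the real internals, not the PR's exact code):

```c
/* Sketch: the slot index is derived from the type's version tag and the
 * string's hash (str hashes are computed once and cached in the object),
 * and the stored name is compared by value with _PyUnicode_Equal() rather
 * than by pointer identity, so an exact but non-interned str key can
 * still hit the cache. */
static PyObject *
lookup_mcache(struct type_cache *cache, PyTypeObject *type, PyObject *name)
{
    Py_hash_t hash = PyObject_Hash(name);
    unsigned int version = type->tp_version_tag;   /* invalidated when the type changes */
    struct type_cache_entry *entry = &cache->hashtable[MCACHE_HASH(version, hash)];
    if (entry->version == version && _PyUnicode_Equal(entry->name, name)) {
        return entry->value;                        /* hit */
    }
    return NULL;                                    /* miss: caller walks the MRO */
}
```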

Handle common cases early.
Copy link
Contributor

@colesbury colesbury left a comment

I don't think we should do it this way. I don't think it's worth suffering even a small performance penalty for a rare case (non-interned lookup keys), when we can support that without any performance hit.
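One way to read that suggestion (an illustrative sketch only, not code from this PR): keep the pointer-identity comparison as the fast path for interned names and fall back to a value comparison only for the rare non-interned key, so the common case pays nothing extra:

```c
/* Illustrative sketch only.  Interned names keep the cheap pointer
 * comparison; a value comparison with _PyUnicode_Equal() is attempted
 * only when the lookup key is not interned. */
if (entry->version == version) {
    if (entry->name == name) {                      /* common case: interned name */
        return entry->value;
    }
    if (!PyUnicode_CHECK_INTERNED(name) && _PyUnicode_Equal(entry->name, name)) {
        return entry->value;                        /* rare case: non-interned key */
    }
}
```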

@@ -0,0 +1,2 @@
For free-threaded build, allow non-interned strings to be cached in the type
Contributor:

I don't think this caches non-interned strings. It seems to me that it allows non-interned strings as the lookup key, but the cache still only contains interned strings.
