Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-124502: Add PyUnicode_Equal() function #124504

Merged
merged 6 commits into from
Oct 7, 2024
Merged

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Sep 25, 2024

Doc/c-api/unicode.rst Outdated Show resolved Hide resolved
Doc/whatsnew/3.14.rst Outdated Show resolved Hide resolved
Lib/test/test_capi/test_unicode.py Show resolved Hide resolved
@vstinner
Copy link
Member Author

@picnixz @rruuaanng @ZeroIntensity: I addressed your comments. Please review the updated PR.

Copy link
Contributor

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final nitpick but you can ignore it (I don't know whether it's that useful).

Objects/unicodeobject.c Outdated Show resolved Hide resolved
Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! LGTM.

@rruuaanng
Copy link
Contributor

Maybe you are right, But you can consider my suggestion.

@vstinner
Copy link
Member Author

I created capi-workgroup/decisions#43 issue in the C API Working Group.

@vstinner
Copy link
Member Author

vstinner commented Sep 26, 2024

I wrote a microbenchmark comparing equal strings of 10 characters:

Results on Fedora 40 with CPU isolation:

  • (private) _PyUnicode_EQ: Mean +- std dev: 9.71 ns +- 0.04 ns
  • (private) _PyUnicode_Equal: Mean +- std dev: 8.75 ns +- 0.01 ns
  • (public) PyUnicode_Equal: Mean +- std dev: 9.14 ns +- 0.02 ns
  • (public) PyUnicode_Compare: Mean +- std dev: 11.1 ns +- 0.1 ns
  • (public) PyUnicode_RichCompare: Mean +- std dev: 12.2 ns +- 0.0 ns

Proposed public PyUnicode_Equal() is 1.2x faster than PyUnicode_Compare() (-2.0 ns) and 1.3x faster than PyUnicode_RichCompare() (-3.1 ns).

Private _PyUnicode_EQ() and private _PyUnicode_Equal() are not exactly the same implementation, it seems like private _PyUnicode_Equal() is a little bit faster (1 ns!).

@vstinner
Copy link
Member Author

vstinner commented Sep 30, 2024

Microbenchmark on comparison of inequal strings of 10 characters, only the last character is different.

  • (private) _PyUnicode_EQ: Mean +- std dev: 10.0 ns +- 0.0 ns
  • (private) _PyUnicode_Equal: Mean +- std dev: 9.18 ns +- 0.00 ns
  • (public) PyUnicode_Equal: Mean +- std dev: 10.1 ns +- 0.0 ns
  • (public) PyUnicode_Compare: Mean +- std dev: 10.9 ns +- 0.0 ns
  • (public) PyUnicode_RichCompare: Mean +- std dev: 13.1 ns +- 0.0 ns

@vstinner vstinner enabled auto-merge (squash) October 7, 2024 21:07
@vstinner
Copy link
Member Author

vstinner commented Oct 7, 2024

I created capi-workgroup/decisions#43 issue in the C API Working Group.

The C API Working Group approved the API.

@vstinner vstinner merged commit a7f0727 into python:main Oct 7, 2024
37 checks passed
@vstinner vstinner deleted the unicode_equal branch October 7, 2024 21:24
@vstinner
Copy link
Member Author

vstinner commented Oct 7, 2024

Merged. Thanks for reviews @picnixz and @ZeroIntensity.

efimov-mikhail pushed a commit to efimov-mikhail/cpython that referenced this pull request Oct 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants