PERF: avoid unnecessary method call in get_indexer_non_unique() on MultiIndex #55811
Thank you. I've done some further analysis comparing this to the previous PR that fixed the first code path. I think the PR is good to merge to avoid the extra function call, though it hardly makes any difference to performance. On a side note, there appears to be a performance regression in reindex between 1.5.3 and 2.0.3, by as much as 10%
(ignore the 1 ms difference, which is noise between runs).
Thanks. Can you remove the whatsnew note in that case and maybe retitle this PR to something like "CLN: avoid unnecessary method call in ..."?
Adjusted the commits and the PR title.
Thanks @morotti
Hello,
There was a regression, present from pandas v0.23 through v1.4, that made reindex 60x slower. It was finally fixed last year by this PR:
Discussion: #23735
Commit: https://github.com/pandas-dev/pandas/pull/46235/files
A benchmark was added in a later commit: https://github.com/pandas-dev/pandas/pull/47221/files
That resolved the issue in get_indexer().
I was going through the pandas source code and I noticed there is a second code path in get_indexer_non_unique() with the same problem. This PR fixes the remaining code path.
I'm not very familiar with pandas, so I'm not sure what would hit the default indexer versus the non-unique indexer.
It would be great if you could think of something and add a benchmark.
It looks like the exact same bug and nearly the same code to me, so it might be a 60x performance improvement too 😄
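As a starting point for such a benchmark, here is a rough sketch of one way to reach each code path on a MultiIndex. The index sizes and variable names are made up for illustration, and this is not the benchmark from the pandas asv suite. As far as I understand the pandas documentation, get_indexer_for() dispatches to get_indexer() on a unique index and to get_indexer_non_unique() otherwise, so an index with duplicated entries should exercise the second path.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical sizes, chosen only to keep the example quick to run.
levels = [np.arange(200), list("abcde")]
unique_mi = pd.MultiIndex.from_product(levels, names=["num", "letter"])

# Appending the index to itself duplicates every entry, making it non-unique.
dup_mi = unique_mi.append(unique_mi)

# Look up every second entry of the original index.
target = unique_mi[::2]

# Unique index: get_indexer() returns one position per target entry.
unique_pos = unique_mi.get_indexer(target)

# Non-unique index: get_indexer_non_unique() returns (indexer, missing).
indexer, missing = dup_mi.get_indexer_non_unique(target)

# get_indexer_for() picks whichever of the two methods applies.
dispatched = dup_mi.get_indexer_for(target)

# Crude timing of the non-unique path; a real benchmark would use asv.
elapsed = timeit.timeit(lambda: dup_mi.get_indexer_non_unique(target), number=5)
print(f"get_indexer_non_unique: {elapsed:.4f}s for 5 calls")
```

Running the timing line on builds with and without this change would show whether the non-unique path gains anything measurable.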
@jreback
@lukemanley
@jbrockmendel
Cheers.