PERF: Index.join to maintain cached attributes in more cases #57023

Merged: 13 commits, Jan 24, 2024

lukemanley

import pandas as pd

data = [f"i-{i:05}" for i in range(100_000)]
dtype = "string[pyarrow_numpy]"

idx1 = pd.Index(data, dtype=dtype)
idx2 = pd.Index(data[1:], dtype=dtype)

# with this PR, the join result keeps its cached attributes, so the trailing is_unique call is a cache hit
%timeit idx1.join(idx2, how="outer").is_unique

# 59.1 ms ± 1.29 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)  -> main
# 41.9 ms ± 894 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)   -> PR
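
To illustrate why keeping cached attributes matters for the benchmark above, here is a minimal pure-Python sketch. It assumes one plausible reading of the commit messages: when the joined values are identical to one of the inputs, that input object is returned as-is, so anything already cached on it carries over. `ToyIndex`, `join_outer`, and `unique_computations` are hypothetical names invented for this illustration; they are not pandas APIs and do not mirror the actual `Index._wrap_join_result` code.

```python
from functools import cached_property


class ToyIndex:
    """Hypothetical stand-in for an index; NOT pandas' implementation."""

    def __init__(self, values):
        self.values = tuple(values)
        # Counts how many times uniqueness is actually computed.
        self.unique_computations = 0

    @cached_property
    def is_unique(self):
        # Computed at most once per object, then stored on the instance.
        self.unique_computations += 1
        return len(set(self.values)) == len(self.values)

    def join_outer(self, other):
        # Toy outer join: sorted union of both sides.
        joined = tuple(sorted(set(self.values) | set(other.values)))
        # Assumption for this sketch: if the result equals one input,
        # reuse that object so its cached attributes are preserved.
        if joined == self.values:
            return self
        if joined == other.values:
            return other
        # Otherwise build a fresh object with an empty cache.
        return ToyIndex(joined)


idx1 = ToyIndex(f"i-{i:05}" for i in range(1000))
idx2 = ToyIndex(idx1.values[1:])

idx1.is_unique                  # warm the cache on idx1
result = idx1.join_outer(idx2)  # union equals idx1, so idx1 is reused
result.is_unique                # served from the cache, no recomputation
```

In this toy, `result` is the same object as `idx1`, so the second `is_unique` access does no extra work. A join whose result differs from both inputs still produces a new object and pays for the computation on first access.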

@lukemanley lukemanley added Performance Memory or execution speed performance Index Related to the Index class or subclasses labels Jan 23, 2024
@lukemanley lukemanley added this to the 3.0 milestone Jan 23, 2024
@mroeschke mroeschke merged commit 622f31c into pandas-dev:main Jan 24, 2024
49 of 50 checks passed
@mroeschke

Thanks @lukemanley

pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024
…dev#57023)

* Index.join result name

* whatsnew

* update test

* Index._wrap_join_result to maintain cached attributes if possible

* Index._wrap_join_result to maintain cached attributes if possible

* whatsnew

* allow indexers to be None

* gh ref

* rename variables for clarity