faster counts_by_id for pandas #83

jmoralez · 2024-05-08T18:12:32Z

Uses pd.Series.value_counts(dropna=False, sort=False) instead of df.groupby(col, observed=True).size(). The value_counts call is over 2x faster.
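As a rough illustration of the swap (a minimal sketch, not the code from this PR), the snippet below counts rows per id both ways on a toy frame and checks that the counts agree. The column name unique_id and the sample data are assumptions made up for the example. The comparison goes through plain dicts because the two results can differ in row order and in the name of the returned Series. It uses plain string ids; categorical columns with unobserved categories may behave differently between the two calls.

```python
import timeit

import pandas as pd

# Hypothetical toy data; "unique_id" is an assumed column name.
df = pd.DataFrame({"unique_id": ["a", "a", "b", "c", "c", "c"]})

# Previous approach: group by the id column and take the group sizes.
old = df.groupby("unique_id", observed=True).size()

# Approach described in this PR: count values directly on the column.
new = df["unique_id"].value_counts(dropna=False, sort=False)

# Compare as dicts so that ordering and Series names do not matter.
old_counts = old.to_dict()
new_counts = new.to_dict()
print(old_counts == new_counts)

# Optional timing on a larger hypothetical frame; results vary by machine
# and pandas version, so no numbers are claimed here.
big = pd.DataFrame({"unique_id": [i % 1_000 for i in range(200_000)]})
t_old = timeit.timeit(
    lambda: big.groupby("unique_id", observed=True).size(), number=3
)
t_new = timeit.timeit(
    lambda: big["unique_id"].value_counts(dropna=False, sort=False), number=3
)
print(f"groupby.size: {t_old:.4f}s, value_counts: {t_new:.4f}s")
```
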

faster counts_by_id for pandas

f411ca7

jmoralez marked this pull request as ready for review May 8, 2024 18:14

jmoralez added the enhancement New feature or request label May 8, 2024

bump version

37638cc

jmoralez merged commit 9eb7024 into main May 8, 2024
18 checks passed

jmoralez deleted the faster-idcounts branch May 8, 2024 18:25
