-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REF: rename 'labels' in pd.factorize to 'codes' #29509
Conversation
e1e850f
to
d90c8db
Compare
d90c8db
to
2f50cd6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I'd be on board with this
# Implementation notes: This method is responsible for 3 things | ||
# 1.) coercing data to array-like (ndarray, Index, extension array) | ||
# 2.) factorizing labels and uniques | ||
# 3.) Maybe boxing the output in an Index | ||
# 2.) factorizing codes and uniques |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should remain labels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're factorizing the values into codes and uniques, so should be codes
?
Maybe the sentence should actually be worded as "2) factorizing values into codes and uniques" (more explicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
codes
is correct; we are changing on purpose to conform with other usages in the codebase
# Implementation notes: This method is responsible for 3 things | ||
# 1.) coercing data to array-like (ndarray, Index, extension array) | ||
# 2.) factorizing labels and uniques | ||
# 3.) Maybe boxing the output in an Index | ||
# 2.) factorizing codes and uniques |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
codes
is correct; we are changing on purpose to conform with other usages in the codebase
thanks @topper-123 |
…ndexing-1row-df * upstream/master: (109 commits) stronger typing in libreduction (pandas-dev#29502) API: rename labels to codes (pandas-dev#29509) CLN: remove unnecessary type checks (pandas-dev#29517) implement _BaseGrouper (pandas-dev#29520) CLN: F-string formatting in pandas/_libs/*.pyx (pandas-dev#29527) Fixed more SS03 errors (pandas-dev#29540) consolidate dim checks (pandas-dev#29536) REF: separate out _get_cython_func_and_vals (pandas-dev#29537) remove unnecessary exception (pandas-dev#29538) TST:Add test to check single category col returns series with single row slice (pandas-dev#29521) Make color validation more forgiving (pandas-dev#29122) DOC: update bottleneck repo and documentation urls (pandas-dev#29516) TST: add test for df construction from dict with tuples (pandas-dev#29497) add test for pd.melt dtypes preservation (pandas-dev#29510) updated DataFrame.equals docstring (pandas-dev#29496) Resolved merge conflicts (pandas-dev#29506) DOC: Improved pandas/compact/__init__.py (pandas-dev#29507) DOC: Update performance comparison section of io docs (pandas-dev#28890) TST: add test for df.where() with category dtype (pandas-dev#29454) DOC: Fix docs on merging categoricals. (pandas-dev#28185) ...
In
factorize
and related stuff, renamelabels
tocodes
. Also add type hints tofactorize
.This is an internal name change, so is ok from an API stability perspective