-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
REF: implement Categorical._validate_listlike #36274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -1716,6 +1716,35 @@ def _box_func(self, i: int): | |||
return np.NaN | |||
return self.categories[i] | |||
|
|||
def _validate_listlike(self, target: ArrayLike) -> np.ndarray: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great, yeah these have probably built up over time. only nit is that this should not be private as we are cross importing right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only nit is that this should not be private as we are cross importing right?
for now im following the patterns that we use in the datetimelike arrays so we can share more code. eventually we can revisit whether to deprivatize (easier to deprivatize than the other way around)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kk
i doubt this makes any perf difference, but maybe worth checking. |
if you can check perf, ping when ready |
|
thanks @jbrockmendel |
I found it really confusing why we used slightly different dtype-comparisons in a bunch of different places (e.g CategoricalIndex.get_indexer vs CategoricalIndex.get_indexer_non_unique). AFAICT It's just a matter of optimizations accruing over time without a helper method to keep them in sync.