ENH: find categorical code against categorical label/value #48766
Comments
Why? What are you actually trying to do? The codes are an implementation detail.
@jreback As I explained, to select rows above or below a certain code when you have an ordered categorical column.
These labels should already respond to the full suite of comparators, e.g. df[df.ordered_cat > 'value1'] should select values that are greater in code space.
Indeed, you could do a semantic selection with a categorical, but it might still be helpful for, let's say, 3 quarters after ...
Again, these are an implementation detail. You can use them, but -1 on adding API beyond what already exists. The semantic selections are pretty useful here; it's not clear why you cannot simply use these.
It does not exist... The codes have more use cases than just being an implementation detail. For example, if you need to run a regression model, you can simply use
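The semantic selection described in the comment above can be sketched as follows. The dataframe, its column names, and its values are made up for illustration; only the comparison on an ordered categorical column comes from the thread.

```python
import pandas as pd

# Hypothetical data: an ordered categorical "quarter" column.
quarters = ["2020Q1", "2020Q2", "2020Q3", "2020Q4"]
df = pd.DataFrame(
    {
        "quarter": pd.Categorical(
            ["2020Q1", "2020Q3", "2020Q2", "2020Q4", "2020Q3"],
            categories=quarters,
            ordered=True,
        ),
        "sales": [10, 30, 20, 40, 35],
    }
)

# Comparing against a label uses the category order, not the raw codes.
after_q2 = df[df["quarter"] > "2020Q2"]
print(after_q2)
```

The comparison only works when the categorical is ordered and the label is one of its categories; otherwise pandas raises a `TypeError`.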
I think this could also be useful when you want to maintain a CategoricalDtype for roundtripping some IO formats. With SQL as an example, the CategoricalDtype does the "right thing" when you just build a dataframe and write it, but if you want to issue a WHERE clause on return that only filters to a subset of your dtype, it becomes difficult to get access to those codes. I could see it being useful for CategoricalDtype to behave more like an Enum in this instance.
@StevenLi-DS to clarify, this is what I have in mind: enum.Enum("AnEnum", cat) Currently this yields
Thank you @WillAyd. I'm not familiar with enum, but I think returning a dict would give us more usability and flexibility.
Hello, is this issue available to take?
@JgLemos sure, still open
take
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
I wish I could check the underlying code for each value of a categorical column directly, without indexing and using `cat.codes`.
Assume I have the following dataframe
I need to select all the rows after `2020Q2`. I first have to find the underlying code of the value/label `2020Q2`, but I can only do so by indexing the dataframe against it, then using `cat.codes`, and then indexing the returned array to get the first value. This is a little bit tedious.
Feature Description
Right now, if you use `df.quarter.dtype.categories`, it only returns the categories as a list.
It would be great if there were an attribute that returns a map of categories and codes together in a dictionary, so that users could simply find the codes by using categories as dict keys.
For example
returns
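A mapping like the one requested can already be built from the existing API, because a category's code is its position in `categories`. The following is a minimal sketch with made-up data; the `quarter` series and the name `code_by_category` are assumptions for illustration and are not pandas attributes.

```python
import pandas as pd

# Hypothetical ordered categorical column.
quarter = pd.Series(
    pd.Categorical(
        ["2020Q3", "2020Q1", "2020Q2"],
        categories=["2020Q1", "2020Q2", "2020Q3", "2020Q4"],
        ordered=True,
    )
)

# Each category's code is its position in `categories`.
code_by_category = {
    category: code for code, category in enumerate(quarter.cat.categories)
}

# Look up a code by label, then filter in code space.
q2_code = code_by_category["2020Q2"]
rows_after_q2 = quarter[quarter.cat.codes > q2_code]
print(code_by_category)
print(rows_after_q2)
```

Missing values carry the code -1 and have no entry in `categories`, so they never appear in this mapping.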
Alternative Solutions
Maybe it could also be a `get_cat_code()` function in the pandas API, so that users could input a category to get the underlying code, such as `get_cat_code(cat='2020Q2')`.
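As a rough sketch of how such a helper might behave, the function below wraps `Index.get_loc` on the dtype's categories. `get_cat_code` is the hypothetical name proposed above, not an existing pandas function, and the extra `dtype` parameter is an assumption made so the example is self-contained.

```python
import pandas as pd


def get_cat_code(dtype: pd.CategoricalDtype, cat) -> int:
    """Hypothetical helper: return the code of `cat` within `dtype`.

    Raises KeyError if `cat` is not one of the categories.
    """
    # Categories are unique, so get_loc returns a single integer position.
    return int(dtype.categories.get_loc(cat))


quarter_dtype = pd.CategoricalDtype(
    ["2020Q1", "2020Q2", "2020Q3", "2020Q4"], ordered=True
)
code = get_cat_code(quarter_dtype, cat="2020Q2")
print(code)
```

With a dataframe like the one described, the call would presumably look like `get_cat_code(df.quarter.dtype, cat='2020Q2')`.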
Additional Context
No response