Change IPv4 convert APIs to support UINT32 instead of INT64 #16489
Python changes look good
- if self.dtype != cudf.dtype("int64"):
-     raise TypeError("Only int64 type can be converted to ip")
+ if self.dtype != cudf.dtype("uint32"):
+     raise TypeError("Only uint32 type can be converted to ip")
Should this method be public or private? It seems to be a superset of the pandas API, which makes me wonder whether it should be exposed here. It's already accessible only via the column class, which is not really user-facing. We could instead expose the functionality through pylibcudf, which would let anyone who needs the feature use it without depending on cudf Python.
If we do want it to be usable through cudf Python, I think it should be promoted to a Series method and given docs in the Python layer, etc. Otherwise I think we should consider deprecating the Python API and moving towards an approach more like the above.
This is exposed through pylibcudf as far as I know
cudf/python/cudf/cudf/_lib/string_casting.pyx
Line 661 in 8068a2d
def int2ip(Column input_col):
I'm not able to assess promoting it to a Series method, etc. It seems a reasonable suggestion to me, but it is probably beyond my skill. So I think that may need to be done as a separate PR by someone who knows what it entails.
Python approval, one non-blocking question
LGTM 👍
/merge
Description
Changes the integer type for cudf::strings::ipv4_to_integers and cudf::strings::integers_to_ipv4 to use UINT32 types instead of INT64. The INT64 type was originally chosen because libcudf did not support unsigned types at the time.
This is a breaking change since the basic input/output type is changed.
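To illustrate why UINT32 is a natural fit, here is a minimal pure-Python sketch of the dotted-quad to integer mapping. It is not the libcudf implementation; the helper names ipv4_to_uint32 and uint32_to_ipv4 are hypothetical, and libcudf's handling of malformed input may differ from the exceptions raised here. Every IPv4 address maps to a value in [0, 2**32 - 1], which exceeds the signed 32-bit range but fits exactly in an unsigned 32-bit integer.

```python
UINT32_MAX = 2**32 - 1  # largest value an unsigned 32-bit integer can hold


def ipv4_to_uint32(address: str) -> int:
    """Pack a dotted-quad IPv4 string into one integer (hypothetical helper)."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 octets, got {len(parts)}")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError(f"octet out of range: {octet}")
        # Shift the accumulated value left one byte and append this octet.
        value = (value << 8) | octet
    return value


def uint32_to_ipv4(value: int) -> str:
    """Unpack an integer in the uint32 range into a dotted-quad string."""
    if not 0 <= value <= UINT32_MAX:
        raise ValueError(f"value does not fit in uint32: {value}")
    # Extract the four bytes from most significant to least significant.
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


# The highest address needs all 32 bits, so it overflows a signed 32-bit
# integer but fits in uint32 without resorting to a 64-bit type.
assert ipv4_to_uint32("255.255.255.255") == UINT32_MAX
assert uint32_to_ipv4(ipv4_to_uint32("192.168.0.1")) == "192.168.0.1"
```

Callers that previously passed or received INT64 columns would need to switch to UINT32 after this change, which is what makes it breaking.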
Closes #16324