GH-47123: [Python] Add Enums to PyArrow Types #47139
lidavidm
left a comment
Thanks!
|
Actually, run-end-encoded is also missing, as are all the binary types. |
AlenkaF
left a comment
All the comments have been addressed. One more suggestion I have is that this enum could be added to the docs. For example to the API doc:
https://github.com/apache/arrow/blob/main/docs/source/python/api/datatypes.rst
next to Type Classes or Utility Functions. Maybe even in the User Guide =) https://github.com/apache/arrow/blob/main/docs/source/python/data.rst
|
ps: the linter error is not connected to the changes in this PR. Not sure if there is an issue open already though. |
The linter issue is solved on main, rebasing main should fix it. |
Thanks for the suggestion! I've put the docs below the Type Checking section, partly because the functions above come from the same module, and also because I think this is the intended usage that @lidavidm imagined initially. |
AlenkaF
left a comment
Thank you for all the updates, this looks good to me!
Will merge once David has one final check.
|
@rmnskb one small issue to fix and then we are good to go! =)
______________________ [doctest] pyarrow.types.TypesEnum _______________________
057
058 Examples
059 --------
060 >>> import pyarrow as pa
061 >>> from pyarrow.types import TypesEnum
062 >>> int8_field = pa.field('int8_field', pa.int8())
063 >>> int8_field.type.id == TypesEnum.INT8
064 True
065
066 >>> fixed_size_list = pa.list(pa.uint16(), 3)
UNEXPECTED EXCEPTION: AttributeError("module 'pyarrow' has no attribute 'list'")
Traceback (most recent call last):
File "/opt/conda/envs/arrow/lib/python3.10/doctest.py", line 1350, in __run
exec(compile(example.source, filename, "single",
File "<doctest pyarrow.types.TypesEnum[4]>", line 1, in <module>
AttributeError: module 'pyarrow' has no attribute 'list'. Did you mean: 'list_'? |
|
@AlenkaF fixed the typo :) |
|
Thank you @rmnskb ! |
|
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 2949fe8. There weren't enough matching historic benchmark results to make a call on whether there were regressions. The full Conbench report has more details. |

Rationale for this change
Please see GitHub issue #47123.
What changes are included in this PR?
Added public Type Enums that mimic the original private variable groups used for internal type checking.
Are these changes tested?
Yes, partly for now.
Are there any user-facing changes?
No, only additional features were added: users will now be able to access the underlying types directly via the Type Enums.