mypy and static type checking
As of Python 3.6, it has been possible to add type information to code. This became immensely popular in a short period of time. Shortly after, the tool mypy arrived, and it has become the industry standard for static type checking in Python. It checks for invalid types quickly enough to run as a pre-commit hook. It has caught many bugs that I did not see myself and has been a very valuable tool.
Now what does this mean for PyArrow?
When you run mypy on code that uses PyArrow, you get error messages like the following:
some_util_using_pyarrow/hdfs_utils.py:5: error: Skipping analyzing "pyarrow": module is installed, but missing library stubs or py.typed marker
some_util_using_pyarrow/hdfs_utils.py:9: error: Skipping analyzing "pyarrow": module is installed, but missing library stubs or py.typed marker
some_util_using_pyarrow/hdfs_utils.py:11: error: Skipping analyzing "pyarrow.fs": module is installed, but missing library stubs or py.typed marker
More information is available here: https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-library-stubs-or-py-typed-marker
You can solve this in three ways:
1. Ignore the message. This, however, types everything from PyArrow as Any, so mypy cannot find user errors involving the PyArrow library.
2. Create Python stub files. This used to be the standard, but it is no longer a popular option, because stubs are extra files maintained next to the source code, whereas type hints can also be written inline in the code, which brings me to the third option.
3. Create a py.typed file and use inline type hints. This is the most popular option today because it requires no extra files (except for the py.typed file), keeps all the type hints with the code (where they currently live in the documentation), and gives type hints (and in-IDE hinting of issues) not only to your users but also to the developers of the library themselves.

My personal preference already shines through in the options: it is option 3, which quickly became the industry standard after its introduction.
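As a rough illustration of the inline approach, here is a minimal sketch. The function `read_columns` and the helper `mark_package_typed` are hypothetical names made up for this example and are not part of PyArrow. The only assumption about tooling is the PEP 561 convention that an empty `py.typed` file inside a package tells type checkers to use the package's inline hints (the file also has to be shipped as package data).

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Optional, get_type_hints


def read_columns(path: str, columns: Optional[List[str]] = None) -> Dict[str, List[str]]:
    """Hypothetical library function carrying inline type hints."""
    return {path: list(columns or [])}


def mark_package_typed(package_dir: Path) -> Path:
    """Create the empty PEP 561 marker file inside a package directory."""
    marker = package_dir / "py.typed"
    marker.touch()
    return marker


# The hints live on the function itself, so checkers and IDEs can read them.
hints = get_type_hints(read_columns)

# A throwaway directory stands in for a real package directory here.
with tempfile.TemporaryDirectory() as tmp:
    marker = mark_package_typed(Path(tmp))
    marker_exists = marker.exists()
    marker_is_empty = marker.stat().st_size == 0
```

With hints like these in place, a call such as `read_columns(42)` is something a type checker can flag, whereas under the first option the same call would pass silently because everything is treated as Any.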
What should we do?
I'd very much like to work on this, but I don't want to waste time on something that won't be accepted. Therefore, I am raising this ticket to see whether this has been considered before or whether we just haven't gotten to it yet.
I'd like to open the discussion here:
- Do you agree with option 3 (inline type hints)?
- Should we remove the type annotations from the documentation, given that they will be in the function signatures? Or should we keep them there and also specify them in the code, which would duplicate them?
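To make the second question concrete, here is a small sketch of the two variants. Both functions are hypothetical, and the numpydoc-style docstring layout is an assumption made for illustration. The first repeats the types in the docstring and the signature; the second keeps them only in the signature.

```python
import inspect
from typing import List, Optional


def take_duplicated(values: List[int], limit: Optional[int] = None) -> List[int]:
    """Return the first `limit` values.

    Parameters
    ----------
    values : list of int
        Input values.
    limit : int, optional
        Maximum number of values to return.
    """
    return values[:limit]


def take_annotated_only(values: List[int], limit: Optional[int] = None) -> List[int]:
    """Return the first `limit` values.

    Parameters
    ----------
    values
        Input values.
    limit
        Maximum number of values to return; None returns everything.
    """
    return values[:limit]


# The signature alone already carries the type information for tools to read.
signature_text = str(inspect.signature(take_annotated_only))
```

In the first variant the docstring and the signature can drift apart when one is updated without the other. In the second there is a single source of truth, but the rendered documentation then depends on the doc tooling picking up the annotations.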
Reporter: Jorrick Sleijster
Note: This issue was originally created as ARROW-17335. Please see the migration documentation for further details.