Description
I understand that, in general, pyarrow does not support type hints. However, I think it is still sensible to add a py.typed marker file to the library. Let me demonstrate why:

    $ pip install mypy pyarrow

    # test.py
    import pyarrow as pa
    table = pa.Table()
    reveal_type(table)

    $ mypy test.py
    test.py:1: error: Skipping analyzing "pyarrow": module is installed, but missing library stubs or py.typed marker
    test.py:1: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
    test.py:5: note: Revealed type is "Any"
    Found 1 error in 1 file (checked 1 source file)

Note that mypy identifies table as Any, when obviously it is a Table. If we include a py.typed file, mypy will be able to make these trivial inferences. The motivating example is this:

    from typing import overload

    @overload
    def from_arrow(a: pa.Table) -> DataFrame:
        ...

    @overload
    def from_arrow(a: pa.Array | pa.ChunkedArray) -> Series:
        ...

    def from_arrow(a: pa.Table | pa.Array | pa.ChunkedArray) -> DataFrame | Series:
        pass

The problem is that, since pa.Table, pa.Array, and pa.ChunkedArray are all resolved to Any, the overloads effectively become:

    @overload
    def from_arrow(a: Any) -> DataFrame:
        ...

    @overload
    def from_arrow(a: Any) -> Series:
        ...

and mypy complains that overload 2 is covered entirely by overload 1.
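
For contrast, here is a minimal sketch of how the same overloads behave once the argument types are distinct. The classes Table, Array, ChunkedArray, DataFrame, and Series below are empty, hypothetical stand-ins (not the real pyarrow or dataframe classes), so the snippet runs without either library installed.

```python
from typing import Union, overload


# Hypothetical stand-ins so the sketch is self-contained;
# they are NOT the real pyarrow / dataframe classes.
class Table: ...
class Array: ...
class ChunkedArray: ...
class DataFrame: ...
class Series: ...


@overload
def from_arrow(a: Table) -> DataFrame: ...
@overload
def from_arrow(a: Union[Array, ChunkedArray]) -> Series: ...
def from_arrow(a: Union[Table, Array, ChunkedArray]) -> Union[DataFrame, Series]:
    # Runtime dispatch mirrors the two overload signatures.
    if isinstance(a, Table):
        return DataFrame()
    return Series()


df = from_arrow(Table())
s = from_arrow(ChunkedArray())
print(type(df).__name__, type(s).__name__)
```

Because the stand-in argument types are distinct classes rather than Any, a type checker should accept both overloads here and resolve from_arrow(Table()) to DataFrame.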
I tried to test what adding a py.typed file would do, but I ran into compilation issues. I was hoping someone with a little more experience here could quickly test this out for me :)
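
One possible way to try this without rebuilding from source is to drop an empty py.typed file into an already installed copy of the package. The helper below is a sketch under that assumption; add_py_typed_marker is a hypothetical name, not part of pyarrow or mypy. It only tests the marker itself: for modules that are compiled extensions rather than pure Python, mypy generally still needs stub files (.pyi) to see any types.

```python
import importlib.util
from pathlib import Path


def add_py_typed_marker(package_name: str) -> Path:
    """Create an empty py.typed marker inside an installed package (PEP 561).

    Hypothetical helper for local experiments; it modifies the installed
    package directory in place.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"{package_name!r} is not an installed package")
    # For a regular package, the first search location is the package directory.
    package_dir = Path(next(iter(spec.submodule_search_locations)))
    marker = package_dir / "py.typed"
    marker.touch(exist_ok=True)  # the marker only needs to exist; it can be empty
    return marker


# Example (assumes pyarrow is installed in the active environment):
# add_py_typed_marker("pyarrow")
# then re-run: mypy test.py
```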
Reporter: Matteo Santamaria
Note: This issue was originally created as ARROW-17901. Please see the migration documentation for further details.