Fix typing for extension arrays and extension dtypes without isin and astype #40421

Closed
wants to merge 43 commits into from
43 commits
f2c52a4
small typing fixes
Dr-Irv Jan 23, 2021
d7ff8d3
fix ExtensionArray and ExtensionDtype
Dr-Irv Jan 23, 2021
49fa06e
merge with master
Dr-Irv Jan 31, 2021
03b2c4a
fixes for delete, isin, unique
Dr-Irv Jan 31, 2021
3e19958
fix import of Literal
Dr-Irv Jan 31, 2021
6861901
remove quotes on ExtensionDType.construct_from_string
Dr-Irv Jan 31, 2021
9be6486
move numpy workaround to _typing.py
Dr-Irv Feb 1, 2021
260b367
remove numpy dummy
Dr-Irv Feb 2, 2021
6276725
remove extra line in _typing
Dr-Irv Feb 2, 2021
4dafaca
Merge remote-tracking branch 'upstream/master' into extensiontyping
Dr-Irv Feb 3, 2021
8b2cee2
import Literal
Dr-Irv Feb 3, 2021
3a7d839
Merge remote-tracking branch 'upstream/master' into extensiontyping
Dr-Irv Feb 14, 2021
a21bb60
merge with master
Dr-Irv Mar 8, 2021
8cd6b76
isort precommit fix
Dr-Irv Mar 8, 2021
e0e0131
fix interval.repeat() typing
Dr-Irv Mar 8, 2021
6a6a21f
overload for __getitem__ and use pattern with ExtensionArrayT as self…
Dr-Irv Mar 9, 2021
bf753e6
lose less ExtensionArrayT. Make registry private. consolidate overload
Dr-Irv Mar 10, 2021
c9795a5
remove ExtensionArray typing of self
Dr-Irv Mar 10, 2021
d452842
Merge remote-tracking branch 'upstream/master' into extensiontyping
Dr-Irv Mar 10, 2021
3c2c78b
merge with upstream/master
Dr-Irv Mar 12, 2021
548c198
make extension arrays work with new typing, fixing astype and to_numpy
Dr-Irv Mar 12, 2021
db8ed9b
fix Literal import
Dr-Irv Mar 12, 2021
f8191f8
fix logic in ensure_int_or_float
Dr-Irv Mar 12, 2021
575645f
fix conflict with master
Dr-Irv Mar 12, 2021
6f8fcb5
fix typing in groupby to_numpy call
Dr-Irv Mar 12, 2021
3ea2420
fix groupby again. Allow kwargs for extension to_numpy
Dr-Irv Mar 13, 2021
c83a628
Merge remote-tracking branch 'upstream/master' into extensiontyping
simonjayhawkins Mar 13, 2021
5bb24d4
fixes for merge with master
Dr-Irv Mar 13, 2021
ad1ab3b
remove astype and isin changes
Dr-Irv Mar 13, 2021
63c3d6d
add comment to cast in managers. change return type of astype
Dr-Irv Mar 13, 2021
1882074
add 0 as argument for repeat
Dr-Irv Mar 13, 2021
1274a76
remove kwargs from to_numpy
Dr-Irv Mar 13, 2021
363e203
remove more kwargs from to_numpy calls
Dr-Irv Mar 13, 2021
01c942c
Merge remote-tracking branch 'upstream/master' into limitextensiontyping
Dr-Irv Mar 13, 2021
1196132
don't cast in astype. TODO for overload of astype
Dr-Irv Mar 13, 2021
66d5da4
remove private registry, getitem overloads, typevar on DateTimeScalar
Dr-Irv Mar 13, 2021
5411998
Remove List[Any] from getitem
Dr-Irv Mar 13, 2021
9b7481d
remove spacing change in _mixins.py and __getitem__
Dr-Irv Mar 13, 2021
4bd3422
remove cast in io/formats. Change isinstance check in pandas.core.ba…
Dr-Irv Mar 14, 2021
3b1ff79
merge with master to resolve conflicts
Dr-Irv Mar 14, 2021
5719daa
merge with master, remove more ignores
Dr-Irv Apr 3, 2021
dbfb3a2
remove mypy comments from format
Dr-Irv Apr 3, 2021
d41bf91
resolve conflicts with master
Dr-Irv Apr 27, 2021
resolve conflicts with master
Dr-Irv committed Apr 27, 2021
commit d41bf916995cdb9e3018728e5e04ce995331629d
25 changes: 11 additions & 14 deletions pandas/core/arrays/base.py
@@ -13,9 +13,7 @@
TYPE_CHECKING,
Any,
Callable,
Dict,
Iterator,
Optional,
Sequence,
TypeVar,
cast,
@@ -28,6 +26,7 @@
ArrayLike,
Dtype,
NpDtype,
PositionalIndexer,
Shape,
)
from pandas.compat import set_function_name
@@ -389,7 +388,7 @@ def __iter__(self) -> Iterator[Any]:
for i in range(len(self)):
yield self[i]

def __contains__(self, item: Any) -> Union[bool, np.bool_]:
def __contains__(self, item) -> bool | np.bool_:
"""
Return for `item in self`.
"""
@@ -428,9 +427,9 @@ def __ne__(self, other: Any) -> ArrayLike: # type: ignore[override]

def to_numpy(
self,
dtype: Optional[NpDtype] = None,
dtype: NpDtype | None = None,
copy: bool = False,
na_value: Optional[Any] = lib.no_default,
na_value: Any | None = lib.no_default,
Member

None seems redundant here.
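
To make the point concrete, here is a minimal sketch (not the pandas implementation; `_NoDefault`, `NO_DEFAULT` and `describe_na_value` are hypothetical names standing in for the `lib.no_default` pattern). `Any` already accepts `None`, so annotating the parameter with `Any` alone admits the same arguments; it is the sentinel default, not the annotation, that separates "not passed" from an explicit `None`.

```python
from __future__ import annotations

from typing import Any


class _NoDefault:
    """Hypothetical sentinel type, standing in for lib.no_default."""

    def __repr__(self) -> str:
        return "<no_default>"


NO_DEFAULT = _NoDefault()


def describe_na_value(na_value: Any = NO_DEFAULT) -> str:
    # `Any` already admits None, so `Any | None` would accept nothing extra.
    if na_value is NO_DEFAULT:
        return "not passed"
    if na_value is None:
        return "explicit None"
    return f"value {na_value!r}"
```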

) -> np.ndarray:
"""
Convert to a NumPy ndarray.
@@ -683,9 +682,9 @@ def argmax(self, skipna: bool = True) -> int:

def fillna(
self,
value: Optional[Union[Any, ArrayLike]] = None,
method: Optional[Literal["backfill", "bfill", "ffill", "pad"]] = None,
limit: Optional[int] = None,
value: Any | ArrayLike | None = None,
Member

Isn't the ArrayLike (and None) redundant?

Contributor Author

Yes and no. We want to indicate that you can provide a single value (but for an ExtensionArray that could be of the type of values stored in the array), or an array that indicates specific values to put where there are NA values.
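
The two cases described in this reply can be sketched as follows. This is an illustration only, assuming a float NumPy array with NaN as the missing marker; `fill_missing` is a hypothetical helper, not the `ExtensionArray.fillna` implementation.

```python
import numpy as np


def fill_missing(values: np.ndarray, value) -> np.ndarray:
    """Hypothetical helper: fill NaN slots from a scalar or a same-length array."""
    mask = np.isnan(values)
    result = values.copy()
    if np.ndim(value) == 0:
        # Scalar: the same value goes into every missing position.
        result[mask] = value
    else:
        # Array-like: use the entry aligned with each missing position.
        value = np.asarray(value)
        if len(value) != len(values):
            raise ValueError("Length of 'value' does not match the array")
        result[mask] = value[mask]
    return result
```

With `values = np.array([1.0, np.nan, 3.0, np.nan])`, a scalar `0.0` fills both gaps with the same value, while `[10, 20, 30, 40]` fills them with 20 and 40 respectively.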

method: Literal["backfill", "bfill", "ffill", "pad"] | None = None,
limit: int | None = None,
) -> ExtensionArray:
"""
Fill NA/NaN values using the specified method.
@@ -818,7 +817,7 @@ def searchsorted(
self,
value: Sequence[Any],
Member

Sequence is not yet compatible with ExtensionArray (and others). See #28770.

side: Literal["left", "right"] = "left",
sorter: Optional[Sequence[Any]] = None,
sorter: Sequence[Any] | None = None,
Member

From the docstring, sorter is an array-like of ints.
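
For context, a short sketch of what such a `sorter` looks like in plain NumPy, whose `searchsorted` takes the same argument (the variable names here are illustrative only). The sorter is the integer index array produced by `argsort`, which is why an array of ints describes it better than `Sequence[Any]`.

```python
import numpy as np

values = np.array([30, 10, 20])

# Integer indices that would sort `values` into ascending order.
sorter = np.argsort(values)

# Look up insertion points against the sorted view of `values`.
positions = np.searchsorted(values, [15, 25], side="left", sorter=sorter)
```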

) -> np.ndarray:
"""
Find indices where elements should be inserted to maintain order.
@@ -1042,7 +1041,7 @@ def factorize(self, na_sentinel: int = -1) -> tuple[np.ndarray, ExtensionArray]:
@Appender(_extension_array_shared_docs["repeat"])
def repeat(
self,
repeats: Union[int, Sequence[int]],
repeats: int | Sequence[int],
axis: Literal[None, 0] = None,
Member

I haven't seen Literal[None] used before.
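
For reference, PEP 586 allows `None` inside `Literal` and treats `Literal[None]` as equivalent to plain `None`. The sketch below shows how the annotation behaves at runtime; `validate_axis` is a hypothetical stand-in, not the `nv.validate_repeat` helper used in the diff.

```python
from typing import Literal, get_args

# The annotation used for `axis` in the diff.
AxisArg = Literal[None, 0]


def validate_axis(axis: AxisArg = None) -> None:
    """Hypothetical runtime check mirroring the Literal annotation."""
    if axis not in get_args(AxisArg):
        raise ValueError(f"axis must be None or 0, got {axis!r}")
```

`get_args(AxisArg)` returns the tuple `(None, 0)`, so the runtime check and the static annotation stay in sync.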

) -> ExtensionArray:
nv.validate_repeat((), {"axis": axis})
@@ -1241,9 +1240,7 @@ def transpose(self, *axes: int) -> ExtensionArray:
def T(self) -> ExtensionArray:
return self.transpose()

def ravel(
self, order: Optional[Literal["C", "F", "A", "K"]] = "C"
) -> ExtensionArray:
def ravel(self, order: Literal["C", "F", "A", "K"] | None = "C") -> ExtensionArray:
"""
Return a flattened view on this array.

@@ -1323,7 +1320,7 @@ def __hash__(self) -> int:
# ------------------------------------------------------------------------
# Non-Optimized Default Methods

def delete(self, loc: Union[int, Sequence[int]]) -> ExtensionArray:
def delete(self, loc: int | Sequence[int]) -> ExtensionArray:
Member

Please revert the TypeVar change.

indexer = np.delete(np.arange(len(self)), loc)
return self.take(indexer)

4 changes: 2 additions & 2 deletions pandas/core/arrays/boolean.py
@@ -13,7 +13,7 @@
from pandas._typing import (
ArrayLike,
Dtype,
DtypeArg,
type_t,
)
from pandas.compat.numpy import function as nv

@@ -309,7 +309,7 @@ def dtype(self) -> BooleanDtype:

@classmethod
def _from_sequence(
cls, scalars, *, dtype: Optional[DtypeArg] = None, copy: bool = False
cls, scalars, *, dtype: Dtype | None = None, copy: bool = False
) -> BooleanArray:
if dtype:
assert dtype == "boolean"
3 changes: 1 addition & 2 deletions pandas/core/arrays/interval.py
@@ -9,7 +9,6 @@
from typing import (
Sequence,
TypeVar,
Union,
cast,
)

@@ -1519,7 +1518,7 @@ def delete(self: IntervalArrayT, loc) -> IntervalArrayT:

@Appender(_extension_array_shared_docs["repeat"] % _shared_docs_kwargs)
def repeat(
self: IntervalArrayT, repeats: Union[int, Sequence[int]], axis=None
self: IntervalArrayT, repeats: int | Sequence[int], axis=None
Member

Is this compatible with NumPy, where the type is an int or an array of ints?

Contributor Author

Compatible in what sense? NumPy accepts a superset of Sequence[int] for the repeats argument. Ideally, we'd use the NumPy argument type for repeats throughout, but we can't depend on NumPy 1.20 being available.
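
A quick illustration of the forms `np.repeat` accepts for `repeats` (variable names are illustrative only): a single int, a sequence of ints, and an integer array. The `int | Sequence[int]` annotation covers the first two.

```python
import numpy as np

arr = np.array([1, 2, 3])

# A single int repeats every element the same number of times.
uniform = np.repeat(arr, 2)

# A sequence of ints gives a per-element repeat count.
per_element = np.repeat(arr, [1, 0, 2])

# An integer ndarray works too; it falls outside Sequence[int].
from_array = np.repeat(arr, np.array([2, 1, 1]))
```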

) -> IntervalArrayT:
nv.validate_repeat((), {"axis": axis})
left_repeat = self.left.repeat(repeats)
6 changes: 3 additions & 3 deletions pandas/core/arrays/string_arrow.py
@@ -227,15 +227,15 @@ def _chk_pyarrow_available(cls) -> None:
raise ImportError(msg)

@classmethod
def _from_sequence(cls, scalars, dtype: Optional[DtypeArg] = None, copy=False):
def _from_sequence(cls, scalars, dtype: DtypeArg | None = None, copy: bool = False):
Member

DtypeArg?

cls._chk_pyarrow_available()
# convert non-na-likes to str, and nan-likes to ArrowStringDtype.na_value
scalars = lib.ensure_string_array(scalars, copy=False)
return cls(pa.array(scalars, type=pa.string(), from_pandas=True))

@classmethod
def _from_sequence_of_strings(
cls, strings, dtype: Optional[DtypeArg] = None, copy=False
cls, strings, dtype: DtypeArg | None = None, copy: bool = False
Member

Same here: DtypeArg?

):
return cls._from_sequence(strings, dtype=dtype, copy=copy)

@@ -693,7 +693,7 @@ def value_counts(self, dropna: bool = True) -> Series:

_str_na_value = ArrowStringDtype.na_value

def _str_map(self, f, na_value=None, dtype: Dtype | None = None):
def _str_map(self, f, na_value=None, dtype: DtypeArg | None = None):
Member

Same here: DtypeArg?

# TODO: de-duplicate with StringArray method. This method is moreless copy and
# paste.

2 changes: 1 addition & 1 deletion pandas/core/base.py
@@ -436,7 +436,7 @@ def array(self) -> ExtensionArray:

def to_numpy(
self,
dtype: Optional[NpDtype] = None,
dtype: NpDtype | None = None,
copy: bool = False,
na_value=lib.no_default,
**kwargs,
14 changes: 5 additions & 9 deletions pandas/core/dtypes/base.py
@@ -7,11 +7,7 @@
from typing import (
TYPE_CHECKING,
Any,
List,
Optional,
Tuple,
Type,
Union,
TypeVar,
cast,
)

@@ -33,7 +29,7 @@
from pandas.core.arrays import ExtensionArray

# To parameterize on same ExtensionDtype
E = TypeVar("E", bound="ExtensionDtype")
ExtensionDtypeT = TypeVar("ExtensionDtypeT", bound="ExtensionDtype")


class ExtensionDtype:
@@ -160,7 +156,7 @@ def na_value(self) -> object:
return np.nan

@property
def type(self) -> type_t[Any]:
def type(self) -> type[Any]:
"""
The scalar type for the array, e.g. ``int``

@@ -373,7 +369,7 @@ def _get_common_dtype(self, dtypes: list[DtypeObj]) -> DtypeObj | None:
return None


def register_extension_dtype(cls: type[E]) -> type[E]:
def register_extension_dtype(cls: type[ExtensionDtypeT]) -> type[ExtensionDtypeT]:
"""
Register an ExtensionType with pandas as class decorator.

@@ -429,7 +425,7 @@ def register(self, dtype: type[ExtensionDtype]) -> None:

self.dtypes.append(dtype)

def find(self, dtype: Union[Type[ExtensionDtype], str]) -> Optional[ExtensionDtype]:
def find(self, dtype: type[ExtensionDtype] | str) -> ExtensionDtype | None:
"""
Parameters
----------
5 changes: 4 additions & 1 deletion pandas/core/indexes/extension.py
@@ -337,7 +337,10 @@ def _get_unique_index(self):
return self

result = self._data.unique()
return type(self)._simple_new(result, name=self.name)
# error: Argument 1 to "_simple_new" of "ExtensionIndex" has incompatible
# type "ExtensionArray"; expected "Union[IntervalArray,
# NDArrayBackedExtensionArray]"
return type(self)._simple_new(result, name=self.name) # type: ignore[arg-type]
Member

The revealed type of result will be Union[pandas.core.arrays.interval.IntervalArray*, pandas.core.arrays._mixins.NDArrayBackedExtensionArray*], i.e. the same type as self._data, if unique is typed using a TypeVar; this ignore can then be removed.
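
The pattern being suggested can be sketched as follows, under stated assumptions: `BaseArray`, `IntervalLikeArray` and `ArrayT` are hypothetical stand-ins, not the pandas classes. Annotating `self` with a bound TypeVar lets a type checker infer the subclass as the return type instead of widening it to the base class.

```python
from __future__ import annotations

from typing import TypeVar

# Hypothetical TypeVar bound to the base class.
ArrayT = TypeVar("ArrayT", bound="BaseArray")


class BaseArray:
    def __init__(self, data) -> None:
        self._data = list(data)

    def unique(self: ArrayT) -> ArrayT:
        # dict.fromkeys drops duplicates while keeping first-seen order.
        return type(self)(dict.fromkeys(self._data))


class IntervalLikeArray(BaseArray):
    pass


# A checker infers IntervalLikeArray here, not BaseArray.
result = IntervalLikeArray([1, 1, 2]).unique()
```

With `unique` declared as returning the base class instead, a caller that expects the subclass would need a cast or a `type: ignore`, which is the situation in the diff above.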


@doc(Index.map)
def map(self, mapper, na_action=None):
3 changes: 1 addition & 2 deletions pandas/core/internals/managers.py
@@ -10,7 +10,6 @@
Hashable,
Sequence,
TypeVar,
Union,
cast,
)
import warnings
@@ -629,7 +628,7 @@ def copy_func(ax):
def as_array(
self,
transpose: bool = False,
dtype: Optional[NpDtype] = None,
dtype: NpDtype | None = None,
copy: bool = False,
na_value=lib.no_default,
) -> np.ndarray:
You are viewing a condensed version of this merge commit.