-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add some type annotations io/formats/format.py #27418
add some type annotations io/formats/format.py #27418
Conversation
pandas/io/formats/format.py
Outdated
@@ -172,7 +182,7 @@ def to_string(self): | |||
|
|||
fmt_values = self._get_formatted_values() | |||
|
|||
result = ["{i}".format(i=i) for i in fmt_values] | |||
result = ["{i}".format(i=i) for i in fmt_values] # type: Union[str, List[str]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be cleaner if this were just List[str]
and then on 187 use a different variable name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored.
pandas/io/formats/format.py
Outdated
def __init__( | ||
self, | ||
categorical: ABCCategorical, | ||
buf: Optional[StringIO] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor but I think should use typing.TextIO here instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably, but it's not yet failing mypy, so doing it 'gradually'. will change on next commit.
pandas/io/formats/format.py
Outdated
dtype: bool = True, | ||
max_rows: Optional[int] = None, | ||
min_rows: Optional[int] = None, | ||
) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can remove this for init - pseudo-standard internally not to add (though was required by earlier mypy versions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pandas/io/formats/format.py
Outdated
@@ -237,13 +243,11 @@ def _chk_truncate(self): | |||
else: | |||
row_num = max_rows // 2 | |||
series = concat((series.iloc[:row_num], series.iloc[-row_num:])) | |||
self.tr_row_num = row_num | |||
else: | |||
self.tr_row_num = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just not required anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no. see OP.
pandas/io/formats/format.py
Outdated
@@ -498,9 +502,15 @@ def __init__( | |||
else: | |||
self.columns = frame.columns | |||
|
|||
self.truncate_h = None # type: Optional[int] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this gets assigned to an int subsequently and None
isn't valid I think should just type as int
.
There's some syntax trickery at play here due to py 3.5 support as in py36 forward you could declare this without initializing I think like:
self.truncate_h: int
self.truncate_v: int
But can't do that in 3.5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None
is valid, so that we get the same behaviour as False
when it was bool
None
is when the DataFrame is not truncated. the assigned int
is the truncated row/column number.
pandas/io/formats/format.py
Outdated
frame = self.frame | ||
if truncate_h: | ||
if max_cols_adj == 0: | ||
col_num = len(frame.columns) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not a valid path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
truncate_h
is defined on L548. if max_cols_adj==0
, truncate_h
is False
?
@WillAyd to reduce the diff and confusion, i've added |
pandas/io/formats/format.py
Outdated
@@ -217,16 +223,19 @@ def __init__( | |||
|
|||
self._chk_truncate() | |||
|
|||
def _chk_truncate(self): | |||
@property | |||
def truncate_v(self) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have to excuse my lack of knowledge with this module but this isn't entirely consistent with the previous logic right? Any risk in implementing this differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed properties
pandas/io/formats/format.py
Outdated
@@ -32,6 +33,8 @@ | |||
is_timedelta64_dtype, | |||
) | |||
from pandas.core.dtypes.generic import ( | |||
ABCCategorical, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker but generally have been moving away from using the ABC classes in annotations as they don't actually represent anything statically and just get replaced with Any
.
Does importing the actual object impact things negatively here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed ABCs
lgtm. @WillAyd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very good. Minor stylistic nits and a question but otherwise lgtm
@@ -227,6 +238,7 @@ def _chk_truncate(self): | |||
truncate_v = max_rows and (len(self.series) > max_rows) | |||
series = self.series | |||
if truncate_v: | |||
max_rows = cast(int, max_rows) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this required? Generally we try to avoid cast and I'm surprised MyPy can't narrow the type down here from None.
If it's fully required does making line 238 say max_rows is not None and ...
get around that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's fully required does making line 238 say
max_rows is not None and ...
get around that?
no. gives pandas\io\formats\format.py:249: error: Unsupported operand types for // ("None" and "int")
and there might be risk in implementing this differently anyway?
Generally we try to avoid cast
well I would have agreed with you when I started this, but after this PR, I think that adding type annotations should be just that and any refactoring or code cleanup should be deferred to a follow on PR.
I'm surprised MyPy can't narrow the type down here from None.
Agreed. hence the numerous iterations to try and avoid cast originally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm OK yea seems like a bug / gap with mypy inference then. Can always come back to clean up
pandas/io/formats/html.py
Outdated
@@ -79,7 +79,7 @@ def row_levels(self) -> int: | |||
# not showing (row) index | |||
return 0 | |||
|
|||
def _get_columns_formatted_values(self) -> ABCIndex: | |||
def _get_columns_formatted_values(self) -> Iterable[Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can just make this Iterable
- if you don't subscript the generics they default to inclusive of Any
anyway
pandas/io/formats/html.py
Outdated
@@ -162,7 +162,7 @@ def _write_cell( | |||
|
|||
def write_tr( | |||
self, | |||
line: List[str], | |||
line: Iterable[Any], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing here for Iterable
Thanks @simonjayhawkins ! |
thanks @WillAyd for the review. |
typing not added to all function signatures in io/formats/format.py in this pass to reduce the diff
this PR also contains a refactor to simplify the addition of type annotations;truncate_h
andtr_col_num
are dependant variables.truncate_h
can beTrue
orFalse
. ifTrue
,tr_col_num
is anint
else iftruncate_h
isFalse
,tr_col_num
isNone
same applies fortruncate_v
andtr_row_num