Skip to content
This repository was archived by the owner on Feb 19, 2023. It is now read-only.

disallowed Series | AnyArrayLike as Series is already array-like #28

Merged
merged 1 commit into from
Feb 12, 2022

Conversation

Pydare
Copy link
Contributor

@Pydare Pydare commented Jan 6, 2022

Issue #42359 in the pandas-dev repo

@MarcoGorelli MarcoGorelli self-requested a review January 6, 2022 10:16
@MarcoGorelli
Copy link
Member

Thanks @Pydare for doing this! Looks like CI is red (insufficient coverage), could you fix it up please?

@Pydare
Copy link
Contributor Author

Pydare commented Jan 6, 2022

any idea on what I could do?

@MarcoGorelli
Copy link
Member

check that all the branches in the code you've added are covered by the tests - e.g. put breakpoints in them and check that the tests reach them

@Pydare
Copy link
Contributor Author

Pydare commented Jan 7, 2022

check that all the branches in the code you've added are covered by the tests - e.g. put breakpoints in them and check that the tests reach them

I've been able to see what branches are not well covered, am I to write tests for each of these individual branches? Kind of unsure of what to do from here

Screenshot from 2022-01-07 08-37-13

@MarcoGorelli
Copy link
Member

Hmm looks like for the file you've added, it's already at 100% - the issue's probably somewhere else then, I'll take a closer look tomorrow, it's probably unrelated to this PR

Well done anyway, this generally looks really good!

@MarcoGorelli
Copy link
Member

Looks like coverage for pandas_dev_flaker/_ast_helpers.py is less than 100% - I'd suggest, check that all the code you've added in that file is covered by at least one test

@MarcoGorelli
Copy link
Member

$ coverage report --show-missing
Name                                Stmts   Miss Branch BrPart  Cover   Missing
-------------------------------------------------------------------------------
pandas_dev_flaker/_ast_helpers.py      34      4     22      4    86%   63, 69, 74, 76
-------------------------------------------------------------------------------
TOTAL                                 758      4    220      4    99%

58 files skipped due to complete coverage.
Coverage failure: total of 99 is less than fail-under=100

@Pydare
Copy link
Contributor Author

Pydare commented Jan 10, 2022

Looks like coverage for pandas_dev_flaker/_ast_helpers.py is less than 100% - I'd suggest, check that all the code you've added in that file is covered by at least one test

I suspected this was the problem. I tried looking for the file that contained tests for _ast_helpers.py but couldn't find any.

@MarcoGorelli
Copy link
Member

There's no file that directly tests _ast_helpers.py, but the functions defined in _ast_helpers.py are used in other files which are tested.

So, you need to make sure that the test cases in pandas_dev_flaker/_plugins_tree/disallow_argument_types.py are comprehensive enough that they cover all the code you've added to _ast_helpers.py

@Pydare
Copy link
Contributor Author

Pydare commented Jan 11, 2022

I inspected which lines weren't properly tested in _ast_helpers.py and my new functions do not affect it.

Screenshot from 2022-01-11 09-57-10

@MarcoGorelli
Copy link
Member

MarcoGorelli commented Jan 11, 2022

Looks like there's still a few missing branches:

(venv) marcogorelli@OVMG025 pandas-dev-flaker % coverage report
Name                                Stmts   Miss Branch BrPart  Cover   Missing
-------------------------------------------------------------------------------
pandas_dev_flaker/_ast_helpers.py      30      0     20      4    92%   53->70, 54->60, 63->65, 67->53
-------------------------------------------------------------------------------
TOTAL                                 754      0    212      4    99%

58 files skipped due to complete coverage.
Coverage failure: total of 100 is less than fail-under=100

To see this locally, you can run

coverage erase 
coverage run -m pytest
coverage report

@MarcoGorelli
Copy link
Member

Looks like there's still a few missing branches:

(venv) marcogorelli@OVMG025 pandas-dev-flaker % coverage report
Name                                Stmts   Miss Branch BrPart  Cover   Missing
-------------------------------------------------------------------------------
pandas_dev_flaker/_ast_helpers.py      30      0     20      4    92%   53->70, 54->60, 63->65, 67->53
-------------------------------------------------------------------------------
TOTAL                                 754      0    212      4    99%

58 files skipped due to complete coverage.
Coverage failure: total of 100 is less than fail-under=100

To see this locally, you can run

coverage erase 
coverage run -m pytest
coverage report

If there's a branch which you think can't be covered, then you can raise an AssertionError and it'll be ignored in the coverage report

@codecov
Copy link

codecov bot commented Jan 12, 2022

Codecov Report

Merging #28 (18d81ef) into main (4bd7b6e) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 18d81ef differs from pull request most recent head 185fd23. Consider uploading reports for the commit 185fd23 to get more accurate results

Impacted file tree graph

@@            Coverage Diff            @@
##              main       #28   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           53        55    +2     
  Lines          711       776   +65     
  Branches        89       104   +15     
=========================================
+ Hits           711       776   +65     
Impacted Files Coverage Δ
pandas_dev_flaker/_ast_helpers.py 100.00% <ø> (ø)
...ev_flaker/_plugins_tree/disallow_argument_types.py 100.00% <100.00%> (ø)
tests/disallow_argument_types_test.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4bd7b6e...185fd23. Read the comment docs.

Comment on lines 65 to 66
if node.left.id == any_array_like: # pragma: no cover
is_array_like = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can't this branch be covered?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on line 62, if the if condition is true, we're to break out of the while loop. but before breaking out we're also making checks for series and any-array-like annotations, these 2 inner conditions were affecting the test coverage.

the coverage warning: line 65 didn't jump to line 67 because the condition for line 65 was never false

i wrote tests that covered for when line 65 was false. i'm sincerely not really sure why it fails.
both conditions have to be checked because either of them could be the first annotation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, thanks for explaining - I'll take a look later then

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't a test case like

def dot(foo, other: Series | AnyArrayLike): pass

get here?

Copy link
Contributor Author

@Pydare Pydare Jan 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't satisfy the coverage warning.

@MarcoGorelli MarcoGorelli self-requested a review January 12, 2022 18:10
or (
sys.version_info >= (3, 8)
and isinstance(node.func.value, ast.Constant)
and isinstance(node.func.value.value, str)
)
)


def parse_annotation(node: ast.BinOp) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this function only used in pandas_dev_flaker/_plugins_tree/disallow_argument_types.py? If so, it can go in that file, this one is meant for shared functions

Comment on lines 65 to 66
if node.left.id == any_array_like: # pragma: no cover
is_array_like = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't a test case like

def dot(foo, other: Series | AnyArrayLike): pass

get here?

Comment on lines 59 to 60
else:
raise AssertionError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't a test case like

        pytest.param(
            "def p(foo, other: Series | Union[int, str] ): pass",
            id="Function argument with two annotations",
        ),

hit this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test worked. i was expecting the test i wrote on line 31 in disallow_argument_types.py to have covered such a case

@MarcoGorelli MarcoGorelli self-requested a review January 21, 2022 16:26
Comment on lines 44 to 45
else:
raise AssertionError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try running this on pandas, this'll be reached

$ flake8 ../pandas-marco/pandas --select PDF026 -j 1
Traceback (most recent call last):
  File "/home/marco/pandas-style-guide/venv/bin/flake8", line 8, in <module>
    sys.exit(main())
  File "/home/marco/pandas-style-guide/venv/lib/python3.8/site-packages/flake8/main/cli.py", line 22, in main
    app.run(argv)
  File "/home/marco/pandas-style-guide/venv/lib/python3.8/site-packages/flake8/main/application.py", line 363, in run
    self._run(argv)
  File "/home/marco/pandas-style-guide/venv/lib/python3.8/site-packages/flake8/main/application.py", line 351, in _run
    self.run_checks()
  File "/home/marco/pandas-style-guide/venv/lib/python3.8/site-packages/flake8/main/application.py", line 264, in run_checks
    self.file_checker_manager.run()
  File "/home/marco/pandas-style-guide/venv/lib/python3.8/site-packages/flake8/checker.py", line 323, in run
    self.run_serial()
  File "/home/marco/pandas-style-guide/venv/lib/python3.8/site-packages/flake8/checker.py", line 307, in run_serial
    checker.run_checks()
  File "/home/marco/pandas-style-guide/venv/lib/python3.8/site-packages/flake8/checker.py", line 589, in run_checks
    self.run_ast_checks()
  File "/home/marco/pandas-style-guide/venv/lib/python3.8/site-packages/flake8/checker.py", line 494, in run_ast_checks
    for (line_number, offset, text, _) in runner:
  File "/home/marco/pandas-style-guide/pandas_dev_flaker/__main__.py", line 22, in run
    for line, col, msg in callbacks_tree:
  File "/home/marco/pandas-style-guide/pandas_dev_flaker/_data_tree.py", line 74, in visit_tree
    yield from ast_func(state, node, parent)
  File "/home/marco/pandas-style-guide/pandas_dev_flaker/_plugins_tree/disallow_argument_types.py", line 17, in visit_arg
    if isinstance(arg.annotation, ast.BinOp) and parse_annotation(
  File "/home/marco/pandas-style-guide/pandas_dev_flaker/_plugins_tree/disallow_argument_types.py", line 45, in parse_annotation
    raise AssertionError
AssertionError

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the example that breaks this:

def maybe_iterable_to_list(obj: Iterable[T] | T) -> Collection[T] | T:

Copy link
Contributor Author

@Pydare Pydare Jan 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, wow, wow. i didn't think of this edge case. i'd try to fix it

Comment on lines 42 to 48
elif isinstance(node.left, ast.Name) or isinstance(
node.left,
ast.Subscript,
):
break
else:
raise AssertionError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is too specific, and can break with

    def __array__(self, dtype: npt.DTypeLike | None = None) -> np.ndarray:

Copy link
Contributor Author

@Pydare Pydare Jan 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could work? there are no coverage warnings also
if not isinstance(node.left, ast.BinOp): break else: node = node.left

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be fine

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done 💪 ! I think we've got the logic in parse_annotation down

Last thing (I promise 😄 ) to address is the return type, e.g.

def foo(arg) -> Series | AnyArrayLike

currently isn't caught


There's a couple of possible solutions - either change visit_arg to also check the return type, or by registering two functions, one for ast.Attribute and the other for ast.Name, see

https://github.com/asottile/pyupgrade/blob/fd230e8c64cfd990e1e9231522ed3e4c4e08cb58/pyupgrade/_plugins/typing_pep585.py#L30-L66

for an example

The second solution would be better, because it would also catch annotations outside of function definitions

is_series, is_array_like = False, False

while True:
if isinstance(node.right, ast.Name):
Copy link
Member

@MarcoGorelli MarcoGorelli Jan 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably also want a branch like:

if isinstance(node.right, ast.Str):
    if node.right.value == "Series"

to catch

arg: AnyArrayLike | "Series"

(and viceversa)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wrote some tests with the already existing logic for cases of "Series" and it handles it. i don't think we need extra logic to cover that

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, comment on naming

anyway, you've done a lot here, if you want me to finish it let me know

yield arg.lineno, arg.col_offset, MSG


def parse_annotation(node: ast.BinOp) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, this probably needs a better name, e.g. constains_series_and_arraylike

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think i should change the function name from visit_args to visit_FunctionDef? seems more appropriate and consistent

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah sounds good

@Pydare
Copy link
Contributor Author

Pydare commented Jan 25, 2022

Well done muscle ! I think we've got the logic in parse_annotation down

Last thing (I promise smile ) to address is the return type, e.g.

def foo(arg) -> Series | AnyArrayLike

currently isn't caught

There's a couple of possible solutions - either change visit_arg to also check the return type, or by registering two functions, one for ast.Attribute and the other for ast.Name, see

https://github.com/asottile/pyupgrade/blob/fd230e8c64cfd990e1e9231522ed3e4c4e08cb58/pyupgrade/_plugins/typing_pep585.py#L30-L66

for an example

The second solution would be better, because it would also catch annotations outside of function definitions

I'm kinda unsure why we should register ast.Atrribute and ast.Name to enable us catch return annotations or annotations outside of function definitions. The first option is clear to me.

From my understanding according to docs, Attribute majorly deals with class attributes or variable attributes, etc, and Name majorly deals with variables.

Just trying to understand how I can link them to catch the annotations outside functions.

The only edge case I can think about for now are variable definitions (ast.AnnAssign). eg
_hidden_attrs: frozenset[str] = NDFrame._hidden_attrs | frozenset([])
_mgr: BlockManager | ArrayManager

@Pydare
Copy link
Contributor Author

Pydare commented Jan 25, 2022

UPDATE: so i broke down the logic into 3 different functions in the same disallow_argument_types.py file. i'm kind of having some trouble testing the 2 new functions. should i split them into 2 different files?

Screenshot from 2022-01-25 12-02-38

@MarcoGorelli
Copy link
Member

I think a single file is fine, as until now I've been keeping it to 1 file for each check

@MarcoGorelli
Copy link
Member

From my understanding according to docs, Attribute majorly deals with class attributes or variable attributes, etc, and Name majorly deals with variables.

In general, my way of checking what to use is to make some test file, like t.py

var: int
def foo(bar: str) -> None: ...

then running

pip install astpretty
astpretty t.py

and checking from the output what the objects are

@MarcoGorelli
Copy link
Member

Nice! I'll take a closer a look at the weekend, but this might be it!

@MarcoGorelli MarcoGorelli self-requested a review February 1, 2022 12:18
@MarcoGorelli MarcoGorelli force-pushed the disallow_series_arraylike branch from 18d81ef to e4572dc Compare February 5, 2022 15:09
@MarcoGorelli MarcoGorelli force-pushed the disallow_series_arraylike branch from e4572dc to 185fd23 Compare February 5, 2022 15:16
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, I just fixed it up slightly as

: List[Series | AnyArrayLike]

wasn't covered

Any objections with what I've changed? If not, I think we can merge 🚀

@Pydare
Copy link
Contributor Author

Pydare commented Feb 6, 2022

looks good to me too. we can merge :)

@MarcoGorelli
Copy link
Member

Awesome, thanks (cc @jreback for visibility, as I think @Pydare effort has done great work here!)

@MarcoGorelli MarcoGorelli merged commit c2586bb into pandas-dev:main Feb 12, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants