Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type inference for dict.keys membership #13372

Merged

Conversation

matthewhughes934
Copy link
Contributor

@matthewhughes934 matthewhughes934 commented Aug 9, 2022

  • Add type inference for dict.keys membership

    Also for containership checks on typing.KeysView. This is to bring the
    following cases into alignment:

      from typing import KeysView
    
      key: str | None
      d: dict[str, int]
      kv: KeysView[str]
    
      if key in d:
          # type of 'key' is inferred to be 'str'
          ...
    
      if key in d.keys():
          # type of 'key' should be inferred to be 'str'
          ...
    
      if key in kv:
          # type of 'key' should be inferred to be 'str'
          ...
    

    Before this change only the first if would narrow the type of key.

    I've just added a test under test-data/unit/pythoneval.test as
    test-data/unit/fixtures/dict.pyi doesn't include dict.keys, and
    adding it requires importing dict_keys from _collections_abc in
    those stubs, which then requires adding _collections_abc.pyi stubs,
    which would have to be complete since e.g.
    testGenericAliasCollectionsABCReveal expects most of the types in
    those stubs to be defined (this is the same approach as
    7678f28).

    GH: issue No type narrowing for in on dict.keys() #13360

Test Plan

Test added under test-data/unit/check-isinstance.test

@@ -30,6 +31,7 @@ class dict(Mapping[KT, VT]):
@overload
def get(self, k: KT, default: Union[KT, T]) -> Union[VT, T]: pass
def __len__(self) -> int: ...
def keys(self) -> dict_keys[KT, VT]: ...
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just the easiest way to get things working, happy to hear alternatives if there's some other approach I'm not aware of

@@ -2065,6 +2065,18 @@ def f() -> None:
[builtins fixtures/dict.pyi]
[out]

[case testNarrowTypeAfterInDictKeys]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added this here because there was a similar test just above for TypedDicts

def f() -> None:
d: Dict[str, str] = {}
key: Optional[str]
if key not in d.keys():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change this test a bit: we need to make sure that if key in d.keys() narrows the type correctly.

Placing reveal_type inside if is the easiest way to do that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing not in is appreciated as well!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change this test a bit: we need to make sure that if key in d.keys() narrows the type correctly.

Placing reveal_type inside if is the easiest way to do that.
Testing not in is appreciated as well!

Updated b5050c0

@@ -6285,6 +6285,7 @@ def builtin_item_type(tp: Type) -> Optional[Type]:
"builtins.dict",
"builtins.set",
"builtins.frozenset",
"_collections_abc.dict_keys",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about typing.KeysView? https://github.com/python/cpython/blob/70fc9641b56144854777aef29c145cd10789e3df/Lib/typing.py#L2701

And collections.abc.KeysView?
What about ValuesView and ItemsView?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And dict_values as well as dict_items 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about typing.KeysView?

Sounds good. I wasn't sure exactly what would be best to be included in this changeset. Let me drag all of those in too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@github-actions

This comment has been minimized.

1 similar comment
@github-actions

This comment has been minimized.

@@ -1,5 +1,6 @@
# Builtins stub used in dictionary-related test cases.

from _collections_abc import dict_keys
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line looks to be the cause of I think all of the test failures, e.g.

$ pytest -q mypy/test/testcheck.py::TypeCheckSuite::check-incremental.test::testIncrementalWithDifferentKindsOfNestedTypesWithinMethod 
F                                                                                                                                                       [100%]
========================================================================== FAILURES ===========================================================================
_________________________________________________ testIncrementalWithDifferentKindsOfNestedTypesWithinMethod __________________________________________________
data: /home/mjh/src/mypy/test-data/unit/check-incremental.test:5662:
/home/mjh/src/mypy/mypy/test/testcheck.py:80: in run_case
    self.run_case_once(testcase, ops, step)
/home/mjh/src/mypy/mypy/test/testcheck.py:174: in run_case_once
    assert_string_arrays_equal(output, a, msg.format(testcase.file, testcase.line))
E   AssertionError: Unexpected type checker output in incremental, run 2 (/home/mjh/src/mypy/test-data/unit/check-incremental.test, line 5662)
-------------------------------------------------------------------- Captured stderr call ---------------------------------------------------------------------
Expected:
  tmp/a.py:2: error: "object" has no attribute "xyz" (diff)
Actual:
  tmp/a.py:2: error: Module has no attribute "xyz" (diff)

Alignment of first line difference:
  E: tmp/a.py:2: error: "object" has no attribute "xyz"
  A: tmp/a.py:2: error: Module has no attribute "xyz"
                        ^
=================================================================== short test summary info ===================================================================
FAILED mypy/test/testcheck.py::TypeCheckSuite::check-incremental.test::testIncrementalWithDifferentKindsOfNestedTypesWithinMethod
1 failed in 0.33s

I'm going to have a look around, but would be happy to hear any suggestions 🙃

Copy link
Member

@JelleZijlstra JelleZijlstra Aug 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably the stubs used for these tests don't include stubs for _collections_abc. As I recall there are stubs for various important standard library modules somewhere in the fixtures directory.

This may be difficult to get right as we have had some instability around the type of dict.keys() before; what you're using here (_collections_abc.dict_keys) is arguably a private implementation detail.

Copy link
Contributor Author

@matthewhughes934 matthewhughes934 Aug 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be difficult to get right as we have had some instability around the type of dict.keys() before; what you're using here (_collections_abc.dict_keys) is arguably a private implementation detail.

👍 I'm also not 100% on the approach, so happy to hear any alternatives you might have. So far I've just been following typeshed https://github.com/python/typeshed/blob/a92da58328aa259184ff45bb7408e82426fb563e/stdlib/builtins.pyi#L4, https://github.com/python/typeshed/blob/a92da58328aa259184ff45bb7408e82426fb563e/stdlib/builtins.pyi#L1039

@matthewhughes934
Copy link
Contributor Author

matthewhughes934 commented Aug 13, 2022

With 5fd5a93 I've moved the tests under test-data/unit/pythoneval.test just to verify they would work under a full set of stubs.

I've (very roughly) documented some of the issues I ran into while trying to add stubs for these into the test libraries, but I think I'd need to add some complete (at least sufficient to satisfy the tests in mypy/test/testcheck.py::TypeCheckSuite::check-generic-alias.test::testGenericAliasCollectionsABCReveal) stubs for _collections_abc in order to have everything under the test stubs.

I assume having test stubs for dict.keys would be the preference? Happy to give it a shot but I might need to be pointed in the right direction 🙂

EDIT: aaand I forgot to commit some changes, see 68b2881

@github-actions

This comment has been minimized.

@matthewhughes934 matthewhughes934 force-pushed the add-type-inference-from-dict-keys branch from 5fd5a93 to 68b2881 Compare August 13, 2022 21:22
@github-actions

This comment has been minimized.

@matthewhughes934 matthewhughes934 force-pushed the add-type-inference-from-dict-keys branch from 68b2881 to 01efd35 Compare August 16, 2022 17:44
@matthewhughes934
Copy link
Contributor Author

matthewhughes934 commented Aug 16, 2022

01efd35 contains the work for _collections_abc.dict_keys and typing.KeysView.

For dict.items I think I'll need to look a bit deeper, since collections.ItemsView inherits AbstractSet[tuple[_KT_co, _VT_co]] but I don't think mypy knows how to handle narrowing when tuple unpacking? Running mypy over the following on `master:

from typing import Dict, Optional, Set, Tuple

d: Dict[str, int]
s: Set[Tuple[str, int]]

v: Optional[int]
k: Optional[str]
p: Optional[Tuple[str, int]]

if (k, v) in d.items():
    reveal_type(k)
    reveal_type(v)

if (k, v) in s:
    reveal_type(k)
    reveal_type(v)

if p in s:
    reveal_type(p)

gives:

script.py:11: note: Revealed type is "Union[builtins.str, None]"
script.py:12: note: Revealed type is "Union[builtins.int, None]"
script.py:15: note: Revealed type is "Union[builtins.str, None]"
script.py:16: note: Revealed type is "Union[builtins.int, None]"
script.py:19: note: Revealed type is "Tuple[builtins.str, builtins.int]"

Where I'd expect none of those to be optional.

@github-actions

This comment has been minimized.

Also for containership checks on `typing.KeysView.` This is to bring the
following cases into alignment:

    from typing import KeysView

    key: str | None
    d: dict[str, int]
    kv: KeysView[str]

    if key in d:
        # type of 'key' is inferred to be 'str'
        ...

    if key in d.keys():
        # type of 'key' should be inferred to be 'str'
        ...

    if key in kv:
        # type of 'key' should be inferred to be 'str'
        ...

Before this change only the first `if` would narrow the type of `key`.

I've just added a test under `test-data/unit/pythoneval.test` as
`test-data/unit/fixtures/dict.pyi` doesn't include `dict.keys`, and
adding it requires importing `dict_keys` from `_collections_abc` in
those stubs, which then requires adding `_collections_abc.pyi` stubs,
which would have to be complete since e.g.
`testGenericAliasCollectionsABCReveal` expects most of the types in
those stubs to be defined (this is the same approach as
7678f28).

GH: issue python#13360
@matthewhughes934 matthewhughes934 force-pushed the add-type-inference-from-dict-keys branch from 01efd35 to 75ca813 Compare August 23, 2022 16:05
@matthewhughes934
Copy link
Contributor Author

@sobolevn any objections to keeping this changeset just for dict.keys? I'm thinking it might be worth having a separate discussion for .items and .values (see my comment above)

@JelleZijlstra
Copy link
Member

I'd be OK with supporting only .keys(); narrowing on .items() and .values() doesn't make nearly as much sense to me. I don't see any mention of those in the issue either.

@sobolevn
Copy link
Member

+1 to @JelleZijlstra 🙂

@github-actions
Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

spark (https://github.com/apache/spark)
+ python/pyspark/sql/types.py:401: error: Redundant cast to "int"  [redundant-cast]
+ python/pyspark/sql/types.py:402: error: Redundant cast to "int"  [redundant-cast]

@matthewhughes934 matthewhughes934 requested review from JelleZijlstra and sobolevn and removed request for JelleZijlstra and sobolevn August 24, 2022 19:46
Copy link
Collaborator

@hauntsaninja hauntsaninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@hauntsaninja hauntsaninja merged commit 507fc5c into python:master Oct 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants