-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Support type inference for defaultdict() #8167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This allows inferring type of `x`, for example: ``` from collections import defaultdict x = defaultdict(list) # defaultdict[str, List[int]] x['foo'].append(1) ``` The implemention is not pretty and we have probably reached about the maximum reasonable level of special casing in type inference now. There is a hack to work around the problem with leaking type variable types in nested generics calls (I think). This will break some (likely very rare) use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This is the last big missing piece for type inference of empty collections. I have some comments, also why do you need the typeshed change?
mypy/checker.py
Outdated
arg1 = get_proper_type(init_type.args[1]) | ||
if (isinstance(arg0, (NoneType, UninhabitedType)) and | ||
isinstance(arg1, Instance) and | ||
self.is_valid_ordereddict_partial_value_type(arg1)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name is really weird:
- Why
ordereddict
and notdefaultdict
? - I don't think it should be called
valid
, maybeempty
?
For example is_empty_defaultdict_value_type()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I'll rename it.
arg0 = get_proper_type(init_type.args[0]) | ||
arg1 = get_proper_type(init_type.args[1]) | ||
if (isinstance(arg0, (NoneType, UninhabitedType)) and | ||
isinstance(arg1, Instance) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be cleaner to move this check to the below helper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. It's here since we rely on the narrowed down type below, so we'd need a type check anyway.
mypy/checker.py
Outdated
@@ -2813,14 +2813,24 @@ def infer_partial_type(self, name: Var, lvalue: Lvalue, init_type: Type) -> bool | |||
partial_type = PartialType(None, name) | |||
elif isinstance(init_type, Instance): | |||
fullname = init_type.type.fullname | |||
if (isinstance(lvalue, (NameExpr, MemberExpr)) and | |||
is_ref = isinstance(lvalue, (NameExpr, MemberExpr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you are at it maybe change this to RefExpr
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
mypy/checker.py
Outdated
@@ -2829,6 +2839,19 @@ def infer_partial_type(self, name: Var, lvalue: Lvalue, init_type: Type) -> bool | |||
self.partial_types[-1].map[name] = lvalue | |||
return True | |||
|
|||
def is_valid_ordereddict_partial_value_type(self, t: Instance) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a short docstring with an example (also the name needs fixing, see above).
mypy/checker.py
Outdated
# they leak in cases like defaultdict(list) due to a bug. | ||
# This can result in incorrect types being inferred, but only | ||
# in rare cases. | ||
if isinstance(arg, (TypeVarType, UninhabitedType)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also include NoneType
? Otherwise it may not work if strict_optional=False
(see the check for key type). Add a test for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a test.
mypy/types.py
Outdated
# the type argument is not final and will be replaced later (it's TypeVarType | ||
# or UninhabitedType). | ||
# | ||
# TODO: The type argument can only be TypeVarType due to leaking type variables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just store an erased instance here? Then we can remove this TODO and only keep the one in checker.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good point -- I'll try this out.
@@ -26,6 +26,9 @@ class ellipsis: pass | |||
# Primitive types are special in generated code. | |||
|
|||
class int: | |||
@overload | |||
def __init__(self) -> None: pass | |||
@overload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the change to mypyc needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A test case relies on the int()
constructor since it uses defaultdict(int)
.
from collections import defaultdict | ||
x = defaultdict(int) | ||
x[''] = 1 | ||
reveal_type(x) # N: Revealed type is 'collections.defaultdict[builtins.str, builtins.int]' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a similar test case where value type is generic, e.g.:
x = defaultdict(list) # No error
x['a'] = [1, 2, 3]
and
x = defaultdict(list) # Error here
x['a'] = []
The typeshed change is accidental, I'll revert it. |
This allows inferring type of
x
, for example:The implemention is not pretty and we have probably
reached about the maximum reasonable level of special
casing in type inference now.
There is a hack to work around the problem with leaking
type variable types in nested generics calls (I think).
This will break some (likely very rare) use cases.