gh-133956 fix bug where ClassVar string annotation in @dataclass caused incorrect __init__ generation #134073
base: main
@@ -753,21 +753,33 @@ def _is_type(annotation, cls, a_module, a_type, is_type_predicate):
     # that's defined. It was judged not worth it.

     match = _MODULE_IDENTIFIER_RE.match(annotation)
-    if match:
-        ns = None
-        module_name = match.group(1)
-        if not module_name:
-            # No module name, assume the class's module did
-            # "from dataclasses import InitVar".
-            ns = sys.modules.get(cls.__module__).__dict__
-        else:
-            # Look up module_name in the class's module.
-            module = sys.modules.get(cls.__module__)
-            if module and module.__dict__.get(module_name) is a_module:
-                ns = sys.modules.get(a_type.__module__).__dict__
-        if ns and is_type_predicate(ns.get(match.group(2)), a_module):
-            return True
-    return False
+    if not match:
+        return False
+
+    ns = None
+    module_name = match.group(1)
+    type_name = match.group(2)
+
+    if not module_name:
+        # No module name, assume the class's module did
+        # "from dataclasses import InitVar".
+        ns = sys.modules.get(cls.__module__).__dict__
+    else:
+        # Look up module_name in the class's module.
+        cls_module = sys.modules.get(cls.__module__)
+        if not cls_module:
+            return False
Comment on lines +770 to +771

It is interesting that we're paranoid about the dataclass not having its module imported on only this branch and not the branch above. Not suggesting a change, but a comment could be useful if you know why we only care in the [...]

I was following the checks that were already present in the function. In the [...]

    module = sys.modules.get(cls.__module__)
    if module and module.__dict__.get(module_name) is a_module:
        ns = sys.modules.get(a_type.__module__).__dict__

I’d actually consider removing that check; it does seem unnecessary. At least, I can’t think of a case where [...]

Yes, I saw the original code. My suggestion would be to hoist

    module = sys.modules.get(cls.__module__)
    if module:  # not sure that we need to be this paranoid
        ns = module.__dict__
        if module_name:
            ...

+
+        a_type_module = cls_module.__dict__.get(module_name)
+        if (
+            isinstance(a_type_module, types.ModuleType)
+            # Handle cases when a_type is not defined in
+            # the referenced module, e.g. 'dataclasses.ClassVar[int]'
+            and a_type_module.__dict__.get(type_name) is a_type
+        ):
+            ns = sys.modules.get(a_type.__module__).__dict__
Comment on lines +774 to +780

I think this could be replaced with:

    return (
        isinstance(a_type_module, types.ModuleType)
        and is_type_predicate(a_type_module.__dict__.get(type_name), a_module))

It looks like we can indeed replace it. But if we go with this version, the [...]

Also, I was thinking: in the original version, this line: `sys.modules.get(a_type.__module__)` could probably be replaced with [...]

If you think the code can be simplified, please do so, then I'll review that version.

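To make the thread above easier to follow, here is a minimal standalone sketch that contrasts the pre-PR module check with the simplified return suggested in the review. It is not the code in `Lib/dataclasses.py`: `old_lookup`, `new_lookup`, `is_classvar_obj`, `types_proxy`, and `user_mod` are hypothetical names, the regex is a local copy assumed to mirror `_MODULE_IDENTIFIER_RE`, and the class's module is passed in directly instead of being fetched from `sys.modules`.

```python
import re
import types
import typing

# Local copy of the pattern; assumed to mirror dataclasses._MODULE_IDENTIFIER_RE.
MODULE_IDENTIFIER_RE = re.compile(r'^(?:\s*(\w+)\s*\.)?\s*(\w+)')


def is_classvar_obj(obj, typing_module):
    # Simplified predicate for the sketch: only the ClassVar object itself counts.
    return obj is typing_module.ClassVar


def old_lookup(annotation, cls_module, a_module, is_type_predicate):
    """Approximates the pre-PR check: the prefix must be bound to a_module itself."""
    match = MODULE_IDENTIFIER_RE.match(annotation)
    if not match:
        return False
    module_name, type_name = match.groups()
    if not module_name:
        ns = cls_module.__dict__
    elif cls_module.__dict__.get(module_name) is a_module:
        ns = a_module.__dict__
    else:
        return False
    return bool(is_type_predicate(ns.get(type_name), a_module))


def new_lookup(annotation, cls_module, a_module, is_type_predicate):
    """Suggested variant: resolve the name in whichever module the prefix names."""
    match = MODULE_IDENTIFIER_RE.match(annotation)
    if not match:
        return False
    module_name, type_name = match.groups()
    if not module_name:
        ns = cls_module.__dict__
    else:
        referenced = cls_module.__dict__.get(module_name)
        if not isinstance(referenced, types.ModuleType):
            return False
        ns = referenced.__dict__
    return bool(is_type_predicate(ns.get(type_name), a_module))


# Hypothetical proxy module that re-exports ClassVar, and a module that imports it as "tp".
types_proxy = types.ModuleType("types_proxy")
types_proxy.ClassVar = typing.ClassVar

user_mod = types.ModuleType("user_mod")
user_mod.tp = types_proxy
user_mod.typing = typing

old_result = old_lookup("tp.ClassVar[int]", user_mod, typing, is_classvar_obj)
new_result = new_lookup("tp.ClassVar[int]", user_mod, typing, is_classvar_obj)
print(old_result, new_result)  # False True
```

In this sketch the old check rejects `tp.ClassVar[int]` because `tp` is not the `typing` module object, which is the situation that leads to the field being treated as an ordinary `__init__` parameter. The version in the diff additionally requires `a_type_module.__dict__.get(type_name) is a_type`; the sketch leaves that comparison to the predicate, as the review comment proposes.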
+
+    return ns and is_type_predicate(ns.get(type_name), a_module)


 def _get_field(cls, a_name, a_type, default_kw_only):
@@ -0,0 +1,6 @@
+# We need this to test a case when a type
+# is imported via some other package,
+# like ClassVar from typing_extensions instead of typing.
+# https://github.com/python/cpython/issues/133956
+from typing import ClassVar
+from dataclasses import InitVar
@@ -0,0 +1,32 @@
+#from __future__ import annotations
+USING_STRINGS = False
+
+# dataclass_module_3.py and dataclass_module_3_str.py are identical
+# except only the latter uses string annotations.
+
+from dataclasses import dataclass
+import test.test_dataclasses._types_proxy as tp
+
+T_CV2 = tp.ClassVar[int]
+T_CV3 = tp.ClassVar
+
+T_IV2 = tp.InitVar[int]
+T_IV3 = tp.InitVar
+
+@dataclass
+class CV:
+    T_CV4 = tp.ClassVar
+    cv0: tp.ClassVar[int] = 20
+    cv1: tp.ClassVar = 30
+    cv2: T_CV2
+    cv3: T_CV3
+    not_cv4: T_CV4  # When using string annotations, this field is not recognized as a ClassVar.
+
+@dataclass
+class IV:
+    T_IV4 = tp.InitVar
+    iv0: tp.InitVar[int]
+    iv1: tp.InitVar
+    iv2: T_IV2
+    iv3: T_IV3
+    not_iv4: T_IV4  # When using string annotations, this field is not recognized as an InitVar.
@@ -0,0 +1,32 @@
+from __future__ import annotations
+USING_STRINGS = True
+
+# dataclass_module_3.py and dataclass_module_3_str.py are identical
+# except only the latter uses string annotations.
+
+from dataclasses import dataclass
+import test.test_dataclasses._types_proxy as tp
+
+T_CV2 = tp.ClassVar[int]
+T_CV3 = tp.ClassVar
+
+T_IV2 = tp.InitVar[int]
+T_IV3 = tp.InitVar
+
+@dataclass
+class CV:
+    T_CV4 = tp.ClassVar
+    cv0: tp.ClassVar[int] = 20
+    cv1: tp.ClassVar = 30
+    cv2: T_CV2
+    cv3: T_CV3
+    not_cv4: T_CV4  # When using string annotations, this field is not recognized as a ClassVar.
+
+@dataclass
+class IV:
+    T_IV4 = tp.InitVar
+    iv0: tp.InitVar[int]
+    iv1: tp.InitVar
+    iv2: T_IV2
+    iv3: T_IV3
+    not_iv4: T_IV4  # When using string annotations, this field is not recognized as an InitVar.
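The test modules above reach `ClassVar` and `InitVar` through a proxy module. As a rough illustration of the baseline they build on, the sketch below uses a hypothetical in-memory module named `tp` in place of `test.test_dataclasses._types_proxy` and checks only the non-string case, where the annotations are real objects. The string-annotation case depends on whether the interpreter includes this fix, so it is deliberately not asserted here.

```python
import dataclasses
import types
import typing

# Hypothetical in-memory stand-in for the proxy module used by the tests.
tp = types.ModuleType("tp")
tp.ClassVar = typing.ClassVar
tp.InitVar = dataclasses.InitVar


@dataclasses.dataclass
class CV:
    cv0: tp.ClassVar[int] = 20   # class variable, not a field
    cv1: tp.ClassVar = 30        # bare ClassVar is also recognized
    x: int = 0                   # ordinary field


@dataclasses.dataclass
class IV:
    iv0: tp.InitVar[int]         # init-only pseudo-field
    y: int = 0                   # ordinary field


cv_field_names = [f.name for f in dataclasses.fields(CV)]
iv_field_names = [f.name for f in dataclasses.fields(IV)]
print(cv_field_names, iv_field_names)  # ['x'] ['y']
```

With real annotation objects the proxy makes no difference, because `tp.ClassVar` is the same object as `typing.ClassVar`. The string-annotation module is where the name has to be resolved from text, which is the path `_is_type` handles.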
@@ -0,0 +1 @@
+Fix bug where ``ClassVar`` string annotation in :func:`@dataclass <dataclasses.dataclass>` caused incorrect __init__ generation

I'm just starting to look at this PR, and will have more to say later. For now: I'd prefer this to say something more specific, like [...]

There is an edge case that is not handled by this: `foo.typing_filtered.ClassVar`. The regex only accounts for one module level. While it is probably uncommon to import `foo.typing_filtered` without aliasing it, it is possible.

I don't want to change the world for a simple bug fix, but it seems like partitioning on `.` is more robust (and probably more performant) than using the current regex.

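The one-level limitation described in this comment can be seen by running the pattern on a few annotation strings. This is a small sketch: the regex is a local copy assumed to mirror `dataclasses._MODULE_IDENTIFIER_RE`, and `split_annotation` is a hypothetical helper, not part of the module.

```python
import re

# Local copy of the pattern; assumed to mirror dataclasses._MODULE_IDENTIFIER_RE.
MODULE_IDENTIFIER_RE = re.compile(r'^(?:\s*(\w+)\s*\.)?\s*(\w+)')


def split_annotation(annotation):
    """Return (module_name, type_name) as captured by the pattern, or None."""
    match = MODULE_IDENTIFIER_RE.match(annotation)
    return match.groups() if match else None


one_level = split_annotation("tp.ClassVar[int]")
two_levels = split_annotation("foo.typing_filtered.ClassVar")

print(one_level)   # ('tp', 'ClassVar')
print(two_levels)  # ('foo', 'typing_filtered')
```

With two module levels the second capture group holds the inner module name rather than `ClassVar`, so the subsequent lookup never sees the type name.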
While I agree that splitting on a dot would be faster, there are existing tests that cover cases like `dataclasses.InitVar.[int]` and `dataclasses.InitVar+`. If we want to support multiple module levels, we should extend the current regex pattern rather than replace it.

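The difference this reply points at can be sketched by comparing the regex with a naive split on the last dot. `str.rpartition` is only one possible reading of "partitioning on `.`", both helper names are hypothetical, and the regex is a local copy assumed to mirror `_MODULE_IDENTIFIER_RE`.

```python
import re

# Local copy of the pattern; assumed to mirror dataclasses._MODULE_IDENTIFIER_RE.
MODULE_IDENTIFIER_RE = re.compile(r'^(?:\s*(\w+)\s*\.)?\s*(\w+)')


def split_by_regex(annotation):
    match = MODULE_IDENTIFIER_RE.match(annotation)
    return match.groups() if match else None


def split_by_rpartition(annotation):
    # Naive alternative: everything before the last dot is the module path.
    head, _, tail = annotation.strip().rpartition(".")
    return (head or None, tail)


samples = [
    "dataclasses.InitVar",
    "dataclasses.InitVar[int]",
    "dataclasses.InitVar.[int]",
    "dataclasses.InitVar+",
]
comparison = {s: (split_by_regex(s), split_by_rpartition(s)) for s in samples}
for text, (by_regex, by_partition) in comparison.items():
    print(f"{text!r}: regex={by_regex} rpartition={by_partition}")
```

The regex yields `('dataclasses', 'InitVar')` for all four strings, while the naive split keeps the subscript or trailing characters in the type name and, for `dataclasses.InitVar.[int]`, moves `InitVar` into the module path. A partition-based approach would therefore need extra trimming to keep the behavior those tests currently pin down.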
I think it's worth creating a new issue to track more levels of module nesting. And I think it's a mistake to test for `dataclasses.InitVar.[int]` and other invalid strings.

This bug was actually caught by `__.typx.ClassVar` in the real world. I typically stuff all of my common imports, including `typing_extensions` as `typx`, into an internals subpackage (`__`) so that I can do `from . import __`. Reduces module namespace pollution and lets me avoid setting `__all__` to define module interfaces. But, I understand that my practice may not be common.

That said, any fix to multi-level traversal here or another PR is going to affect the code that is actually being fixed. Imo, it would make more sense to fix both together holistically (to save developer effort), if there is any interest in actually fixing the multi-level traversal.