-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[ty] Type inference for comprehensions #20962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Diagnostic diff on typing conformance testsChanges were detected when running ty on typing conformance tests--- old-output.txt 2025-11-02 13:20:06.790597551 +0000
+++ new-output.txt 2025-11-02 13:20:09.990614823 +0000
@@ -36,7 +36,7 @@
aliases_implicit.py:72:5: error[type-assertion-failure] Argument does not have asserted type `(...) -> None`
aliases_implicit.py:107:9: error[invalid-type-form] Variable of type `list[Unknown | <class 'int'> | <class 'str'>]` is not allowed in a type expression
aliases_implicit.py:108:9: error[invalid-type-form] Variable of type `tuple[tuple[<class 'int'>, <class 'str'>]]` is not allowed in a type expression
-aliases_implicit.py:109:9: error[invalid-type-form] Variable of type `list[@Todo(list comprehension element type)]` is not allowed in a type expression
+aliases_implicit.py:109:9: error[invalid-type-form] Variable of type `list[<class 'int'> | Unknown]` is not allowed in a type expression
aliases_implicit.py:110:9: error[invalid-type-form] Variable of type `dict[Unknown | str, Unknown | str]` is not allowed in a type expression
aliases_implicit.py:114:9: error[invalid-type-form] Variable of type `Literal[3]` is not allowed in a type expression
aliases_implicit.py:115:10: error[invalid-type-form] Variable of type `Literal[True]` is not allowed in a type expression |
|
| Lint rule | Added | Removed | Changed |
|---|---|---|---|
invalid-argument-type |
91 | 0 | 89 |
possibly-missing-attribute |
80 | 1 | 36 |
unsupported-operator |
10 | 0 | 39 |
invalid-return-type |
28 | 0 | 11 |
unused-ignore-comment |
0 | 35 | 0 |
invalid-assignment |
10 | 0 | 20 |
not-iterable |
3 | 0 | 11 |
index-out-of-bounds |
13 | 0 | 0 |
non-subscriptable |
3 | 0 | 6 |
unresolved-attribute |
6 | 0 | 2 |
no-matching-overload |
4 | 0 | 0 |
possibly-missing-implicit-call |
1 | 0 | 3 |
type-assertion-failure |
0 | 2 | 0 |
call-non-callable |
1 | 0 | 0 |
missing-argument |
1 | 0 | 0 |
| Total | 251 | 38 | 217 |
|
f6e103d to
18ee11c
Compare
18ee11c to
92cd27d
Compare
bde186e to
1fb076b
Compare
1fb076b to
c7f7e61
Compare
c7f7e61 to
df31e1c
Compare
|
|
||
| # TODO: no error here | ||
| # error: [invalid-assignment] | ||
| table_with_content: list[list[tuple[int, int, str | None]]] = [[(x, y, None) for x in range(3)] for y in range(3)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed with @ibraheemdev, solving this will require pushing down the type context into infer_scope_types in order to infer the None in (x, y, None) as str | None inside the comprehension. Similar for the TypedDict test below. This change is a bit more invasive, as it requires us to skip comprehension scopes in check_types, because we would not have the required type context available when simply looping over all scopes. This change would be fine because with this PR, we're now calling infer_scope_types for comprehension scopes when we check the outer scope. A similar change would probably be required for lambdas as well.
b1be8cb to
c61171a
Compare
AlexWaygood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
|
Looks like the remaining fuzzer panics are seed 20, which minimize to this: name_5(name_3)
[0 for unique_name_0 in unique_name_1 for unique_name_2 in name_3]
@{name_3 for unique_name_3 in unique_name_4}
class name_4[**name_3](0, name_2=name_5):
pass
try:
name_0 = name_4
except* 0:
pass
else:
match unique_name_12:
case 0:
from name_2 import name_3
case name_0():
@name_4
def name_3():
pass
(name_3 := 0)
@name_3
async def name_5():
passwith this stacktrace: and seed 486, which minimizes to this: for name_1 in {{{0: name_4 for unique_name_0 in unique_name_1}: 0 for unique_name_2 in unique_name_3 if name_4}: 0 for unique_name_4 in name_1 for name_4 in name_1}:
passwith this stacktrace: ISTM that neither should block this being merged, but we could possibly add expect-panicking mdtests so we get notified when they get fixed. |
In #20937, we reduced this from 20 to 5, so that it would become obvious if ty was now taking a pathological amount of time to type-check a given seed. 5 minutes is easily long enough if there are no new bugs, but it turns out to be too low if there _are_ new bugs, because on encountering a new bug the fuzzer must repeatedly run ty on a smaller and smaller snippet in order to try to minimize the reproducible example. That means that the fuzzer job is timing out on #20962, not because of pathological performance issues but because we're not giving the script enough time to minimize the bugs that it's found. Hopefully 10 should be a good compromise where it still becomes obvious if we introduce pathological performance issues on certain fuzzer seeds, but we give the fuzzer enough time to minimize examples when it finds new bugs. A better solution to find new pathological performance issues on fuzzer seeds might be to just compare the execution time between the baseline-executable run on a given seed and the test-executable run on the same seed. This is just a quick-and-easy fix for now.
|
Yes, thank you. I was also looking into these earlier today and came to the same conclusion. Will panics in the baseline |
The costly minimization only kicks in for the new panics that don't exist in the baseline executable -- this should be fine! |
|
Okay, in CI it does seem like this makes us a fair bit slower to execute on seed 742 (a seed which we're not panicking on). But I assume that this is another instance where our performance is much slower on a debug build than on a release build; the fuzzer script still completes in a very reasonable time locally when I use a release build. If it feels like the fuzzer CI job is now taking an annoying amount of time, we could experiment with using |
## Summary cf. #20962 In the following code, `foo` in the comprehension was not reported as unresolved: ```python # error: [unresolved-reference] "Name `foo` used when not defined" foo foo = [ # no error! # revealed: Divergent reveal_type(x) for _ in () for x in [foo] ] baz = [ # error: [unresolved-reference] "Name `baz` used when not defined" # revealed: Unknown reveal_type(x) for _ in () for x in [baz] ] ``` In fact, this is a more serious bug than it looks: for `foo`, [`explicit_global_symbol` is called](https://github.com/astral-sh/ruff/blob/6cc3393ccd9059439d9c1325e0e041db1d7481af/crates/ty_python_semantic/src/types/infer/builder.rs#L8052), causing a symbol that should actually be `Undefined` to be reported as being of type `Divergent`. This PR fixes this bug. As a result, the code in `mdtest/regression/pr_20962_comprehension_panics.md` no longer panics. ## Test Plan `corpus\cyclic_symbol_in_comprehension.py` is added. New tests are added in `mdtest/comprehensions/basic.md`. --------- Co-authored-by: Micha Reiser <micha@reiser.io> Co-authored-by: Carl Meyer <carl@astral.sh>
Summary
Adds type inference for list/dict/set comprehensions, including bidirectional inference:
Ecosystem impact
I did spot check the changes and most of them seem like known limitations or true positives. Without proper bidirectional inference, we saw a lot of false positives.
Test Plan
New Markdown tests