Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/#13070 defer annotations when future is active #13395

Conversation

Slyces
Copy link
Contributor

@Slyces Slyces commented Sep 18, 2024

Summary

This PR tries to provide support for deferred resolution of type-hints in files with from __future__ import annotations. Fixes #13070.

Implementation

Currently, the method is_stub is already used in multiple places to resolve deferred annotations. As all current usage of is_stubs are dedicated to deferred type inference, I decided in this implementation to refactor is_stubs && has_future_annotations together in are_all_types_deferred.
We can also keep them separate and always test them together.

The main difficulty is how to know if that flag is active. I tried an implementation that checks this when building the semantic index. I do not know if it's the best choice (as SemanticIndex doesn't have any attribute similar to this), but the building of the semantic index already parses every statement in the file, which was ideal to find this flag.

Happy to get any feedback on a better spot for this if you can think of one.

Test Plan

There is currently one test addressing deferred annotations resolution, focused on builtins. I don't quite understand why builtins require deferred annotations, so I focused on test cases I encounter when coding in python.

In my experience, the two most common occurrences requiring deferred annotations (in regular code) are:

  • Referencing a symbol before it's defined
  • Referencing a class inside one of its own methods

Some testing led me to find that we don't currently support type resolution for methods, so I only tested the case of referencing a symbol before its definition.

I added 3 separate test case, all with the same base code:

def get_foo() -> Foo: ...
class Foo: ...
foo = get_foo()  # Resolved if and only if deferred annotations are active
  • Inside a source file (*.py, foo must not resolve (Unknown)
  • Inside a stub file (*.pyi), foo must resolve to Foo
  • Inside a source file with future.annotations, foo must resolve to Foo

@AlexWaygood AlexWaygood added the red-knot Multi-file analysis & type inference label Sep 18, 2024
Copy link

codspeed-hq bot commented Sep 18, 2024

CodSpeed Performance Report

Merging #13395 will degrade performances by 4.26%

Comparing Slyces:fix/#13070-defer-annotations-when-future-is-active (301d065) with main (d3530ab)

Summary

❌ 1 regressions
✅ 31 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark main Slyces:fix/#13070-defer-annotations-when-future-is-active Change
red_knot_check_file[incremental] 2.9 ms 3 ms -4.26%

Copy link
Contributor

github-actions bot commented Sep 18, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@Slyces
Copy link
Contributor Author

Slyces commented Sep 18, 2024

I just realised that the performance issue can probably be fixed by relying on the fact that from __future__ import annotations should always be the first import in a file - I'll try to get that to work.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fantastic, thank you!!

The semantic index builder is the right place to look for this, and the semantic index is the right place to put it. Any analysis that we can do without requiring a dependency on other files, we want to do in semantic indexing.

I suspect the performance regression on the incremental check is because tomllib (which we are using for our benchmark) does use from __future__ import annotations, and deferred types require an extra Salsa query and thus increase Salsa overhead on incremental check.

So I suspect that requiring from __future__ import annotations to be the first statement in a file won't help with the performance.

And I'm not sure if we even want to require it. A mis-placed from __future__ import annotations won't work at runtime (the module won't even import), but for our purposes I think we should assume the user intended for it to take effect, and we should still check the code assuming its use. Otherwise accidentally adding a line before your from __future__ import annotations could suddenly result in a ton of bogus diagnostics on your forward-reference annotations, which isn't really useful.

I think we should add a comment to this effect, where we check for from __future__ import annotations in the semantic index builder.

And I think we should emit a diagnostic in type inference if we run across a misplaced __future__ import. But this is pretty much a separate thing and could be its own PR. (EDIT: I actually think probably it shouldn't be done in type inference; misplaced __future__ imports are a syntax-level error that isn't really part of type-checking. So we'll want this diagnostic, but probably not in type inference, and definitely not in this PR.)

crates/red_knot_python_semantic/src/types/infer.rs Outdated Show resolved Hide resolved
@Slyces
Copy link
Contributor Author

Slyces commented Sep 18, 2024

So I suspect that requiring from future import annotations to be the first statement in a file won't help with the performance.

That's somewhat of a relief (if it's true) because the implementation I have so far is not ideal. I will still try to go through with it to make sure that it is not the source of the regression

@carljm
Copy link
Contributor

carljm commented Sep 18, 2024

Some testing led me to find that we don't currently support type resolution for methods

This is true, though the spot in the code you linked is for supporting attribute access of attributes on function objects. To support method calls, I think what we are missing is understanding attribute access on instances of classes (as opposed to class objects):

// TODO MRO? get_own_instance_member, get_instance_member

@carljm
Copy link
Contributor

carljm commented Sep 18, 2024

I will still try to go through with it to make sure that it is not the source of the regression

If this is a lot of work, an easier way to check the performance impact would be to see how performance on the benchmark fares if you just consider from __future__ import annotations to always be true, without even looking for it in semantic index.

@Slyces
Copy link
Contributor Author

Slyces commented Sep 18, 2024

While this is not related to the PR, I did waste some time on the difference between "expected Unknown" (e.g. resolution failed) vs. "Not implemented Unknown" (e.g. this case should yield a value for the current AST but we didn't code it yet).

Would it be worth considering a temporary value (until red-knot is feature complete or almost so) that would make this difference obvious? To be fair I'll also be more careful next time now that I know 🙂

@carljm
Copy link
Contributor

carljm commented Sep 18, 2024

Would it be worth considering a temporary value (until red-knot is feature complete or almost so) that would make this difference obvious?

That's an interesting idea! I think we could have Type::Unknown contain another enum indicating the source of the unknown (mypy does something similar, @AlexWaygood has mentioned this before). Then most code handling Unknowns wouldn't have to care (we wouldn't have to add a new branch in every place that handles types), but we could display them differently.

This seems like a net positive to me! Not sure I would get to it soon, but if you're interested in doing it, I'd certainly consider a PR.

@AlexWaygood
Copy link
Member

AlexWaygood commented Sep 18, 2024

That's an interesting idea! I think we could have Type::Unknown contain another enum indicating the source of the unknown (mypy does something similar, @AlexWaygood has mentioned this before). Then most code handling Unknowns wouldn't have to care (we wouldn't have to add a new branch in every place that handles types), but we could display them differently.

Yes, see #12986 for previous discussion! I think the main concern with that PR was how to handle the inner enum in unions and intersections.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. Thanks you

@MichaReiser
Copy link
Member

It's a bit a problem that I can't acknowledge the codspeed regression because the URl doesn't work haha

Okay, there's a way. Navigate to the runs page and open the results from there https://codspeed.io/astral-sh/ruff/runs/66ebdbcf6ba711013f9a14a5

@MichaReiser MichaReiser merged commit a8d9104 into astral-sh:main Sep 19, 2024
19 of 20 checks passed
@Slyces Slyces deleted the fix/#13070-defer-annotations-when-future-is-active branch September 19, 2024 08:13
@Slyces Slyces mentioned this pull request Sep 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[red-knot] defer annotation resolution when __future__.annotations is active
4 participants