-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[ty] Argument type expansion for overload call evaluation #18382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This comment was marked as resolved.
This comment was marked as resolved.
| /// Represents the state of the expansion process. | ||
| /// | ||
| /// This is useful to avoid cloning the initial types vector if none of the types can be | ||
| /// expanded. | ||
| enum State<'a, 'db> { | ||
| Initial(&'a Vec<Type<'db>>), | ||
| Expanded(Vec<Vec<Type<'db>>>), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mainly an optimization to avoid cloning the self.types vector to initialize the std::iter::successors iterator. This would be useful in cases where no argument types can be expanded.
| Either::Right(std::iter::once(element)) | ||
| } | ||
| }) | ||
| .multi_cartesian_product() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can avoid using itertools here and have a custom implementation for this but it would also require allocation so I thought that we might as well utilize an existing implementation.
| if return_types.len() == expanded_argument_lists.len() { | ||
| // If the number of return types is equal to the number of expanded argument lists, | ||
| // they all evaluated successfully. So, we need to combine their return types by | ||
| // union to determine the final return type. | ||
| // | ||
| // TODO: What should be the state of the bindings at this point? | ||
| self.return_type = Some(UnionType::from_elements(db, return_types)); | ||
| return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to spend some time thinking about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will require some kind of "merge" logic to make sure the bindings state is correct at the end of a successful argument lists evaluation.
Consider the following abstract example which contains 4 overloads:
@overload
def f(...) -> A: ...
@overload
def f(...) -> B: ...
@overload
def f(...) -> C: ...
@overload
def f(...) -> D: ...After expanding the first argument, we get two argument lists which matches overload 1 and 3 respectively. This means that we now have the winning overload matches (1 and 3), so the return type would be A | C.
The logic in this PR is such that it restores the state of every binding after evaluating a single argument list, so at the end of evaluating the two argument lists (as stated above), the state would be as if we didn't perform type checking at all. I think this is incorrect.
In Binding, there are 6 fields which we need to consider that gets updated during type checking which is what the BindingSnapshot is recording:
struct BindingSnapshot<'db> {
return_ty: Type<'db>,
specialization: Option<Specialization<'db>>,
inherited_specialization: Option<Specialization<'db>>,
argument_parameters: Box<[Option<usize>]>,
parameter_tys: Box<[Option<Type<'db>>]>,
errors_position: usize,
}Here, it's the errors field that represents whether an overload was "matched" or not while other fields represent additional information that type checking revealed.
Coming back to the above abstract example:
- For overload 1, the first argument list evaluated successfully
- For overload 3, the second argument list evaluated successfully
- For overload 2 and 4, none of the argument lists matched
For (1) and (2), the errors field for the Binding corresponding to the matched overload should be empty and other fields should have the information from the successful evaluation.
For (3), I'm not exactly sure what would be the correct approach. My thinking is that:
- Combine all the
errorsfrom evaluating both argument lists - Choose the
return_tyfrom the last argument list - Combine (?) the
specializationandinherited_specialization - Choose the
argument_parametersfrom the last argument list - Choose the
parameter_tysfrom the last argument list
Quickly looking at what Pyright does, it seems that it re-evaluates it using the first argument list: https://github.com/microsoft/pyright/blob/321b6bf687c6c3ffa3eb627aeb8a143bc4740cde/packages/pyright-internal/src/analyzer/typeEvaluator.ts#L9340-L9370
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For (3), the updated merged state would be something like the following:
initial arglist 1 arglist 2 final
-------- --------- --------- -----
1 unmatched matched unmatched matched
2 unmatched unmatched matched matched
3 unmatched unmatched unmatched unmatched
The "initial" is the state of each overloads before performing argument type expansion i.e., all three overloads are unmatched.
After evaluating the first argument list, it resulted in overload 1 being matched. Similarly, after evaluating the second argument list, it resulted in overload 2 being matched.
So, the final state of the bindings would indicate the overload 1 and 2 are matched while overload 3 is unmatched. This means that information about parameter types and specializations for the matched overload should be from the argument list that resulted in the match.
And, the return type would be the union of the "matched" overloads. This is a successful evaluation because both argument list resulted in a single matched overload.
| let pre_evaluation_snapshot = snapshot(self); | ||
|
|
||
| // Step 2: Evaluate each remaining overload as a regular (non-overloaded) call to determine | ||
| // whether it is compatible with the supplied argument list. | ||
| for (_, overload) in self.matching_overloads_mut() { | ||
| overload.check_types(db, argument_types.as_ref(), argument_types.types()); | ||
| } | ||
|
|
||
| match self.matching_overload_index() { | ||
| MatchingOverloadIndex::None => { | ||
| // If all overloads result in errors, proceed to step 3. | ||
| } | ||
| MatchingOverloadIndex::Single(_) => { | ||
| // If only one overload evaluates without error, it is the winning match. | ||
| return; | ||
| } | ||
| MatchingOverloadIndex::Multiple(_) => { | ||
| // If two or more candidate overloads remain, proceed to step 4. | ||
| // TODO: Step 4 and Step 5 goes here... | ||
| // We're returning here because this shouldn't lead to argument type expansion. | ||
| return; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a possibility that this can be merged into a single loop that would also consider the expanded argument lists but for now I decided to just duplicate this part as it isn't too complex.
|
I think there are few more cases to test and I also need to think a bit more around the bindings state at the end of a successful evaluation of an expanded argument lists but I'm marking this as ready for review to get some initial feedback, specifically on the structure of the implementation and if there's something obvious that I've missed or mis-interpreted based on my reading of the spec. |
dcreager
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stepping out to pick up my son from school, here are some initial comments
| /// contains the same arguments, but with one or more of the argument types expanded. | ||
| /// | ||
| /// [argument type expansion]: https://typing.python.org/en/latest/spec/overload.html#argument-type-expansion | ||
| pub(crate) fn expand(&self, db: &'db dyn Db) -> impl Iterator<Item = Vec<Vec<Type<'db>>>> + '_ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this implementation!
carljm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just glanced quickly at the tests for semantics; looks like @dcreager already has it covered for detailed review of the code. This is fantastic, thank you!
a44db26 to
fc725c2
Compare
| // TODO: Evaluate it as a regular (non-overloaded) call. This means that any | ||
| // diagnostics reported in this check should be reported directly instead of | ||
| // reporting it as `no-matching-overload`. | ||
| self.overloads[index].check_types( | ||
| db, | ||
| argument_types.as_ref(), | ||
| argument_types.types(), | ||
| ); | ||
| return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO is important because we currently give no-matching-overload which is incorrect because we did match an overload, it's type checking that failed.
|
I've put this in draft because I need to make some small tweaks with respect to the final state of the bindings that I've realized after talking with Carl in our 1:1 now. I'll quickly make those changes and will ask for review. Apologies if anyone was already looking at this PR. |
5ac3b38 to
5d2614c
Compare
I'm heading into an interview but will try to take a look at this later today |
carljm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
dcreager
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one last nit otherwise this looks great!
* main: [ty] Add tests for empty list/tuple unpacking (astral-sh#18451) [ty] Argument type expansion for overload call evaluation (astral-sh#18382) [ty] Minor cleanup for `site-packages` discovery logic (astral-sh#18446) [ty] Add generic inference for dataclasses (astral-sh#18443) [ty] dataclasses: Allow using dataclasses.dataclass as a function. (astral-sh#18440) [ty] Create separate `FunctionLiteral` and `FunctionType` types (astral-sh#18360) [ty] Infer `list[T]` when unpacking non-tuple type (astral-sh#18438) [ty] Meta-type of type variables should be type[..] (astral-sh#18439) [`pyupgrade`] Make fix unsafe if it deletes comments (`UP050`) (astral-sh#18390) [`pyupgrade`] Make fix unsafe if it deletes comments (`UP004`) (astral-sh#18393) [ty] Support using legacy typing aliases for generic classes in type annotations (astral-sh#18404) Use ty's completions in playground (astral-sh#18425) Update editor setup docs about Neovim and Vim (astral-sh#18324) Update NPM Development dependencies (astral-sh#18423)
|
seems like this didn’t suffice to make the test case above work: https://play.ty.dev/093e1321-5e0f-4b1c-b46e-26745c902c1d
maybe another part of overload evaluation is responsible here. Should I file an issue? |
|
I'll need to look at your example a bit closely to see what you're trying to do but you're correct that this step of the algorithm wouldn't have solved it. Apologies for not noticing it earlier as this step is performing argument type expansion and in your example overload, the argument type is just |
|
The issue in my example is that ty knows that a class instance is iterable if it has a |
(I think you mean |
|
Here’s a (failing) mdtest case: ### Filling niches
`overloaded.pyi`:
```pyi
from typing import Literal, overload
from ty_extensions import Not, Intersection
class A: ...
class B: ...
@overload
def f(x: Literal[0]) -> A: ...
@overload
def f(x: Intersection[int, Not[Literal[0]]]) -> B: ...
```
```py
from overloaded import A, B, f
def _(x: int):
reveal_type(f(0)) # revealed: A
reveal_type(f(1)) # revealed: B
reveal_type(f(x)) # revealed: A | B
``` |
|
@flying-sheep You're right that a type checker could expand I think we have higher priority things we need to do at the moment, though. It seems fairly easy to work around this issue: is there any downside to just adding an explicit third overload |
|
If I read steps 3 of the “overload call evaluation” part of the spec correctly, that behavior is wrong:
So since Or am I wrong? PyRight complains when overlapping overloads like this exist |
That's not how overload evaluation is specified to work in Python. The first matching overload is used, not the union of the return types of all matching overloads. (I personally think it would be better to treat overloads as an un-ordered intersection of signatures, and union the return types of all matched overloads, but we actually tried this in ty and the results from the ecosystem made it clear that it would be too incompatible with existing usage. Which isn't too surprising, given that the existing ecosystem doesn't have negation types available, and you really need negation types to make the "un-ordered intersection of signatures" interpretation usable. The ordered behavior effectively gives you implicit negation by having a more specific overload shadow a more general one.) |
|
OK, is there some explicit mention of that in the spec? The part I’m quoting seems to contradict that (at least in the presence of expandable unions), while Step 6 agrees with you. So PyRight (and maybe mypy) are just buggy? PS: my original motivation is “simply” being able to type unpacking a heterogeneous collection: https://play.ty.dev/9817830f-c7c1-4a44-b4be-91704fe9500d adding that overload will of course make that fail |
|
You need to consider the whole algorithm as presented in the spec, for any given case. Note that union expansion doesn't happen at all, unless zero overloads match after step 2.
In the case of passing
I'm not sure which behavior you are referring to as buggy here.
I don't think so? It works fine for direct indexing: https://play.ty.dev/01df94d6-0b7e-4df9-9d85-f7d4dbc92224 It doesn't work for iteration/unpacking because we simply model a call to |
|
Thanks for walking me through the algorithm, I wasn’t sure if I missed something!
yeah, which is why I specified that what I care about here is the unpacking lol
See https://microsoft.github.io/pyright/#/configuration?id=type-check-rule-overrides
You’re saying that the spec says that overlapping overloads are an intentional part of the spec. PyRight flags them as error unless you ignore that. Should I report a bug? |
No; the second part "have incompatible return types" is critical there. The spec says that overlapping overloads are fine, as long as the later overload has a return type that is a super-type of the earlier overload. (This is what's necessary to make the overlapping overloads sound in the face of uncertain precision of the argument type.) That's the same logic that pyright's warning implements. |
|
OK, so we’re at square 1: Currently, the only way to have a chance at correctly typing the unpacking of a heterogeneous collection is to
right? |
|
I don't understand why (1) is required, or how the extra overload causes a problem. (It meets the requirement I mentioned that its return type is a super type of the prior overlapping overload return types.) My link above adds the extra |
|
Yeah, you’re right! That instead of “it’s an iterable over |
Summary
Part of astral-sh/ty#104, closes: astral-sh/ty#468
This PR implements the argument type expansion which is step 3 of the overload call evaluation algorithm.
Specifically, this step needs to be taken if type checking resolves to no matching overload and there are argument types that can be expanded.
Test Plan
Add new test cases.
Ecosystem analysis
This PR removes 174
no-matching-overloadfalse positives -- I looked at a lot of them and they all are false positives.One thing that I'm not able to understand is that in https://github.com/sphinx-doc/sphinx/blob/2b7e3adf27c158305acca9b5e4d0d93d3e4c6f09/sphinx/ext/autodoc/preserve_defaults.py#L179 the inferred type of
valueisstr | Noneby ty and Pyright, which is correct, but it's only ty that raisesinvalid-argument-typeerror while Pyright doesn't. The constructor method ofDefaultValuehas declared type ofstrwhich is invalid.There are few cases of false positives resulting due to the fact that ty doesn't implement narrowing on attribute expressions.