-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Intersection PEP #49
base: main
Are you sure you want to change the base?
Intersection PEP #49
Conversation
I've now updated to resolve the issues above - any comments please take a look. If there are no further comments, next week I'll read it over again, then open an issue on the discuss forum for a draft PEP. The main thing I think this might be missing is potentially some of the rejected ideas (as there were many!), but I've tried to summarise any major ones. As far as I know, the implementation and specification are complete in that it seems to resolve all outstanding issues I'm aware of - if there's anything I've missed here please let me know. Quite excited to think we may actually have a fully fledged specification - getting there! |
This approach has the same flaw with @ mikeshardmind had a simple formulation that still works with Any, but based on his most recent messages here, it seems like he's not going to advocate for the simpler option remaining. As I understand it, he'd rather intersections not be added until the underlying issues are resolved.
That's the entire definition of what can and can't be inferred from an intersection expressed in 2 short lines describing subtyping and supertyping relationships. He didn't like the implications of trying to combine this with a bad definition of Any and tried to work around that. It being independent of the definition of Any and having no special casing might mean we can ignore Any having a bad definition and just get something that works for most people. If it doesn't work for someone, they can pick up improving the definition of Any from the case where it doesn't work. |
@DiscordLiz I added this section: https://github.com/CarliJoy/intersection_examples/blob/intersection-pep-new/docs/specification.rst#any-subclassing-rules to try and resolve this. Broadly my thought it that by relaxing the inheritance restriction on class A:
foo: str = "foo"
class C(Any, A):
pass
D: Intersection[Any, A] There nothing to say that objects of type |
We were always considering an intersection as different to a specific class declaration. Simply saying we do this doesn't solve the problems that were raised around this. I don't know how to explain this any better, you're still conflating parts of the type system that should not be conflated after multiple explanations throughout the discussions, and it seems to be having an impact on what you are writing ending up missing the problem. Maybe someone else can explain better. What type checkers need are the subtyping rules. I have seen no indication that the extra complexity you have here solves any problems. It does creates new ones, treating Any as a structural type rather than Any means you need to define it's behavior as a structural type without breaking any more fundamental rule and having a decidable interface and rules for how type checkers address that in an intersection. You have not provided something that accomplishes that. The focus on splitting inheritance and structural types has you missing the more fundamental rule that both must abide by. Both of these imply rules for subtyping relationships. It would be better to focus on a consistent subtyping relationship that does not try to treat either of these differently. You can refer to my last post for the minimal relationships to define an intersection. I think any proposal that does not use these needs to show definitively why it behaves better. The prior attempt did not actually behave better than those, and was abandoned. You also retained a section on ordering, but intersections don't have to be ordered at all, that was a synthetic requirement of the previous attempt that we showed definitively doesn't solve the problem, it's unclear why you've retained this as none of the new language resolves those issues either. |
@DiscordLiz Apologies, I am trying to understand this - what parts of the type system am I conflating?
Broadly, I'm trying to define a set of rules that would be easy to implement for type checkers. I think the sets of rules you define here:
...are not in conflict with the specification, but I think they don't cover all situations. We've discussed before non-LSP violating method clashes that can occur, and these rules don't really show how to resolve that.
This is true - I'm really trying to break down what it means to be a subtype into rules that will be easier to implement, and resolve unspecified cases at the same time. Also, I think the statement about class A:
def foo(self):
pass
def test(x: T) -> Intersection[A, T]:
class New(type(x), A):
pass
return New()
class B:
def bar(self):
pass
b: Any = B()
x = test(b) # Intersection[Any, A] So here we have created an Any intersection, as let's say B was an unknown type. Based on the ruleset: Subtype of AA subtype of A, must inherit from A and have attribute foo. Subtype of AnyA subtype of Any, must inherit from Any and have all attributes as Any Now here we have an issue - B does not inherit from Any, and neither does x. My point is that subtypes of Any do not have the inheritance restriction, so it seems it's already been defined as a structural type. If there is a PEP somewhere I've missed that lays out what it means to be a subtype I'd be interested to read it.
I've summarised a few of the issues the ordered nature resolves:
Could you point me to this? I know there was discussion about the issue with |
From a message explaining why a spec that just turns everything into Any in such types would be harmful to users: https://discuss.python.org/t/take-2-rules-for-subclassing-any/47981/5 It's also not even correct to do so. class A:
foo: int
B = Any
C: type[Intersection[B, A]] = ...
c: Intersection[B, A] = C()
c.foo # int The ordering was only ever a bad compromise, and the reason Any was shuffled to the end in the attempt at that was to specifically avoid this issue. foo here can't be Any, as any type that isn't |
Any isn't a structural type. It's a concept that says "some compatible type is present here". I cannot stress enough how important the different between a higher order concept to express an uknown is different from a structural type that expresses a structure. Casting to and from Any is a free operation that doesn't require an explicit cast or checking, the type system assumes compatibility as that's the purpose. You don't have to break this down into "Easier to implement", type checkers already need to have the facilities to handle subtyping relationships, and doing so is causing important things to be gotten wrong, not all simplifications preserve the truth. |
@DiscordLiz Thanks for the above - I'm working on something to express the same concepts in the current PEP in terms of subtyping, just a quick query on this bit: class A:
foo: int
B = Any
C: type[Intersection[B, A]] = ...
c: Intersection[B, A] = C()
c.foo # int
Does this have to be the case? I don't see why C: type[Intersection[B, A]] = ...
c: Intersection[B, A] = C()
c.foo # Any
D: type[Intersection[A, B]] = ...
d: Intersection[A, B] = D()
d.foo # int What's wrong with the above? |
Any allows more than int, this is an unsafe widening of the type and should not be done by the type system. If C is a subtype of A, then C.foo can't not be compatible with int. Any is special. It's compatible but the way it is compatible is a problem to do this kind of transformation with in a way that looses known type information, becase now from the type system perspective, you're allowed to assign anything to it, even things which break the rules for things that are a subtype of A |
I want to avoid retreading too much ground here, but with a simple modification: class A:
def __init__(self, x: int) -> None:
pass
B = Any
C: type[Intersection[B, A]] = ...
c: Intersection[B, A] = C(1)
This statement applies in the case I originally gave, but in the case above that doesn't necessarily apply (As
The ordering effectively shifts the responsibility to the user to make sure they don't create effects like this unless that's what they want to do. If the result should always have an attribute |
We established we can't ban Any from intersections because the user might not have control over where they come in (typevars) The user does not have that capability to avoid this problem. The prior ordered form avoided this by moving Any last, always, but that was flawed. You're doing something far worse with this, leaving a loaded gun pointed all all users of gradual typing. |
And even if the user could, why should the tools that are meant to get typing right intentionally get it wrong? That means users need to be more of an expert on type checking than type checker maintainers. |
So for typevars consider the following: There are two ways the below function could be implemented: class X:
foo: int
T = TypeVar("T")
def foo(x: T) -> Intersection[X, T]:
... In the above if the function is used with Any, then the result will always have the attribute foo. class X:
foo: int
T = TypeVar("T")
def foo(x: T) -> Intersection[T, X]:
... In the above if the function is used with Any, the result may not have the attribute foo, if T overwrites this. What I'm saying is that offering these options means the function designer chooses which way they want to go with it. They may not control the type of T, but they choose the order in the intersection.
I'm not so convinced that there is a correct order for these objects. And the PEPs how to teach section puts a large emphasis on the ordering being integral to the way this is taught, so really to use it at all a user would be familiar with this concept. |
Is there a particular use case you're imagining here that would negatively impact these users? I've thought about quite a few different ways this could be used, but so far haven't found anything drastic - I'd be interested to know if there's something I haven't considered. |
Returning an intersection to a user, as in mixins and class decorators, the only cases we have for intersection that can't already be solved with just protocols. |
I think this should be required reading for anyone working on this, as it is another language with gradual typing adding a full type system. https://elixir-lang.org/blog/2023/09/20/strong-arrows-gradual-typing/, running into the same problems we have with Any (with Elixir's dynamic), and finding solutions that preserve useful types for users. I also think I did a bit of a disservice by stepping away when I did without better detailing the problems of the ordered intersection, but I needed to take a few days for myself. It is possible to reduce typechecker complexity to remain linear rather than polynomial here without introducing an ordering or giving up on the subtyping-based outcome, it's just going to be more complicated to work on. That benefit was not unique to having an ordering, it was just more obviously available as an easy optimization that could be made if the ordering was sound. The ordering isn't sound. If you want to add intersections without user denotable type bounds, cutting to the chase:
With user-denotable type bounds, there's more powerful ways to express this that allow avoiding some sharp edges around Any. |
@mikeshardmind Thanks, I'll have a read
I can see that there might be a way to resolve an unordered intersection in linear time, but the simplicity of implementing an ordered intersection is still there. As you noted in your closing points though the
We're also back to this whole thing of evaluating when Callables are compatible, which may be not trivial. Secondly, I'm not sure how this intersection of callables would be implemented, we end up back at how to generate overloads when merging classes.
As far as I can tell there's a whole heap of issues that get solved with ordering, and the problems it creates seem pretty minimal. At least, I haven't seen any compelling examples against it. |
It's simple, and in trying to map this to overloads we (myself included) made it more complicated than it needs to be.
I don't mean to be rude here, but the entire reason why there was such a long discussion on Any & T, as well as why the attempt to come up with a clever way to hide the problem with reordering Any before, were the problems with this. There's no way to discuss this without leaving out prior nuanced arguments or feeling like we're rehashing ground that I imagine people are tired of rehashing as I certainly am. There are multiple reasons, and I'm not going to rehash all and the rationale of each or provide specific cases
The only group not negatively impacted by this would be the people who do not use typing at all, and I'm not even sure that's true, as they may end up with worse libraries they use, but don't use the typings of, especially if library authors have to do gymnastics to make this work how they expect it to. The only benefit to this appears to be making it easier to implement. While making something easier to implement is usually a laudable goal, care must be taken to ensure that it does not consume the usefulness of what is being implemented. I can understand wanting to have something to show for this and get it done, but getting this wrong could be disastrous with how many things this has the potential to interact with. Other than HKTs, this is probably the single most complex thing that is being seriously proposed for python's type system. It's not an easy task to make sure everything we have to consider is being given proper weight. With all of that said, I don't want to say any of this and leave you without a way forward either, so.... I think there are only 3 paths forward that remain viable right now, but if you find another we can certainly discuss it.
My opinion is that while any of those 3 would be fine,
|
@mikeshardmind Thanks for breaking this down, appreciate it. So I'll park the discussion about Any for now - although I don't understand why expanding to Any where more information is available is so critical, I can accept that this may be a requirement. The path forward I'm not sure I understand why it was rejected is the one you were proposing in the previous PEP edition (where Any is the lowest priority). I appreciate the Based on certain conditions, if
Also, I'm still not sure how this could be resolved in an unordered intersection, which is why I'm keen to resolve issues with the ordered one. Banning an intersection with clashing |
I determined that getting the special casing right on this would be harder and more fragile than it was worth. (specifically, it would be harder than getting the full theory-drive intersection correct while keeping it useful to a wide audience, and would have issues of fragility) Every level of special casing needed here makes it more and more complex, harder to reason about, and more sensitive to other changes/additions to the type system. The moment it was shown that simply re-ordering Any to the end was insufficient, any additional layered special casing was more complex than just doing it "right", the ordering was meant as a means to approximate the same behavior without needing as much to be specified to simplify things and it stopped being simpler as we discovered more issues, which I took as a sign (for myself) to take a break and come back to it later.
clashes of constructors ( |
@mikeshardmind I see, that makes sense, but I'm not sure I agree.
The way I see it, there are really only two special cases we've identified so far for an ordered intersection: If we look at the special cases for unordered intersection on the other hand, there are many: Special cases will arise in any specification like this, but the ordered version seems to have fewer and is easier to explain the logic for, as for any conflict there is a default way to resolve it. In the unordered version, everytime two classes "compete" for the same attribute or method, a resolution method has to be determined, so by it's nature it must be more complex to specify. |
I also think it's worth noting that waiting for a lot of pre-requisites to be resolved internally is quite unrealistic. Even if right now all of the pre-requisites appear on a TODO list, any one of these could be delayed, and it's unlikely all of them will come to fruition. This is a tool that I personally think could be very useful, and we're incredibly close to a specification that would be implementable. It seems a shame to me to park the discussion for a future point that will in all likelihood never happen. |
None of these require special-casing the specification with option 1 or 2 of the 3 options I listed for going forward
Only option 2 in the list of 3 options above requires temporarily shelving the discussion. Shelving it for that is to prevent other unfixable issues down the road that are similar to what we are facing now with decisions about the type system intentionally ignoring LSP because that was deemed easier. If intersections got accepted with ordering rather than with either purely subtyping (possible now) or with denotable lower and upper bounds and constraints (would require groundwork), I think it would create a situation where HKTs would be impossible to support properly, and that's a gigantic detriment. HKTs are significantly more important to people than intersections are, they are also just more complex and not as easy to tackle in the current landscape. If you think this is something that needs to be finished now, go with the subtyping approach then. |
Ok, let's suppose we go with option 1 with the subtyping rules. If something is a subtype of
This is interesting. HKTs would be cool - is there any possibility of this happening in reality though? If there is and an ordered intersection presents a blocker (although I don't see why) then that is something to consider. I guess my follow up is why does it present a blocker? My main reasoning is that if the only real blocker to an ordered intersection is that I also still don't really understand why from typing import Any, cast
class A:
def __getattr__(self, item: str) -> int:
return 1
B = Any
class C:
foo: str
x= cast(A & B & C, type("D",(B,A,C),{})())
x.foo # str I suppose
x.bar # What type goes here? |
It was my intent to after taking some more time for myself, to dig into this vs explicit denotable lower and upper type bounds, I still intend on that at some later point, but we painted ourselves into a corner we didn't need to with some of the original discussions, and looking elsewhere for answers and perspective on why certain attempts at simplification were misguided and where they fell short of making things simpler has helped
This, just like callable is simpler than we made because we were looking at it through the wrong lens at the time. The thing I linked about how Elixir is solving this is very illuminating as to how we framed the discussion made finding a solution harder than it should have been. We were working under the assumption that we must provide a single resulting type and that that type itself could not be an intersection, but there's no actual rule that that be the case. It's irreducible in most cases, and that includes the attributes. But that makes things Easier not harder if context is preserved properly. A.foo & B.foo can't get resolved to a single non-intersection type unless A.foo and B.foo are trivally overlapping (ie A.foo is exactly B.foo) or only one of A.foo and B.foo exist. When using x.foo where x is a subtype of A & B, x.foo's use must be consistent with both A.foo and B.foo This preserves both what those in favor of 4 wanted (a wider upper bound due to the presence of Any) and preserving the lower bound provided by any known specific type (those in favor of option 5, as well as the use case of indirect users via language servers). It's also significantly simpler to implement as it doesn't require a polynomial algorithm in the case of resolving certain types, and instead only a linear one at each usage.
Because the rules for We keep the existing rules for when
Continues to be brought up with varying interest, it's significantly more likely with the new type alias syntax and type variable defaults, as the applications are more accessible and natural to more developers, and it's been stated as a priority for a few people heavily involved in python's typing. I'd say that yes, there's a real possibility of this happening.
I really don't want to try and work through an example of this right now, there's still a lot of unknowns. I forsee an issue with each an intersection of higher kinded types, and with intersections within higher kinded types if the behavior is inconsistent with other subtyping. I can't definitely say that this would be unresolvable for HKTs, but I can definitively say that every special case that exists will increase complexity of HKTs and that any single one of those cases could potentially be the straw that was too much. There's too many unknowns with it right now, and it hasn't been explored enough to know. I'm taking the pragmatic stance that if we don't need new special cases, we should do it without them to reduce the number of things that don't "simply work" together. |
|
||
class A: ... | ||
class B: ... | ||
Specification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have to specify this in terms of subtyping and consistent subtyping:
T
is a subtype ofIntersection[A, B]
ifT
is a subtype of bothA
andB
;Intersection[A, B]
is a subtype ofT
if eitherA
orB
is a subtype of T.
I will omit the consistent subtyping rules here, as I think there is still no agreement as to how gradual types should be treated.
This approach has a benefit of being precise and also free from unnecessary details: we don't have to talk about attributes, methods, subclassing, structural types etc etc, as these are also part of the subtyping relations and are specified elsewhere.lasses/protocols etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose I would say the details are often the places where these need resolving - e.g.
The definition of Protocol may specify a subtype must have method x of some form.
If two Protocols both have method x of different forms, we arrive from this definition that a subtype of both must have method x in both forms - it's not clear how to interpret this (hence why conversation on this particular point led to the overload debate).
|
||
So one possibility to fulfill an intersection is for a class to be a child of all intersected classes. | ||
A "structural type" is considered to be one which cannot be mixed via inheritance to a regular class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definition does not follow from PEP 544. In fact, subclassing structural types in the PEP 544 sense (i.e. subclasses of typing.Protocol
) is valid.
Can you clarify why you went with "structural type" here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This came out of some of the discussions, I was looking for a way to clearly separate the classes that don't require inheritance, and this seems like quite a useful existing definition.
issubclass(C, A & B) # True | ||
An empty intersection is considered to be invalid, as it does not satisfy the first rule. | ||
However, it is possible for this to occur in ``TypeVarTuple`` expressions like | ||
``Intersection[*Ts]``. In these cases an empty intersection would resolve to ``typing.Never``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not say that an empty intersection is always typing.Never
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general rule, we would expect Union to behave in the same way - I believe under the current specification Union[]
is not valid so this seems like the correct dual. There is an interesting point here - I've just realised that unpacking TypeVarTuples is not supported for Union
, so perhaps it also should not be supported by Intersection
.
|
||
:: | ||
``Intersection`` does not forbid any incompatibility of type parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what this is trying to say. Can you clarify, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g. Intersection[int, str]
would still be valid, even if in practice it would impossible to create.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI this is not how it works in Mypy currently. The intersection with isinstance of something like
x = ''
if isinstance(x, int): ...
Will lead to a Subclass of "str" and "int" cannot exist: would have incompatible method signatures
. I don't care too much about this, I just wanted to let you know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagine mypy just mirrors the CPython implementation detail, where str
and int
cannot be subclassed at the same time?
It is important to note that once a type checker evaluated anything to ``Never`` within an | ||
intersection it can stop further evaluations an return ``Never``. | ||
This way a lot of edge cases by mixin types that can't be mixed are handled easily. | ||
Inheritance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we have to spell that out separately, as it follow from the subtyping rules.
However, it will be useful to clarify whether an intersection is allowed a second argument to isinstance()
and issubclass()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's a good point re isinstance
and issubclass
. I think based on some initial tests with Union, I would say these are valid:
class A:
pass
class B:
pass
x = A()
class Y(A):
pass
print(isinstance(x, A | B))
print(issubclass(Y, A | B))
Both of these are valid and return True, therefore it follows that these would also be valid and return True:
class A:
pass
class B:
pass
class Y(A, B):
pass
x = Y()
print(isinstance(x, Intersection[A, B]))
print(issubclass(Y, Intersection[A, B]))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably shouldn't have isinstance or issubclass behavior. General rule is that parameterized generic forms aren't valid 2nd arguments. Union support was special-cased and this was later considered a mistake.
Intersection types allow to succinctly combine multiple protocols (see PEP-544) into a single | ||
structural type. | ||
For example, instead of | ||
Protocols can now be combined in any way, to produce a new type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"in any way" here is potentially misleading, since later in the PEP you talk about order being significant.
Is it an error for the two protocols to have a non-zero API overlap? What should type checkers do in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah so I suppose I meant they can be combined in any way in the sense, any order is valid, any combination of overlapping protocols is valid - perhaps this is unclear.
Is it an error for the two protocols to have a non-zero API overlap? What should type checkers do in this case?
This is where ordering really comes into it's own - if two protocols have an API overlap, one of them takes priority and this method appears in the intersection. This is really what I meant about "in any way" - there are no invalid Protocol combinations.
[Clearly explain why the existing language specification is inadequate to address the problem that | ||
the PEP solves.] | ||
|
||
This allows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OOC why did you decide to leave mixing out? They are distinct from protocols in that they are supposed to be subclassed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just covered by other parts of the specification - the rules on inheritance specify this is required, but if an example would make things clearer we can definitely add one.
|
||
Protocol Reduction | ||
------------------ | ||
Introducing ordering has many benefits, including the fact that it simplifies and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the implications of ordering for subtyping relations? Is A&B
a subtype of B&A
? Is it a consistent subtype of B&A
? What if we have A&Any
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say A&B
is a subtype of B&A
if A
and B
differ (have no overlapping non dunder properties) or the overlapping properties are identical.
If they have different properties they aren't identical.
If I got the ordering correct A & Any
is kind of Any
because Any
overwrites everything.
But for Any & A
is all properties of A
but everything that is not a property of A
is Any
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If A
and B
are classes they are very likely to have a non-empty difference, including e.g. __init__
.
I missed 99% of prior discussion, so it is very likely this has been covered already, but I find the idea of ordered intersections very confusing. I can see how unordered intersections are problematic, but I personally don't think the extra complexity of ordered intersections is justified.
Also worth noting that I'm looking at this from the IDE angle. And there you ~have to treat A & Any
as an unknown subtype of Any
, which never produces any static type errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Point of clarity, it needs to be treated as an unknown subtype of each of A
and Any
, and some things within that can produce static type errors where being a subtype of A provides enough information. This came up in prior discussion, was proven to be so, and over on discourse, when discussing subtyping of Any, ignoring the other supertypes was loudly untenable for language servers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for chiming in Michael. Like I said, I'm sure this has been discussed before. Can you link me to the relevant Discourse thread to avoid repeating the argument here?
My gut feeling, still, is that intersections are best left unordered in the PEP, because the ordering-aware design is complex, unintuitive and does not match the behavior of intersection types in other mainstream languages (e.g. TS and Java/Kotlin).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on them needing to be unordered. The ordering attempts were (both my previous ordering attempt which I went back to the drawing board on, and Mark's here) attempts to simplify the specification with an approximation of behavior that should be equivalent, it doesn't work though The discourse thread starting here https://discuss.python.org/t/take-2-rules-for-subclassing-any/47981/5 Has Eric's objection to ignoring the other bases with multiple supertypes, this with multiple inheritance involving Any, rather than intersections involving Any, but the problems are the same here.
If you don't end up reading the whole thread, this section of my post has a way to handle this that works https://discuss.python.org/t/take-2-rules-for-subclassing-any/47981/7#further-detailing-lsp-compatible-any-and-type-bounds-2 for both intersections and multiple inheritance, but I was steered away from it at first to try and get a smaller change to work, we found that the smaller change did not work.
I don't think TS will be a good model here (not rehashing that right now either) and we want to look at Kotlin and Elixir. In particular, Elixir has an in-progress handling of this that solves this well even with dynamic()
by considering function domains. https://elixir-lang.org/blog/2023/09/20/strong-arrows-gradual-typing/
This gives us an answer for gradual callables/methods in an intersection:
(int) -> int & (Any) -> Any
Here,, we can compare function domains and arrive at....
((int) -> int & Any) & ((Any & ~int) -> Any)
I think the best way forward is just the subtyping rules paired with an assumption that since Any is for gradual typing and not yet-expressible concepts, Any can't contain something that would be a type error if expressed, and only represents a compatible unknown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the best way forward is just the subtyping rules paired with an assumption that since Any is for gradual typing and not yet-expressible concepts, Any can't contain something that would be a type error if expressed, and only represents a compatible unknown.
Yeah, I like this. Do we need to build consensus before refining the PEP in the way you described?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to build consensus before refining the PEP in the way you described?
Probably should try to, otherwise we have competing approaches being worked on simultaneously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess my main point of contention here is that the subtyping rules seem insufficient to describe all cases without a lot of further clarification. Or at least, whenever I try to apply them to some of the examples it leads to situations that don't make sense without further resolution.
My gut feeling, still, is that intersections are best left unordered in the PEP, because the ordering-aware design is complex, unintuitive and does not match the behavior of intersection types in other mainstream languages (e.g. TS and Java/Kotlin).
The "not matching other languages" part is a fair enough comment, but I actually think in the context of python ordered intersections are easier to understand than unordered ones. Perhaps I haven't explained it well, but really by introducing ordering any edge case that might be encountered immediately has an intuitive explanation - and there are analogies to explain it.
I think really the issue here is about expectation - it might be that this specification behaves in a way that the community finds unexpected, but I guess my hope is that by explaining it clearly this can be resolved. @superbobry Did the "how to teach" section make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subtyping rules seem to work fine for me, check latest comment in #17 for how this works for callables
A "structural type" is considered to be one which cannot be mixed via inheritance to a regular class. | ||
(See PEP 544 for details) | ||
Some of the structural types are: | ||
- ``typing.TypedDict`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you are referring to subclasses of typing.TypedDict
and typing.Protocol
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that's correct, I can rephrase to make this clearer
**⚠️ NOTE: This is document is work in progress, help wanted, see https://github.com/CarliJoy/intersection_examples/blob/main/README.md** | ||
|
||
|
||
This following PEP was written originally on the PyCON US by Kevin Millikin (@kmillikin) and Sergei |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor clarification: it wasn't really written on PyCon US :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But triggered?
For the sake of making the processes on how PEPS envolve I would like to keep the reference to the PyCON (but more accurate).
IMHO we should be transparent how community works and make it clear, especially for this PEP, that it went a looonnnngg way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. Kevin and me were working on a Python language server, and we needed intersections to faithfully model the types arising in control flow.
|
||
Type narrowing in control flow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this section is important, since type checkers are likely to refine type information based on control flow.
Can you clarify how ordering works with control flow. For example, given
def f(x: Any):
if isinstance(x, T):
x # is the type here Any&T or T&Any?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure - this is an interesting example!
(removed and modifying as I realised something)
Yeah this is really spicy. I'm going to discuss it in the context of concrete classes, as it's easier to reason about and the same rules apply:
class A:
pass
class B:
pass
def f(x: A):
if isinstance(x, B):
x # is the type here A&B or B&A?
Here's the weird part - being an instance of B doesn't necessarily mean all of B's methods are exposed. A could have overwritten some. I actually think the type here is as follows:
Intersection[A,B] | Intersection[B,A]
A nice feature of the ordered intersection is that it's expressive enough to also become an unordered intersection where needed. Alternatively this could be treated as is (in VS Code the type here is:
<subclass of A and B>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that the desire for ordered intersections arises from a desire to resolve all intersection types eagerly, and not have intersection types propagate. But I think this is not correct.
The type of (A & B).foo
does not need ordering to decide whether A
or B
"wins" -- the type of (A & B).foo
is simply A.foo & B.foo
, always. If A.foo
and B.foo
are related as subtypes (this includes "same type"), then this intersection will immediately simplify. If A.foo
and B.foo
do not have a subtype relationship, then we cannot simplify the intersection, and it propagates.
PR relating to #48. Still some work to be done on this, but feel free anyone to have a review if you'd like @CarliJoy @erictraut @NeilGirdhar @gvanrossum @DiscordLiz @vergenzt . It builds a lot on the work of @mikeshardmind, but I've tried to thin it down a bit and add some more examples. It also approaches the problem from a slightly different angle conceptually, but the result is somewhat similar.
Broadly the main update I'm planning is to finish the intersect simulator and link it as the reference implementation (Proof of Concept I suppose), then we can move to opening a discussion related to the PEP on python discuss.
I'll also add something discussing the equivalence of
Callable
to certain Protocols and maybe add an example.Finally, this issue is still outstanding: https://discuss.python.org/t/take-2-rules-for-subclassing-any/47981, but my hope is that this may not be critical to finalising the PEP. The behaviour as far as Intersection is concerned is now well defined under the PEP, and I'm thinking that if this proves to be a major issue perhaps it will arise during later discussion.