Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates for PEP 544: Protocols #243

Merged
merged 6 commits into from
Apr 22, 2017

Conversation

ilevkivskyi
Copy link
Member

This implements two sets of comments by Guido, comments from python-dev, and adds links to working implementation.

Attn: @JukkaL @ambv @gvanrossum

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly just a bunch of grammar nits.

pep-0544.txt Outdated
@@ -263,7 +263,9 @@ an *explicit* subclass of the protocol. If a class is a structural subtype
of a protocol, it is said to implement the protocol and to be compatible
with a protocol. If a class is compatible with a protocol but the protocol
is not included in the MRO, the class is an *implicit* subtype
of the protocol.
of the protocol. (Note that one could explicitly subclass a protocol and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could -> can

pep-0544.txt Outdated
Static methods, class methods, and properties are equally allowed
in protocols.

To define a protocol variable, one must use PEP 526 variable
To define a protocol variable, one could use PEP 526 variable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could -> can

pep-0544.txt Outdated
Alternatively, one can implement ``SizedAndCloseable`` like this, assuming
the existence of ``SupportsClose`` from the example in `definition`_ section::
Alternatively, one can implement ``SizedAndClosable`` protocol by merging
the ``SupportsClose`` protocol from the example in `definition`_ section
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... in the definition section ...

pep-0544.txt Outdated
Note that ``Protocol[T, S, ...]`` is allowed as a shorthand for
``Protocol, Generic[T, S, ...]``.
``Protocol[T, S, ...]`` is allowed as a shorthand for
``Protocol, Generic[T, S, ...]``. Declaring variance is not necessary for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Start new paragraph at "Declaring variance ..."

pep-0544.txt Outdated

x: Box[float]
y: Box[int]
x = y # This is OK due to inferred covariance of the protocol 'Box'.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...due to the inferred covariance ...

I'm also not sure how covariance is inferred here. Could you give examples of inferred covariance, contravariance, invariance?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the content attribute is mutable, Box should be invariant? And for consistency with normal classes, I'd prefer variance to be explicit. Especially if a generic protocol inherits from other protocols, it can be unclear whether a particular type variable should be covariant or invariant, so making this explicit would make code easier to understand. Users are often confused by variance, so it's likely that they often can't predict what the inferred variance would be. And what if I'd like to explicitly make something invariant for some reason, even if inference thinks it can be covariant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JukkaL
Yes, Box should be invariant, completely forgot about "mutability" in my implementation. Fortunately, this is easy to fix, I will just check subclassing both ways for members with IS_SETTABLE flag (basically treating a settable member as a pair of getter/setter methods).

And for consistency with normal classes, I'd prefer variance to be explicit.

OK, I agree, I will just remove the text mentioning that variance could be omitted.

And what if I'd like to explicitly make something invariant for some reason, even if inference thinks it can be covariant?

Good question! I am not sure what to do here. Currently, I do two-step checking:

  • First check nominal subtyping using user-declared variance, if it succeeds, then OK, otherwise go to next step.
  • Try structural subtyping. Here I check compatibility of every member (types and flags like IS_SETTABLE, IS_CLASSVAR, etc) I think this is quite robust, so that the inferred variance should be reliable (unless I am missing something).

In principle, I probably could always use the user-declared variance, but this will have two downsides:

  • 90% of users use invariant variables, so that protocols will be interpret as invariant. For nominal classes, this makes perfect sense, but for protocols, where we anyway do a lot of work checking all members, we could reliably infer variance, and making them all invariant will be quite sad.
  • Less important, but this will also make implementation more complex. Currently, when I check is_subtype(C, Proto[int]), I just substitute T with int in Proto and check all members. With user declared variance I will first need something like map_instance_to_supertype and then use variance.

@@ -576,7 +610,7 @@ Example::
cached_func((1, 2, 3)) # OK, tuple is both hashable and sequence

If this will prove to be a widely used scenario, then a special
intersection type construct may be added in future as specified by PEP 483,
intersection type construct could be added in future as specified by PEP 483,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't help congratulation on this correct use of "could". :-)

pep-0544.txt Outdated
In Python 2.7 the function type comments should be used as per PEP 484.
The ``typing`` module changes proposed in this PEP will be also
backported to earlier versions via the backport currently available on PyPI.
Also the function type comments could be used as per PEP 484 (for example
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Also function type comments can be used ..."

pep-0544.txt Outdated
backported to earlier versions via the backport currently available on PyPI.
Also the function type comments could be used as per PEP 484 (for example
to provide compatibility with Python 2). The ``typing`` module changes
proposed in this PEP will be also backported to earlier versions via the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be also -> also be

pep-0544.txt Outdated
def fun(m: Mapping):
m.keys()

The question is should this be an error? We think most people would expect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I don't understand why this is even a question. OF COURSE it is valid, keys() is part of the Mapping class/protocol/ABC.

I do think I understand what's going on here, you're considering that keys() is a non-abstract method in Mapping, and the question is whether non-abstract methods are considered part of the protocol by default. I just think the example is a poor one (unless one happens to know that keys() has a default implementation in the Mapping class). So I think you just have to switch to an example that doesn't rely on a standard ABC.

pep-0544.txt Outdated
that could be considered "non-protocol". Therefore, it was decided to not
introduce "non-protocol" methods.

There is only one downside for this: it will require some boilerplate for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for -> to

pep-0544.txt Outdated

x: Box[float]
y: Box[int]
x = y # This is OK due to inferred covariance of the protocol 'Box'.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the content attribute is mutable, Box should be invariant? And for consistency with normal classes, I'd prefer variance to be explicit. Especially if a generic protocol inherits from other protocols, it can be unclear whether a particular type variable should be covariant or invariant, so making this explicit would make code easier to understand. Users are often confused by variance, so it's likely that they often can't predict what the inferred variance would be. And what if I'd like to explicitly make something invariant for some reason, even if inference thinks it can be covariant?

pep-0544.txt Outdated
Continuing the previous example::

class SimpleTree:
leaves: List['SimpleTree']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since leaves is a mutable attribute, SimpleTree doesn't look compatible with Traversable. If the attribute in Traversable would be a read-only property, this would be fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch!

pep-0544.txt Outdated
protocol types. For variables and parameters with protocol types, subtyping
relationships are subject to the following rules:
Protocols cannot be instantiated, so there are no values whose
*exact* type is a protocol. For variables and parameters with protocol types,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use 'runtime type' instead of 'exact type' as it's less open to interpretation.

pep-0544.txt Outdated

* A protocol is never a subtype of a concrete type.
* A concrete type or a protocol ``X`` is a subtype of another protocol ``P``
* A concrete type ``X`` is a subtype of protocol ``P``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also mention that the implementations must have compatible types.

pep-0544.txt Outdated
tree: Tree[float] = Tree(0, [])
walk(tree) # OK, 'Tree[float]' is a subtype of 'Traversable'
* A protocol ``P1`` is a subtype of another protocol ``P2`` if ``P1`` defines
all protocol members of ``P2``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, maybe mention that the members must have compatible types.

pep-0544.txt Outdated
One could argue that protocols typically only define methods, but not
variables. However, using getters and setters in cases where only a
simple variable is needed would be quite unpythonic. Moreover, the widespread
use of properties (that are often act as validators) in large code bases
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'thar are often act' -> 'that often act'. Also, I'm unsure what 'act as validators' means here. Maybe rephrase or give an example?

pep-0544.txt Outdated
implicit subtypes of ``Mapping`` and few other "large" protocols. But, this
applies to few "built-in" protocols (like ``Mapping`` and ``Sequence``) and
people are already subclassing them. Also, such style is discouraged for
user defined protocols. It is recommended to create compact protocols and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe write 'user defined' as 'user-defined' (with a hyphen)?

pep-0544.txt Outdated
Prohibit explicit subclassing of protocols by non-protocols
-----------------------------------------------------------

This was rejected mainly for two reasons:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are additional reasons:

  • Explicit subclassing makes it explicit that a class implements a particular protocol, making subtyping relationships easier to see.
  • Type checkers can warn about missing protocol members or members with incompatible types more easily, without having to use hacks like dummy assignments.

pep-0544.txt Outdated
Backwards Compatibility
=======================

This PEP is fully backwards compatible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we turn Sequence and friends into runtime protocols, the results of some isinstance checks are going to change in some edge cases. (Also, the performance of these isinstance checks might be affected, though not sure about it.)

@ilevkivskyi
Copy link
Member Author

@gvanrossum @JukkaL Thank you for your reviews! I agree with all points and will implement them soon. The only open question is about user-defined vs inferred variance for protocols, see above.

@ilevkivskyi
Copy link
Member Author

ilevkivskyi commented Apr 20, 2017

It looks like I have another argument while we should go with inferred variance instead of user-declared. While thinking about Box[T] invariance because of mutable member, it came to my mind that this:

class B:
    attr: float

class C(B):
    attr: int

is strictly speaking is not safe, and should be prohibited. But we still allow this for practical reasons. Maybe we can therefore also allow covariance for Box?

But more widely, here is the argument: if we go with inferred variance, then there is a very simple mental picture, a class C implements Proto[int] if and only if it can be an explicit subclass of the latter. Jukka, what do you think?

@JukkaL
Copy link
Contributor

JukkaL commented Apr 21, 2017

Covariant overriding of attributes should be rejected (python/mypy#3208). Not sure why it's allowed -- may have been an oversight. Also, I'm not sure I actually understood correctly what you meant by inferred variance. It's worth specifying this in more detail, since now it's open to interpretation.

@ilevkivskyi
Copy link
Member Author

ilevkivskyi commented Apr 21, 2017

@JukkaL

Covariant overriding of attributes should be rejected (python/mypy#3208).

OK, agreed.

Also, I'm not sure I actually understood correctly what you meant by inferred variance. It's worth specifying this in more detail, since now it's open to interpretation.

I was referring to this discussion #243 (comment)

Here are some examples:

class ProtoCo(Protocol[T]):
    def meth(self) -> T:
        ...
class ProtoContra(Protocol[T]):
    def meth(self, arg: T) -> None:
        ...
class ProtoInv(Protocol[T]):
    def first(self) -> T:
        ...
    def second(self, arg: T) -> None:
        ...

In current implementation, ProtoCo[int] is a subtype of ProtoCo[float], and ProtoContra[float] is a subtype of ProtoContra[int]. This happens "naturally" in my implementation without additional effort, this is what I call the "inferred variance".

I would like to keep this behavior instead of treating all the above protocols as invariant. It fits a simple mental model (if Proto[A] can be an explicit subclass of Proto[B], then it is a subtype). Also there are additional arguments in #243 (comment)

What do you think?

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All my comments have been addressed! I'm not sure if we should merge this now and continue the discussion about implicit/explicit variance in python-ideas or python-dev, or if we should hash it out here until we're all in agrement. @JukkaL what do you think?

@ilevkivskyi
Copy link
Member Author

I just made a quick search on approaches in other languages. Here is a brief summary:

  • Go: interfaces are always invariant (and people complain about this).
  • TypeScript: interfaces are bivariant, but right now there is a discussion about switching to inferred variance.
  • C#: interface variance is user declared.
  • Java: question is not applicable, there is only use-site variance.

It looks like there is no wide consensus between different languages about this. In the meantime I have noticed that most people who complain about default invariance of interfaces are not aware about variance, they just feel naturally that their AnInterface<Sub> is a subtype of AnInterface<Super>, i.e. they intuitively want covariance in this case. Therefore, I think that going with inferred variance will actually reduce the "decision burden" for users.

@gvanrossum gvanrossum merged commit 6024eea into python:master Apr 22, 2017
@gvanrossum gvanrossum deleted the protocol-updates branch April 22, 2017 14:43
@gvanrossum
Copy link
Member

Thank you for the research. I have merged this version, since it is your PEP and it is your proposal to use implicit variance. Then we can have an additional discussion without this PR going stale.

@gvanrossum
Copy link
Member

FWIW my own opinion on inferred invariance is wishy-washy. Looking at your three examples I could easily see evolution of a protocol that is currently covariant accidentally make it invariant. Contravariance is also counter-intuitive (whenever it comes up I always have to derive the reasons from a basic example).

I think two situations are useful:

(a) Invariant protocols; sometimes there are good reasons for this (e.g. they represent interfaces to mutable objects, like MutableSequence). We'll have to explain forever that MutableContainer[] is no a subtype of MutableContainer[] but so be it, and the reason is a good one -- my assumption is that basically everyone has to bump into this once and then they know.

(b) Covariant protocols; these are easier to use (they are more intuitive somehow, since everyone somehow expects that List[] is a subtype of List[]). But they cannot represent mutable objects.

My worry with default inferred variance is that people won't understand why adding a certain method to their (previously covariant) protocol suddenly makes it invariant, hence breaking some code that's unchanged. Had variance been explicitly declared they would have been forced to think about this sooner and it would be no surprise that they can't add a method that violates the declared variance. (No action-at-a-distance, or in this case error-at-a-distance.)

There's an additional concern. I think it's come up in the attempts to make Mapping covariant in its keys. This is disallowed because arguments need to be contravariant (this is the most counter-intuitive part of the whole thing). It is annoying because there seems to be no way to create the unsafe scenario for Mapping that justifies the ban on covariant arguments. But why can't the type system tell the difference? Frankly this baffles me.

@ilevkivskyi
Copy link
Member Author

Here are some random thoughts:

  1. Action-at-a-distance is typical for structural subtyping, most changes in protocol method signatures will break some distant code, whether it is accompanied by a variance change or not. I think most users understand this, everyone who changes e.g. Iterable expects that a lot of code will break, and one will do such things only for a good reason.

  2. There are still warnings at protocol declaration site for obviously incorrect use of type variables:

    class Proto(Protocol[T_co, T_contra]):
        def one(self, x: T_co) -> None: ...  # Error!
        def other(self) -> T_contra: ...  # Error!

    the semantics is practically only changed for plain T, it is invariant for nominal classes, but "auto" for protocols.

  3. Mapping invariant in key is a "guard" against excessively inventive users. For example, this will pass type check with Mapping covariant in key type, but will fail at runtime:

    def fun(m: Mapping[float, int]) -> int:
        return m[0.1]
    
    class StrangeMap(Mapping[int, int]):
        def __getitem__(self, item: int) -> int:
            return item << 5
    
    fun(StrangeMap())  # Boom!

    IIRC, there was a discussion at some point about ignoring users who write such code, but finally we decided not to do this.

  4. Currently, bivariant types are not supported at all, but actually the protocols PR correctly detects bivariant types, e.g. Container as it is currently defined in typeshed, without any additional effort.

@ilevkivskyi ilevkivskyi mentioned this pull request May 5, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants