-
Notifications
You must be signed in to change notification settings - Fork 237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow subclassing without supertyping #241
Comments
As a follow-up to the discussion, @gvanrossum suggested using a class decorator. The problem with this approach is that class-level is not the right granularity to decide this, but each individual inheritance relation is. Ina multiple-inheritance context, you could want to inherit the interface of a SuperClassA, and the implementation of SuperClassB. I think that might be quite common with mixins (actually I think most mixins could be used with this form of inheritance) Looking deeper into what Eiffel does (but this is a bit of a side-comment, not part of the main proposal), there it was also possible to define a superclass in a way that only implementation inheritance from it was allowed, and that could work well with mixins (and implemented as a class decorator). However you still need to be able to say "I just want the implementation of this superclass" when inheriting, because the writer of the superclass may not have foreseen that you wanted that. |
Also note that this idea makes self types more viable. Things like def __lt__(self: T, other:T): -> bool ... can be defined more accurately (currently this is in the stdlib), and with no errors (note that the |
I think this is a very nice feature. In general, it is good that we separate classes (code reuse) and types (debugging/documentation). If it is added, then I would also mention this feature in PEP483 (theory) as an example of differentiating classes and types. Between a decorator and a special class I chose the second one, since as @dmoisset mentioned, it is more flexible and suitable fox mixin use cases. Also, it is very in the spirit of gradual typing, one can mark particular bases as |
I'd like to explore this further. I expect that specifying semantics that
work for every use case will be tricky. E.g. when using implementation
inheritance, how much control does the subclass have and how much control
does the superclass have? Do you have to repeat the signature for every
method? Only describe exceptions? Which use cases should be care about most?
I hope that one of you can come up with a concrete, detailed proposal.
|
I think it is better to have a minimalistic solution, rather than try to make a complex proposal that will cover all use cases. Later, more features for most popular use cases could be added. Here is a detailed proposal with examples:
class Base:
def __init__(self, number: int) -> None:
self.number = number
class AnotherBase:
...
class Derived(Implementation[Base], AnotherBase):
def __init__(self) -> None:
super().__init__(42) # This works since Implementation[Base] is just Base
from typing import cast
def func(x: Base) -> None:
...
def another_func(x: AnotherBase) -> None:
...
func(Base(42)) # this is OK
another_func(Derived()) # this is also OK
func(Derived()) # Marked as error by a typechecker
func(cast(Base, Derived())) # Passes typecheck
class Base:
def method(self, x: int) -> None:
...
class AnotherBase:
def another_method(self, y: int) -> None:
...
class Derived(Implementation[Base], AnotherBase):
def another_method(self, y: str) -> None: ... # This is prohibited by a typechecker
def method(self, x: str) -> None: ... # Although, this is OK
I think the re-check should be recursive, i.e. include bases of bases, etc. Of course, this could slow down the type-check, but not the runtime.
class Base:
def method(self, x: int) -> None: ...
class AnotherBase:
def another_method(self, y: int) -> None: ...
class Derived(Implementation[Base], AnotherBase):
def method(self, x: str) -> None: ... a typechecker infers that
If we agree on this, then I will submit PRs for typing.py, PEP484, and PEP483. |
This has the problem that a type checker doesn't see inside the implementations of stubbed classes, and you could break some of the base class methods without a type checker having any way of detecting it. Maybe implementation inheritance would only be allowed from within your own code somehow. For example, behavior would be undefined if you inherit the implementation of a stdlib or 3rd party library class, unless you include the implementation of the library module in the set of checked files. Not sure if the latter would be too restrictive. To generalize a bit, it would okay if the implementation base class has library classes in the MRO, but methods defined in those classes can't be overridden arbitrarily. |
@JukkaL This is a good point. We need to prohibit incompatibly overriding methods everywhere except for accessible in the set of checked files. I don't think it is too restrictive. Alternatively, we could allow this with a disclaimer that complete typecheking is not guaranteed in this case. |
Yeah, I'm afraid the use case from
python/mypy#1237 is not so easily addressed by
things that preserve soundness. But I don't have an alternate proposal. :-(
|
@gvanrossum Indeed, there is not much could be done. But I don't think this is bad. As with |
The QPixmap example shouldn't be a problem, since it seems to be for a library stub file. We don't verify that stubs are correct anyway, so a stub would be able to use implementation inheritance without restrictions but there are no guarantees that things are safe -- we expect the writer of the stub to reason about whether implementation inheritance is a reasonable thing to use. For user code, I think that we can provide a reasonable level of safety if we enforce the restrictions that I mentioned. |
I have a related proposal. What about having a magic class decorator (with no runtime or minimal effect) or a base class that declares a class as a "pure mixin"? A pure mixin could only be used for implementation inheritance, and it wouldn't be type checked by itself at all -- it would only be type checked as part of implementation inheritance. I think that this would allow type checkers to check some mixin use cases that are tricky right now with minimal code changes. The alternative would be to use ABCs to declare the things that the mixin expects to be defined elsewhere, but they are harder to use and arguably add boilerplate. Example:
|
But is that the way mixins are typically used? It would be good to look |
@JukkaL How about the following disclaimer to be added? """ In user code, implementation inheritance is only allowed with classes defined in the set of checked files. Built-in classes could be used as implementation base classes, but their methods could not be redefined in an incompatible way. I like you proposal of |
@gvanrossum I've seen mixins like that (that use things that aren't defined anywhere in the mixin class). I'm not sure how common they are. @ilevkivskyi Looks good. I'd do some small changes (feel free to edit at will):
-> Note that implementation inheritance could be used with stubbed extension modules, but there are no guarantees of safety in this case (type checkers could not verify correctness of extension modules). As a special case, if a stub uses implementation inheritance within the stub, the writer of the stub is expected to reason about whether implementation inheritance is appropriate. Stubs don't contain implementations, so type checker can't verify the correctness of implementation inheritance in stubs.
-> Built-in and library classes could be used ... |
Also, I'm not at all attached to the |
Maybe the puremixin proposal deserves a new issue?
|
Created #246 for pure mixins. |
We're probably talking about the same mixin classes. :-) However, I'm not convinced that those would deserve being marked as Implementation[] -- the class that inherits from the mixin typically also inherits from some other class and the mixin should not conflict with any methods defined there (though it may override them). (OTOH maybe I didn't read the examples all the way through.) |
Let's move the discussion of mixins to #246? |
I was writing a proposal and it may be a bit messy to post it now because I duplicated some stuff already said, but wrote some other things differently. Let me summarize different or subtle points that I think should be discussed:
class Base: ...
class Derived(Foo, implementation_only(Base), Bar): ... so it doesn't make sense to pass it a Typevar or anything that has no supporting class (like unions or protocols). I think the different syntax helps.
As a side note, using functional syntax would make it trivial to use |
Btw, for further reference there's a paper on the idea at http://smarteiffel.loria.fr/papers/insert2005.pdf |
There are some things that can be added to this mechanism (I left these apart because they aren't
@implementation_only
class SomeMixin:
def method(self, x: int) -> str: ...
def other_method(self, x: int) -> str: ...
class C(SomeMixin):
def method(self) -> None: ... # OK to change the signature
class C(
implementation_only(
SomeMixin, include=('method', 'attribute')
)
): ...
class D(
implementation_only(
SomeMixin, exclude=('other_method')
)
): ...
class E(
implementation_only(
SomeMixin, renames={'method': '_d_method'}
)
): ... The exclusion/renaming mechanism are present in Eiffel, and the renaming allows to unambiguously resolve issues related to super() by given explicit different names to the overriden methods, although I don't think the first proposal should include any of these, but I wanted to mention these to also clarify that the functional syntax gives more flexibility for future growth of the idea. |
@dmoisset Indeed, some things you propose have already been said. This is rather a good sign.
|
The possible problem with super happens in the following snippet: def implementation_only(C): return C
class Base:
def f(self, a: int) -> None:
print("Base")
print(a)
class Left(Base):
def f(self, a: int) -> None:
print("Left.f")
super().f(a)
class Right(implementation_only(Base)):
def f(self, a: str) -> None:
print("Right.f")
super().f(len(a))
class C(Left, implementation_only(Right)):
def f(self, a: int) -> None:
print("C.f")
super().f(a)
C().f(42) This code fails with a TypeError if run with Python. but the rules defined above don't detect a problem:
But even if everything looks ok by itself, the program has a type error, so the rules are unsound. So some of the rules above should actually be rejected in this scenario. 1 seems to be an ok rule (valid with the current semantics), and 2 is our proposal so let's assume it as valid (if we get to a contradiction the idea itself is unsound). So 3, 4, or 5 should be wrong, and we should choose (at least) one, under some conditions (which should include this scenario). My proposal was "3 is wrong. You can't call super from an implementation_only redefinition of a method". This actually will make the call that produces the TypeError in runtime to be invalid. Other options are making this scenario impossible forbidding the redefinition in some multiple inheritance cases (4 would be invalid), but it's probably harder to explain to users |
Thank you for elaborating, I understand your point. However, it looks like a more sophisticated typechecker could catch the TypeError. By re-type-checking bodies of methods in Prohibiting 3 is an absolutely valid option (although it feels a bit false positive to me), also prohibiting 4 is a valid option. What I propose is to leave this choice (sophisticated recursive re-checking vs prohibitng 3 vs prohibiting 4) up to the implementer of a particular typechecker. Restrictions mentioned by Jukka are necessary, because it is extremely difficult (probably impossible in principle) to check types inside extension modules, so that we: |
I have to say that I am uncomfortable with solutions (to any problem, not
just this one) that say "just re-type-check such-and-such code with this
new information". This could potentially slow things down tremendously
(depending on how often new information is discovered and acted upon).
It also makes things hard to explain to users -- e.g. how to explain that
their code that was previously marked as correct is now deemed incorrect,
due to something that someone else added somewhere else. Why not place the
blame on the new subclass that apparently violates the assumptions of the
base class, for example?
I also worry that such solutions violate some fundamental nature of type
annotations. When mypy sees "def foo(arg: int) -> int: ..." it type-checks
the body of foo() and then it can forget about it -- everything it needs to
know about foo() is now summarized in the function signature. Likewise for
classes. And for modules. A solution that requires mypy to go back
essentially introduces a circular dependency between (in this case) base
class and superclass, or maybe (in other examples) caller and callee, or
importer and importee.
Those circular dependencies worry me especially in the context of the work
we're doing on incremental checking in mypy -- it makes the assumption that
if a module hasn't changed, and its dependencies (taken transitively)
haven't changed either, then the module doesn't need to be re-checked on a
future run, and the information collected in its symbol table (about global
variables, functions, classes, methods and signatures) is sufficient to
analyze any new code that happens to import it. I'd like not to see this
assumption violated.
Of course it's common that circular dependencies occur at all levels --
import cycles, mutually recursive functions, what have you. And mypy deals
with them (and we're improving it for those edge cases where it fails to
deal with them). But I think I want to avoid introducing extra cycles in
relationships that users see as one-directional.
|
The point about incremental checking is a good one. General performance could also be an important concern if people would start using this more widely. My expectation would be that this would mostly be used in fairly simple cases, but of course any feature given to users tends to be used in unforeseen ways, and once the genie is out of the bottle it's difficult to put it back there. There could be also other interactions with particular language features that we haven't though of yet. Due to all the concerns and tricky interactions that we've uncovered, I'd like to see a prototype implementation of this before deciding whether to add this to the PEP -- I have a feeling that we don't yet understand this well enough. |
The situation described by @PetterS is quite widespread in the scientific community. I did such thing several times. If one adds class Expr:
def __eq__(self, other: 'Expr') -> 'Expr':
... then this method will be typed as
def eq_one(x: object) -> bool:
return x == 1
eq_one(Expr()) # error here
Expr() == 'boom!' # error here Both kinds of errors are not caught by mypy now, but will be caught with class Expr(Implementation[object]):
def __eq__(self, other: 'Expr') -> 'Expr':
... Of course I could not judge how widespread are such situations in general. Maybe we could ask others? |
I wanted to add class constructors to the list of issues to clear up in this discussion. At the moment, they receive a confusing treatment:
but
OTOH, class creation passes type check:
The only difference between constructors and other methods is that they affect Liskov substitutability of different objects: constructors affect the classes, while other methods affect the instances of those classes. My preference would be to do what this thread is proposing - let the programmer easily choose whether Liskov constraint (=subtyping) should apply, regardless of subclassing; so BTW, another argument in favor of separation of subytping from subclassing is that substitutability only makes sense if the semantics is compatible -- it's not enough that the methods have matching types; for example:
|
Let's say Currently:
The proposal so far is:
I think for consistency, the second part of the proposal should be:
|
Canyou please use words instead of symbols? I can never remember which way |
Done. To summarize, I think
|
Ah, that summary is very helpful. I agree that one shouldn't call But of course |
I have tried to read through this whole thread and I am maybe missing something, but I think I have a slight variation/use case to share to add to this discussion. If you are familiar with sklearn you know that it provides many various machine learning classes. It is great because many classes share the same interface, so it is easy to programmers to use them. The issue is that while method names are the same, arguments they receive vary greatly. If I would like to define types for them in a meaningful way, there is a problem. The base class might be something like: class Estimator(Generic[Inputs, Outputs]):
@abstractmethod
def fit(self, *, inputs: Inputs, outputs: Outputs) -> None:
pass
@abstractmethod
def predict(self, *, inputs: Inputs) -> Outputs:
pass
class SupervisedEstimator(Estimator[Inputs, Outputs]):
pass
class UnsupervisedEstimator(Estimator[Inputs, Outputs]):
@abstractmethod
def fit(self, *, inputs: Inputs) -> None:
pass
class GeneratorEstimator(Estimator[List[None], Outputs]):
@abstractmethod
def fit(self, *, outputs: Outputs) -> None:
pass
class LupiSupervisedEstimator(SupervisedEstimator[Inputs, Outputs]):
@abstractmethod
def fit(self, *, inputs: Inputs, privileged_inputs: Inputs, outputs: Outputs) -> None:
pass So, somehow the idea is that It seems this is not possible to express with current Python typing so that mypy would not complain? |
@mitar Can you give a more complete example which illustrates the errors generated by mypy? |
I can up with an example that seems to work: from typing import *
T = TypeVar('T')
class Base:
fit: Callable
class Foo(Base):
def fit(self, arg1: int) -> Optional[str]:
pass
class Bar(Foo):
def fit(self, arg1: float) -> str:
pass Maybe this pattern helps? This seems to be about the best you can do given that you don't want the base class to specify the method signatures. |
Interesting. I never thought of just specifying My ideal typing semantics would be: this is a method signature for Sadly, Then consumers of those objects could check and say: does this method accepts |
I've come across a simple case which hasn't been mentioned yet. Variadic arguments can be overridden without (technically) violating the Liskov principle. For example: class A:
def foo(self, *args: Any, **kwargs: Any) -> int:
...
class B(A):
def foo(self, *args: Any, bar: int, **kwargs: Any) -> int:
... is (to my understanding) in principle valid, but still gives an The reason I want to do this is: Suppose a library defines the following classes class A:
def foo(self, *args: Any, **kwargs: Any) -> int:
raise NotImplementedError
class B:
def __init__(self, obj: A) -> None:
self.obj = obj
def get_foo(self) -> int:
return obj.foo() And user code could be something like the following: class MyA(A):
def foo(self, *args: Any, bar: int, **kwargs: Any) -> int:
return bar
class MyB(B):
def __init__(self, obj: MyA) -> None:
super().__init__(obj)
def get_foo(self) -> int:
return obj.foo(bar=2) Importantly, MyA and MyB could both be defined in different third-party libraries. The original library just defines the general interface and basic functionality. Should this example be accepted by the type-checker? If not, how could it be type annotated? |
Your example is actually unsafe, because a call to |
@JelleZijlstra that's true. I guess then a better solution would be using class A:
foo: Callable[..., int] But of course this still means mypy won't flag something like |
Declaring the methods/attributes in mixins is fine if there are just a few of them, but you don't really want to do that when there are a lot of them, for example a mixins for Having class MyTestMixin:
assertEqual: Callable
assertRaises: Callable
... # 10 other asserts I used in the mixin is not really a way I want to go. |
Try with: from typing import Type, TYPE_CHECKING, TypeVar
T = TypeVar('T')
def with_typehint(baseclass: Type[T]) -> Type[T]:
"""
Useful function to make mixins with baseclass typehint
```
class ReadonlyMixin(with_typehint(BaseAdmin))):
...
```
"""
if TYPE_CHECKING:
return baseclass
return object Example tested in Pyright: class ReadOnlyInlineMixin(with_typehint(BaseModelAdmin)):
def get_readonly_fields(self,
request: WSGIRequest,
obj: Optional[Model] = None) -> List[str]:
if self.readonly_fields is None:
readonly_fields = []
else:
readonly_fields = self.readonly_fields # self get is typed by baseclass
return self._get_readonly_fields(request, obj) + list(readonly_fields)
def has_change_permission(self,
request: WSGIRequest,
obj: Optional[Model] = None) -> bool:
return (
request.method in ['GET', 'HEAD']
and super().has_change_permission(request, obj) # super is typed by baseclass
)
>>> ReadOnlyAdminMixin.__mro__
(<class 'custom.django.admin.mixins.ReadOnlyAdminMixin'>, <class 'object'>) |
A pattern I've been struggling with is async subclasses of a concrete base class, following the pattern: class Thing:
def count(self) -> int: ...
def name(self): -> str: ...
class AsyncThing(Thing):
def count(self): -> Awaitable[int]: ...
def name(self): -> Awaitable[str]: ... where several methods on the subclass return a wrapper of the same type as the parent (in this case, Awaitable). This feels like it should be easy. It certainly is easy to implement, but I don't understand enough yet. In particular, it seems like WT = TypeVar("T") # 'wrapper' type, may be Null for no wrapping
class Thing(Generic[WT]):
def __init__(self: Thing[Null]): # default: no wrapper
def count(self) -> WT[int]: ... # actually int
def name(self): -> WT[str]: ...
def not_wrapped(self): -> int # always int, not wrapped
def derivative_method(self) -> WT[str]:
"Not overridden in subclass, but should match return type of name"
return self.name()
class AsyncThing(Thing[Awaitable]):
def count(self): -> Awaitable[int]: ...
def name(self): -> Awaitable[str]: ... but you can't use getitem on TypeVars, so it seems they can't actually be used in this way. Derivative methods that didn't previously need implementations, now also need empty implementations, just to get the updated types. Is there any way to accomplish something like this, or is it better to just forego types altogether if one follows this kind of pattern? |
@minrk that pattern is unsafe, because you're violating the Liskov principle. (If you do I'd suggest not using inheritance and making AsyncThing instead an independent class that wraps Thing. You could use An alternative is to just |
Sorry for misunderstanding, I thought that was the point of this Issue: subclassing without preserving Liskov is valid (and in my experience, hugely valuable).
Thanks for the suggestion. Using a HasA wrapper would be significantly more complex because, as you say, I'd have to provide empty implementations of all methods, just to declare the type, when the only concrete benefit would be satisfying mypy. So far, |
This issue comes from python/mypy#1237. I'll try to make a summary of the discussion here
Some one reported an issue with an artificial example (python/mypy#1237 (comment)):
and then with the following stub files (python/mypy#1237 (comment)):
Which mypy currently reports as erroneous:
Argument 1 of "swap" incompatible with supertype "QPixmap"
These were initially was argued to be a correct error because they violate Liskov Substitution Principle (the same case by a change of return type, the second is a covariant argument change). The problem in this scenario is that the actual classes are not meant to be substituted (the first example is actually a wrapper for a non-virtual C++ function so they're not substitutable, and the second aren't supposed to be mixed together, just reuse part of the code). (see python/mypy#1237 (comment))
There was also a suggestion that allow covariant args explicitly (in the same way that Eiffel allows it and adds a runtime check) with a decorator
My proposal instead was add what some dialects of Eiffel call "non-conforming inheritance" ("conforms" is the Eiffel word for what PEP-483 calls "is-consistent-with"). It is essentially a mechanism to use subclassing just as a way to get the implementation from the superclass without creating a subtyping relation between them. See python/mypy#1237 (comment) for a detailed explanation.
The proposal is to haven
Implementation
in typing so you can write:which just defines a
QBitmap
class with all the mothods copied fromQPixmap
, but without making one a subtype of the other. In runtime we can just makeImplementation[QPixmap] == QPixmap
The text was updated successfully, but these errors were encountered: