Description
In mypy/maptype.py
, we have a function class_derivation_paths
that finds all the paths up the hierarchy from a derived class to a given ancestor class. The code implicitly assumes that this function will only ever return a single path, for example by silently taking the first element of a list derived from its result and dropping the rest:
return map_instance_to_supertypes(instance, superclass)[0]
And indeed there are a couple of assertions in comments in the code that this function should only ever find a single path:
# FIX: Currently we should only have one supertype per interface, so no
# need to return an array
[...]
# FIX: Currently we might only ever have a single path, so this could be
# simplified
But this doesn't seem to be true! In fact, if we set up the classic diamond inheritance pattern:
A
/ \
B C
\ /
D
then we do in fact get two paths, one through B
and one through C
. Demonstrated by adding an assert and a triggering test case here: gnprice@a85876521
which produces this exception:
AssertionError: [[<TypeInfo __main__.B>, <TypeInfo __main__.A>], [<TypeInfo __main__.C>, <TypeInfo __main__.A>]]
What should our approach to this situation be?
In roughly increasing order of semantic and implementation complexity, some options include
- forbid this situation -- require that there's only a single inheritance path from a given derived class to a given ancestor, and say that code that uses diamond inheritance can't be typed
- allow diamond inheritance only if the ancestor is not generic
- allow diamond inheritance only if all the inheritance paths can be verified, just from the class definitions, to always result in the same instance of the ancestor for any instance of the derived
- allow diamond inheritance only if the actual instances of the derived class that we encounter each result in a single instance of the ancestor across all inheritance paths (this is uncomfortably reminiscent of C++ templates)
- something even more complicated that allows inheriting from several different instances of the same ancestor
I'm inclined to start with option 1, as it's super simple to understand and to implement, and see if that ever poses a problem. Maybe nobody actually uses diamond inheritance in code they want to try to type; it's a pretty tricky pattern in the first place. If we want to do this, we should detect the situation up front (like in verify_base_classes
in semanal.py) and give an appropriate error.
Option 2 is also super simple to implement, if people use diamond inheritance but they don't need generics with it. Somewhat less satisfying as a rule for the user.
I'd be kind of concerned if we find ourselves starting to think we need one of the more complex options.