
Better alternative for algebraic data types #2464

Open
@JukkaL

Many languages support algebraic data types, including Haskell, OCaml, Scala, Swift and Rust. Many programmers like them and they have some benefits over Python subclassing, which I won't discuss in detail here.

Currently we can kind of fake algebraic data types by using union types and named tuples:

import math
from typing import NamedTuple, Union

class Circle(NamedTuple):
    radius: float

class Rectangle(NamedTuple):
    width: float
    height: float

Shape = Union[Circle, Rectangle]

def area(s: Shape) -> float:
    if isinstance(s, Circle):
        result = math.pi * s.radius**2
    elif isinstance(s, Rectangle):
        result = s.width * s.height
    return result

There are a few problems with this currently:

  1. If you forget to handle a case, mypy often won't complain about this. Catching these errors is often mentioned as one of the nicest things about algebraic data types.
  2. You need to write each item type name twice, once in the class definition and once in the union type definition. This is somewhat error-prone.
  3. The union type definition needs to come after the item types, which feels backwards to me. (If we fixed forward references to types in type aliases, it might help a bit.)
  4. Mypy doesn't keep track of the name of the union type alias internally, and in error messages it will just print out the entire union. This could be awkward since the union can have many items.

For (1), mypy could detect at least some undefined variables and recognize (some) if statements used for "pattern matching". For example, assume that we added a new shape, Triangle. Now mypy should complain about area(), as it fails to handle triangles (if we had used multiple returns, mypy could already have caught the error):

def area(s: Shape) -> float:
    if isinstance(s, Circle):
        result = math.pi * s.radius**2
    elif isinstance(s, Rectangle):
        result = s.width * s.height
    return result   # Error: "result" may be undefined (union item "Triangle" not handled)
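For comparison, here is a rough sketch of the multiple-returns variant mentioned above. The Triangle fields are made up for illustration. Because no branch handles Triangle, control can fall off the end of the function, and mypy is expected to flag a missing return statement (the exact wording of the message may differ between versions).

```python
import math
from typing import NamedTuple, Union


class Circle(NamedTuple):
    radius: float


class Rectangle(NamedTuple):
    width: float
    height: float


class Triangle(NamedTuple):  # hypothetical fields, for illustration only
    base: float
    height: float


Shape = Union[Circle, Rectangle, Triangle]


def area(s: Shape) -> float:
    if isinstance(s, Circle):
        return math.pi * s.radius**2
    elif isinstance(s, Rectangle):
        return s.width * s.height
    # No branch for Triangle: the function can reach its end without
    # returning, which is what lets mypy notice the unhandled case.
```

At runtime the unhandled case silently returns None, which is exactly the kind of bug the static check is meant to catch.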

For (2), (3) and (4), we could have some new syntax:

from mypy_extensions import NamedTupleUnion

class Shape(NamedTupleUnion):  # This would be roughly similar to original union type
    pass

class Circle(Shape):
    radius: float

class Rectangle(Shape):
    width: float
    height: float

NamedTupleUnion would have a few special features. All subclasses are named tuples and must be defined in the same module as the NamedTupleUnion type (i.e. it can't be extended outside the current module). This way we can check whether all item types are handled using isinstance checks. Finally, only one level of inheritance would be supported, for simplicity -- though I'm not sure if this restriction is necessary.
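The restrictions described above could be approximated at runtime, roughly as follows. This is only a sketch under several assumptions: the base class is a plain class standing in for the proposed NamedTupleUnion, the item types are frozen dataclasses rather than named tuples (to keep the example short), and nothing here gives mypy any exhaustiveness information.

```python
from dataclasses import dataclass


class Shape:
    """Hypothetical runtime stand-in for a NamedTupleUnion base class."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Item types must be defined in the same module as the base class.
        if cls.__module__ != Shape.__module__:
            raise TypeError(
                f"{cls.__name__} must be defined in module {Shape.__module__}"
            )
        # Only one level of inheritance: Shape must be a direct base.
        if Shape not in cls.__bases__:
            raise TypeError(f"{cls.__name__} must subclass Shape directly")


@dataclass(frozen=True)
class Circle(Shape):
    radius: float


@dataclass(frozen=True)
class Rectangle(Shape):
    width: float
    height: float
```

With this in place, defining a subclass of Circle, or a subclass of Shape in another module, raises TypeError at class creation time. A checker-level implementation would enforce the same rules statically instead.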

[This is just a random idea I wanted to write down and it's not urgent in any way.]
