PEP 653: Split __match_kind__ into __match_container__ and __match_class__ (#1901)
markshannon authored Mar 30, 2021
1 parent 31e30ae commit 0a0e7a3
Showing 1 changed file with 96 additions and 61 deletions.
157 changes: 96 additions & 61 deletions pep-0653.rst
@@ -14,9 +14,9 @@ Abstract
This PEP proposes a semantics for pattern matching that respects the general concept of PEP 634,
but is more precise, easier to reason about, and should be faster.

The object model will be extended with a special (dunder) attribute, ``__match_kind__``,
in addition to the ``__match_args__`` attribute from PEP 634, to support pattern matching.
The ``__match_kind__`` attribute must be an integer.
The object model will be extended with two special (dunder) attributes, ``__match_container__`` and
``__match_class__``, in addition to the ``__match_args__`` attribute from PEP 634, to support pattern matching.
Both of these new attributes must be integers and ``__match_args__`` is required to be a tuple.

With this PEP:

@@ -97,45 +97,52 @@ A match statement performs a sequence of pattern matches. In general, matching a
2. When deconstructed, does the value match this particular pattern?
3. Is the guard true?

To determine whether a value can match a particular kind of pattern, we add the ``__match_kind__`` attribute.
This allows the kind of a value to be determined once and in an efficient fashion.
To determine whether a value can match a particular kind of pattern, we add the ``__match_container__``
and ``__match_class__`` attributes.
This allows the kind of a value to be determined in an efficient fashion.

Specification
=============


Additions to the object model
-----------------------------

A ``__match_kind__`` attribute will be added to ``object``.
It should be overridden by classes that want to match mapping or sequence patterns,
or want to change the default behavior when matching class patterns.
It must be an integer and should be exactly one of these::
The ``__match_container__`` and ``__match_class__`` attributes will be added to ``object``.
``__match_container__`` should be overridden by classes that want to match mapping or sequence patterns.
``__match_class__`` should be overridden by classes that want to change the default behavior when matching class patterns.

``__match_container__`` must be an integer and should be exactly one of these::

0
MATCH_SEQUENCE
MATCH_MAPPING

bitwise ``or``\ ed with exactly one of these::
``__match_class__`` must be an integer and should be exactly one of these::

0
MATCH_DEFAULT
MATCH_ATTRIBUTES
MATCH_SELF

.. note::
It does not matter what the actual values are. We will refer to them by name only.
Symbolic constants will be provided both for Python and C, and once defined they will
never be changed.

Classes inheriting from ``object`` will inherit ``__match_kind__ = MATCH_DEFAULT`` and ``__match_args__ = ()``
``object`` will have the following values for the special attributes::

__match_container__ = 0
__match_class__ = MATCH_ATTRIBUTES
__match_args__ = ()

These special attributes will be inherited as normal.
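The defaults and their inheritance can be sketched in plain Python. This is only an illustration: the integer values of the constants are hypothetical (the PEP leaves them unspecified), and ``Root`` is a made-up stand-in for ``object``, which does not carry the two new attributes unless this PEP is implemented.

```python
# Hypothetical values; the PEP refers to these constants by name only.
MATCH_SEQUENCE = 1
MATCH_MAPPING = 2
MATCH_ATTRIBUTES = 1
MATCH_SELF = 2


class Root:
    """Illustrative stand-in for ``object`` with the proposed defaults."""
    __match_container__ = 0
    __match_class__ = MATCH_ATTRIBUTES
    __match_args__ = ()


class Point(Root):
    # Only __match_args__ is overridden; it must be a tuple of strings.
    __match_args__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Stack(Root):
    # Opts in to sequence patterns; class matching keeps the inherited default.
    __match_container__ = MATCH_SEQUENCE
```

``Point`` inherits both new attributes unchanged, while ``Stack`` changes only its container kind.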

If ``__match_args__`` is overridden, then it is required to hold a tuple of strings. It may be empty.

.. note::
``__match_args__`` will be automatically generated for dataclasses and named tuples, as specified in PEP 634.

The pattern matching implementation is *not* required to check that ``__match_args__`` behaves as specified.
If the value of ``__match_args__`` is not as specified, then
The pattern matching implementation is *not* required to check that any of these attributes behave as specified.
If the value of ``__match_container__``, ``__match_class__`` or ``__match_args__`` is not as specified, then
the implementation may raise any exception, or match the wrong pattern.
Of course, implementations are free to check these properties and provide meaningful error messages if they can do so efficiently.

@@ -163,14 +170,13 @@ All additional code listed below that is not present in the original source will
Preamble
''''''''

Before any patterns are matched, the expression being matched is evaluated and its kind is determined::
Before any patterns are matched, the expression being matched is evaluated::

match expr:

translates to::

$value = expr
$kind = type($value).__match_kind__

Capture patterns
''''''''''''''''
@@ -234,6 +240,7 @@ A pattern not including a star pattern::

translates to::

$kind = type($value).__match_container__
if $kind & MATCH_SEQUENCE == 0:
FAIL
if len($value) != len($VARS):
@@ -248,6 +255,7 @@ A pattern including a star pattern::

translates to::

$kind = type($value).__match_container__
if $kind & MATCH_SEQUENCE == 0:
FAIL
if len($value) < len($VARS):
@@ -265,6 +273,7 @@ A pattern not including a double-star pattern::

translates to::

$kind = type($value).__match_container__
if $kind & MATCH_MAPPING == 0:
FAIL
if not $value.keys() >= $KEYWORD_PATTERNS.keys():
@@ -281,6 +290,7 @@ A pattern including a double-star pattern::

translates to::

$kind = type($value).__match_container__
if $kind & MATCH_MAPPING == 0:
FAIL
if not $value.keys() >= $KEYWORD_PATTERNS.keys():
@@ -308,7 +318,7 @@ translates to::
.. note::

``case ClsName():`` is the only class pattern that can succeed if
``($kind & (MATCH_SELF|MATCH_DEFAULT)) == 0``
``($kind & (MATCH_SELF|MATCH_ATTRIBUTES)) == 0``


Class pattern with a single positional pattern::
@@ -317,6 +327,7 @@ Class pattern with a single positional pattern::

translates to::

$kind = type($value).__match_class__
if $kind & MATCH_SELF:
if not isinstance($value, ClsName):
FAIL
@@ -333,7 +344,8 @@ translates to::

if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
$attrs = ClsName.__match_args__
if len($attrs) < len($VARS):
raise TypeError(...)
@@ -355,7 +367,8 @@ translates to::

if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
try:
for $KEYWORD in $KEYWORD_PATTERNS:
$tmp = getattr($value, QUOTE($KEYWORD))
@@ -375,7 +388,8 @@ translates to::

if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
$attrs = ClsName.__match_args__
if len($attrs) < len($VARS):
raise TypeError(...)
@@ -408,6 +422,7 @@ For example, the pattern::

translates to::

$kind = type($value).__match_container__
if $kind & MATCH_SEQUENCE == 0:
FAIL
if len($value) != 2:
@@ -433,45 +448,49 @@ translates to::
FAIL


Non-conforming ``__match_kind__``
Non-conforming special attributes
'''''''''''''''''''''''''''''''''

All classes should ensure that the value of ``__match_kind__`` follows the specification.
All classes should ensure that the values of ``__match_container__``, ``__match_class__``
and ``__match_args__`` follow the specification.
Therefore, implementations can assume, without checking, that the following are true::

(__match_kind__ & (MATCH_SEQUENCE | MATCH_MAPPING)) != (MATCH_SEQUENCE | MATCH_MAPPING)
(__match_kind__ & (MATCH_SELF | MATCH_DEFAULT)) != (MATCH_SELF | MATCH_DEFAULT)
(__match_container__ & (MATCH_SEQUENCE | MATCH_MAPPING)) != (MATCH_SEQUENCE | MATCH_MAPPING)
(__match_class__ & (MATCH_SELF | MATCH_ATTRIBUTES)) != (MATCH_SELF | MATCH_ATTRIBUTES)

Thus, implementations can assume that ``__match_kind__ & MATCH_SEQUENCE`` implies ``(__match_kind__ & MATCH_MAPPING) == 0``, and vice-versa.
Likewise for ``MATCH_SELF`` and ``MATCH_DEFAULT``.
Thus, implementations can assume that ``__match_container__ & MATCH_SEQUENCE`` implies ``(__match_container__ & MATCH_MAPPING) == 0``, and vice-versa.
Likewise for ``__match_class__``, ``MATCH_SELF`` and ``MATCH_ATTRIBUTES``.
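The two assumptions above could be checked with a small helper such as the following. This is a sketch only: ``is_conforming`` is a hypothetical name, the constant values are made up, and the ``getattr`` defaults stand in for the values that ``object`` would provide under this PEP.

```python
# Hypothetical values; only the names are fixed by the PEP.
MATCH_SEQUENCE, MATCH_MAPPING = 1, 2
MATCH_ATTRIBUTES, MATCH_SELF = 1, 2


def is_conforming(cls):
    """Return True if cls never sets both flags of either attribute."""
    # Defaults mirror the values proposed for ``object``.
    container = getattr(cls, "__match_container__", 0)
    klass = getattr(cls, "__match_class__", MATCH_ATTRIBUTES)
    both_containers = MATCH_SEQUENCE | MATCH_MAPPING
    both_classes = MATCH_SELF | MATCH_ATTRIBUTES
    return (
        (container & both_containers) != both_containers
        and (klass & both_classes) != both_classes
    )


class Conforming:
    __match_container__ = MATCH_SEQUENCE
    __match_class__ = MATCH_SELF


class NonConforming:
    # Claims to be both a sequence and a mapping, which the PEP disallows.
    __match_container__ = MATCH_SEQUENCE | MATCH_MAPPING
```

An implementation is free to skip such a check, which is why a non-conforming class may match the wrong pattern.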

If ``__match_kind__`` does not follow the specification,
then implementations may treat any of the expressions of the form ``$kind & MATCH_...`` above as having any value.
Values of the special attributes for classes in the standard library
--------------------------------------------------------------------

Implementation of ``__match_kind__`` in the standard library
------------------------------------------------------------
For the core builtin container classes ``__match_container__`` will be:

``object.__match_kind__`` will be ``MATCH_DEFAULT``.
* ``list``: ``MATCH_SEQUENCE``
* ``tuple``: ``MATCH_SEQUENCE``
* ``dict``: ``MATCH_MAPPING``
* ``bytearray``: 0
* ``bytes``: 0
* ``str``: 0

For common builtin classes ``__match_kind__`` will be:
Named tuples will have ``__match_container__`` set to ``MATCH_SEQUENCE``.

* ``bool``: ``MATCH_SELF``
* ``bytearray``: ``MATCH_SELF``
* ``bytes``: ``MATCH_SELF``
* ``float``: ``MATCH_SELF``
* ``frozenset``: ``MATCH_SELF``
* ``int``: ``MATCH_SELF``
* ``set``: ``MATCH_SELF``
* ``str``: ``MATCH_SELF``
* ``list``: ``MATCH_SEQUENCE | MATCH_SELF``
* ``tuple``: ``MATCH_SEQUENCE | MATCH_SELF``
* ``dict``: ``MATCH_MAPPING | MATCH_SELF``
* All other standard library classes for which ``issubclass(cls, collections.abc.Mapping)`` is true will have ``__match_container__`` set to ``MATCH_MAPPING``.
* All other standard library classes for which ``issubclass(cls, collections.abc.Sequence)`` is true will have ``__match_container__`` set to ``MATCH_SEQUENCE``.

Named tuples will have ``__match_kind__`` set to ``MATCH_SEQUENCE | MATCH_DEFAULT``.

* All other standard library classes for which ``issubclass(cls, collections.abc.Mapping)`` is true will have ``__match_kind__`` set to ``MATCH_MAPPING``.
* All other standard library classes for which ``issubclass(cls, collections.abc.Sequence)`` is true will have ``__match_kind__`` set to ``MATCH_SEQUENCE``.
For the following builtin classes ``__match_class__`` will be set to ``MATCH_SELF``:

* ``bool``
* ``bytearray``
* ``bytes``
* ``float``
* ``frozenset``
* ``int``
* ``set``
* ``str``
* ``list``
* ``tuple``
* ``dict``
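The ``__match_container__`` assignments listed above can be approximated in current Python with a lookup function. This is a sketch, not part of the proposal: ``proposed_match_container`` is a hypothetical helper, and the constant values are made up.

```python
import collections
import collections.abc

# Hypothetical values; only the names are fixed by the PEP.
MATCH_SEQUENCE, MATCH_MAPPING = 1, 2

# Builtins that are sequences by ABC but are listed with a value of 0.
_NOT_CONTAINERS = (str, bytes, bytearray)


def proposed_match_container(cls):
    """Approximate the value ``cls.__match_container__`` would have."""
    if issubclass(cls, _NOT_CONTAINERS):
        return 0
    if issubclass(cls, collections.abc.Mapping):
        return MATCH_MAPPING
    if issubclass(cls, collections.abc.Sequence):
        return MATCH_SEQUENCE
    return 0
```

The explicit exclusion comes first because ``str``, ``bytes`` and ``bytearray`` would otherwise pass the ``collections.abc.Sequence`` test.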

Legal optimizations
-------------------
@@ -497,9 +516,9 @@ Implementations are allowed to make the following assumptions:

* ``isinstance(obj, cls)`` can be freely replaced with ``issubclass(type(obj), cls)`` and vice-versa.
* ``isinstance(obj, cls)`` will always return the same result for any ``(obj, cls)`` pair and repeated calls can thus be elided.
* Reading ``__match_args__`` and calling ``__deconstruct__`` are pure operations, and may be cached.
* Sequences, that is any class for which ``MATCH_SEQUENCE`` is true, are not modified by iteration, subscripting or calls to ``len()``,
and thus those operations can be freely substituted for each other where they would be equivalent when applied to an immutable sequence.
* Reading any of ``__match_container__``, ``__match_class__`` or ``__match_args__`` is a pure operation, and may be cached.
* Sequences, that is any class for which ``__match_container__ & MATCH_SEQUENCE`` is not zero, are not modified by iteration, subscripting or calls to ``len()``.
Consequently, those operations can be freely substituted for each other where they would be equivalent when applied to an immutable sequence.

In fact, implementations are encouraged to make these assumptions, as doing so is likely to result in significantly better performance.
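As an illustration of the sequence assumption, a two-element sequence pattern such as ``case [a, b]:`` could be emulated with ``len()`` and subscripting rather than iteration. This is a minimal sketch: ``match_pair`` and ``SeqList`` are hypothetical names, the constant value is made up, and the attribute is read with a default of 0 because it only exists where a class defines it.

```python
MATCH_SEQUENCE = 1  # hypothetical value


def match_pair(value):
    """Emulate ``case [a, b]:``; return (a, b), or None where the PEP says FAIL."""
    kind = getattr(type(value), "__match_container__", 0)
    if (kind & MATCH_SEQUENCE) == 0:
        return None
    if len(value) != 2:
        return None
    # Subscripting is used in place of iteration, which the assumptions permit.
    return value[0], value[1]


class SeqList(list):
    # Stand-in for ``list`` as it would behave under this PEP.
    __match_container__ = MATCH_SEQUENCE
```

A ``str`` fails the first check in this sketch, matching the listed value of 0 for ``str``.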

@@ -631,9 +650,11 @@ Summary of differences between this PEP and PEP 634

The changes to the semantics can be summarized as:

* Selecting the kind of pattern uses ``cls.__match_kind__`` instead of
``issubclass(cls, collections.abc.Mapping)`` and ``issubclass(cls, collections.abc.Sequence)``
and allows classes a bit more control over which kinds of pattern they match.
* Requires ``__match_args__`` to be a *tuple* of strings, not just a sequence.
This makes pattern matching a bit more robust and optimizable, as ``__match_args__`` can be assumed to be immutable.
* Selecting the kind of container patterns that can be matched uses ``cls.__match_container__`` instead of
``issubclass(cls, collections.abc.Mapping)`` and ``issubclass(cls, collections.abc.Sequence)``.
* Allows classes to opt out of deconstruction altogether, if necessary, by setting ``__match_class__ = 0``.
* The behavior when matching patterns is more precisely defined, but is otherwise unchanged.
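The effect of ``__match_class__`` on a class pattern with one positional sub-pattern, such as ``case ClsName(x):``, might be emulated as follows. This is a sketch under stated assumptions: ``match_single_positional`` and the three example classes are hypothetical, the constant values are made up, and the final failure branch for ``__match_class__ = 0`` is inferred from the opt-out described above rather than copied from the specification.

```python
MATCH_ATTRIBUTES, MATCH_SELF = 1, 2  # hypothetical values


def match_single_positional(value, cls):
    """Emulate ``case cls(x):``; return a 1-tuple holding x, or None on failure."""
    if not isinstance(value, cls):
        return None
    kind = getattr(type(value), "__match_class__", MATCH_ATTRIBUTES)
    if kind & MATCH_SELF:
        return (value,)              # the subject itself is captured
    if kind & MATCH_ATTRIBUTES:
        attrs = cls.__match_args__   # looked up on the class in the pattern
        if len(attrs) < 1:
            raise TypeError("too many positional sub-patterns")
        try:
            return (getattr(value, attrs[0]),)
        except AttributeError:
            return None
    return None                      # assumed: opted out of deconstruction


class Point:
    __match_args__ = ("x", "y")

    def __init__(self, x, y):
        self.x, self.y = x, y


class Symbol:
    __match_class__ = MATCH_SELF


class Opaque:
    __match_class__ = 0              # opts out, even though it names an attribute
    __match_args__ = ("secret",)
    secret = 42
```

With these definitions ``Point`` is deconstructed through ``__match_args__``, ``Symbol`` captures itself, and ``Opaque`` never matches a pattern that has sub-patterns.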

There are no changes to syntax. All examples given in the PEP 636 tutorial should continue to work as they do now.
@@ -644,7 +665,7 @@ Rejected Ideas
Using attributes from the instance's dictionary
-----------------------------------------------

An earlier version of this PEP only used attributes from the instance's dictionary when matching a class pattern with ``__match_kind__ == MATCH_DEFAULT``.
An earlier version of this PEP only used attributes from the instance's dictionary when matching a class pattern with ``MATCH_ATTRIBUTES``.
The intent was to avoid capturing bound-methods and other synthetic attributes. However, this also meant that properties were ignored.

For the class::
@@ -659,7 +680,7 @@ For the class::
...

Ideally we would match the attributes "a" and "p", but not "m".
However, there is no general way to do that, so this PEP now follows the semantics of PEP 634 for ``MATCH_DEFAULT``.
However, there is no general way to do that, so this PEP now follows the semantics of PEP 634 for ``MATCH_ATTRIBUTES``.

Lookup of ``__match_args__`` on the subject not the pattern
-----------------------------------------------------------
@@ -672,6 +693,13 @@ This has been rejected for a few reasons::
* Using the class specified in the pattern has the potential to provide better error reporting in some cases.
* Neither approach is perfect, both have odd corner cases. Keeping the status quo minimizes disruption.

Combining ``__match_class__`` and ``__match_container__`` into a single value
-----------------------------------------------------------------------------

An earlier version of this PEP combined ``__match_class__`` and ``__match_container__`` into a single value, ``__match_kind__``.
Using a single value has a small advantage in terms of performance,
but is likely to result in unintended changes to container matching when overriding class matching behavior, and vice versa.
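The kind of unintended change meant here can be illustrated with a short sketch. All class names and integer values below are hypothetical; only the attribute and constant names come from the two versions of the PEP.

```python
# Hypothetical flag values for the earlier, combined design.
MATCH_SEQUENCE, MATCH_MAPPING, MATCH_DEFAULT, MATCH_SELF = 1, 2, 4, 8


class CombinedSeq:
    __match_kind__ = MATCH_SEQUENCE | MATCH_DEFAULT


class CombinedSelf(CombinedSeq):
    # Meant to change class matching only, but the sequence flag is lost too.
    __match_kind__ = MATCH_SELF


# Split design: the two attributes are overridden independently.
SPLIT_MATCH_SELF = 2  # hypothetical value of MATCH_SELF for __match_class__


class SplitSeq:
    __match_container__ = MATCH_SEQUENCE


class SplitSelf(SplitSeq):
    # Changes class matching; __match_container__ is inherited untouched.
    __match_class__ = SPLIT_MATCH_SELF
```

In the combined design the subclass author has to remember to repeat ``MATCH_SEQUENCE``; in the split design there is nothing to repeat.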

Deferred Ideas
==============

@@ -706,7 +734,7 @@ Code examples
::

class Symbol:
__match_kind__ = MATCH_SELF
__match_class__ = MATCH_SELF

.. [2]
@@ -716,6 +744,7 @@ This::

translates to::

$kind = type($value).__match_container__
if $kind & MATCH_SEQUENCE == 0:
FAIL
if len($value) != 2:
@@ -732,6 +761,7 @@ This::

translates to::

$kind = type($value).__match_container__
if $kind & MATCH_SEQUENCE == 0:
FAIL
if len($value) < 2:
@@ -746,6 +776,7 @@ This::

translates to::

$kind = type($value).__match_container__
if $kind & MATCH_MAPPING == 0:
FAIL
if $value.keys() != {"x", "y"}:
@@ -763,6 +794,7 @@ This::

translates to::

$kind = type($value).__match_container__
if $kind & MATCH_MAPPING == 0:
FAIL
if not $value.keys() >= {"x", "y"}:
@@ -782,7 +814,8 @@ translates to::

if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
$attrs = ClsName.__match_args__
if len($attrs) < 2:
FAIL
@@ -804,7 +837,8 @@ translates to::

if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
try:
x = $value.a
y = $value.b
@@ -824,7 +858,8 @@ translates to::

if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
$attrs = ClsName.__match_args__
if len($attrs) < 1:
raise TypeError(...)
@@ -844,7 +879,7 @@ translates to::
::

class Basic:
__match_kind__ = MATCH_POSITIONAL
__match_class__ = MATCH_POSITIONAL
def __deconstruct__(self):
return self._args
