An issue with the accepted definition of Union

The [text in PEP 483 that describes how unions behave](https://peps.python.org/pep-0483/#fundamental-building-blocks) actually doesn't do a very good job as a *definition* of a union type.

> - Union[t1, t2, …]. Types that are subtype of at least one of t1 etc. are subtypes of this.
>   - Unions whose components are all subtypes of t1 etc. are subtypes of this. Example: Union[int, str] is a subtype of Union[int, float, str].
>   - The order of the arguments doesn’t matter. Example: Union[int, str] == Union[str, int].
>   - If ti is itself a Union the result is flattened. Example: Union[int, Union[float, str]] == Union[int, float, str].
>   - If ti and tj have a subtype relationship, the less specific type survives. Example: Union[Employee, Manager] == Union[Employee].
>   - Union[t1] returns just t1. Union[] is illegal, so is Union[()]
>   - Corollary: Union[..., object, ...] returns object.

This lists some facts that are indeed true if we were to interpret `Union[A, B]` as a static type that stands for a set of values that is a set-theoretic union of the sets that `A` and `B` stand for. However this does not actually *define* `Union[A, B]` to be that set.

Many have interpreted the first line: "Types that are subtype of at least one of t1 etc. are subtypes of this." as a definition:

> `C <: A | B` if and only if `C <: A` or `C <: B`

However this is now how set-theoretic unions work, it is not in general true that $C \subseteq A \cup B \iff (C \subseteq A) \lor (C \subseteq B)$
<details><summary>Counterexample:</summary>

```py
>>> A = {1}
>>> B = {2}
>>> C = {1, 2}
>>> C.issubset(A | B) == (C.issubset(A) or C.issubset(B))
False
```
</details>

What people probably are thinking of is the formula $c \in A \cup B \iff (c \in A \lor c \in B)$, which is actually true. But we cannot use this as a definition because that involves checking potentially infinitely many $c$. Instead we want a definition in terms of the `<:` aka $\subseteq$ relation.

This is a solved problem and I'll cut to the chase: the formulas that define set-theoretic unions and intersections are as follows:
$$C \subseteq A \cap B \iff (C \subseteq A \land C \subseteq B)$$
$$C \supseteq A \cup B \iff (C \supseteq A \land C \supseteq B)$$

The duality between intersection and union lies in reversing the subset relation, *not* in replacing an "and" with an "or". In fact it is a well understood idea that transitive relations play extremely poorly with "or", and [extremely well with "and"](https://ncatlab.org/nlab/show/hom-functor+preserves+limits). Hence the above definitions both using "and".

What about $A \cap B \subseteq ...$ and $... \subseteq A \cup B$? We have these formulas, as corollaries of the definitions above:
$$(A \subseteq D \lor B \subseteq D) \implies A \cap B \subseteq D$$
$$(D \subseteq A \lor D \subseteq B) \implies D \subseteq A \cup B$$
<details><summary>Proof</summary>

- Substitute $C = A \cap B$ into the first equation. LHS is true by reflexivity, so RHS must be true too. If we have $A \subseteq D$ then by transitivity $A \cap B \subseteq A \subseteq D$. Likewise if we have $B \subseteq D$. So in total if we have either $A \subseteq D$ or $B \subseteq D$ then we have $A \cap B \subseteq D$.
- Same proof but with all relations reversed. Substitute $C = A \cup B$ into the second equation, RHS must be true. If $D \subseteq A$ then by reflexivity $D \subseteq A \subseteq A \cup B$. Same for $B$.
</details>
Note that these are one-way implications. Sufficient but not necessary conditions. There's once again a symmetry with reversing the subset relation, but no symmetry between "and" and "or".

In #1, [here](https://github.com/CarliJoy/intersection_examples/issues/1#issuecomment-1657265867), @erictraut lists all 4 of these on equal footing, labelled "non-controversial". I want to emphasize that two of these involving "and" are actually definitions by [universal property](https://en.wikipedia.org/wiki/Universal_property), and the two involving "or" are special cases/corollaries.

<details><summary>What does the PEP-483 paragraph actually define?</summary>

Most of the bullet list is actually describing a reduction procedure for type expressions involving class literals and unions. The proposed algorithm of checking `C | D <: A | B` with `(C <: A or C <: B) and (D <: A or D <: B)` only works exactly because each of `A`,  `B`, `C`, `D` can only be a class, as the only other possible form - union - gets flattened first. However in presence of intersections this algorithm already fails:

`A & (B | C) <: (A & B) | (A & C)` vs `A & (B | C) <: A & B or A & (B | C) <: A & C`

I'll also cut to the chase here: you want to reduce the left hand side to a *disjunctive* normal form, and the right hand side to a *conjunctive* normal form. The `|` on the left and the `&` on the right can then be taken apart. But the `&` on the left and the `|` on the right cannot. Problems of the kind `t1 & ... <: s1 | ...` where `t1`, ..., `s1`, ... are class literals -- are in general irreducible and will require knowledge of the inheritance tree and whether a multiple inheritance child class exists/can be constructed.
</details>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

An issue with the accepted definition of Union #22

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

An issue with the accepted definition of Union #22

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions