[red-knot] feat: Introduce `Truthy` and `Falsy` to allow more precise typing #13665

Slyces · 2024-10-07T15:56:05Z

Summary

This is a PR trying to address #13632.

Reminder of the problem: when evaluating boolean operations, we could improve precision of types

reveal_type(str_instance() and 8)  # Current: `str | Literal[8]` -- Expected `Literal["", 8]`

The idea is that in some contexts, we can narrow a type to the subset of its instances that are truthy (bool(instance) == True) or the subset that are falsy (bool(instance) == False).

As @carljm suggested, a very generic way to express this would be to implement two new types, Truthy and Falsy, and express "the subset of A instances that evaluate to True" as A & Truthy (and conversely for False and Falsy).
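For reference, the runtime behaviour behind the expected `Literal["", 8]` can be checked directly: `x and y` evaluates to `x` when `x` is falsy and to `y` otherwise. The snippet below is only an illustrative sketch; `and_results` is a hypothetical helper name, not part of red-knot.

```python
def and_results(samples, right):
    """Collect every value that `sample and right` can evaluate to."""
    return {sample and right for sample in samples}


# Only the falsy str ("") survives on the left; every truthy str yields 8.
str_samples = ["", "a", "hello", " "]
results = and_results(str_samples, 8)
assert results == {"", 8}
```

In other words, the left operand contributes only its falsy subset (`str & Falsy`), which is what the narrowed type is meant to capture.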

Interface

At a high level, this should be fairly straightforward to use when required:

Introduce two new types Truthy and Falsy
Convenience method to create A & Truthy (and falsy): IntersectionBuilder::build_truthy(a_ty)

Interesting Cases

Here's a list of interesting cases that might be unlocked by this more precise type knowledge. Some of them are more or less complicated to handle and maybe shouldn't be supported.

Behaviour of Falsy and Truthy outside of intersections

While they only make sense in intersections and probably should only exist in that context, we do need to implement the behaviour of Falsy and Truthy outside of intersections, as they're a Type variant.
Suggestions are welcome if I didn't handle that part properly; I don't have strong opinions on this part of the implementation.

Consistency of intersections

As @AlexWaygood raised in this comment, the existence of Truthy and Falsy, coupled with intersections having both positive (must be) and negative (must not be) elements, leads to a question:

Do we need both Truthy and Falsy?
X & Falsy = X & ~Truthy

Currently, as this PR explores a first attempt with Truthy and Falsy, we need to make sure that only one of the above two representations is used.

Defined falsy and truthy subsets

Some types have a known subset of instances that are falsy:

bool → False
int → 0
str → ""
bytes → b''
tuple → ()
list → [], dict → {}, set → set() - we can't really represent that at the moment

Conversely for truthy (most types in Python are truthy, so this is rarely defined)

bool → True
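The falsy and truthy values listed above can be sanity-checked against the interpreter. This is a small sketch for the exact builtin types only (subclasses may override `__bool__` or `__len__`); the name `FALSY_LITERALS` is made up for illustration.

```python
# The single falsy value of each exact builtin type listed above.
FALSY_LITERALS = {
    bool: False,
    int: 0,
    str: "",
    bytes: b"",
    tuple: (),
}

for ty, value in FALSY_LITERALS.items():
    assert type(value) is ty  # the value really is an instance of that type
    assert not value          # and it is falsy

# Empty mutable containers are falsy too, even though there is no
# literal type to represent them.
for empty in ([], {}, set()):
    assert not empty

# bool is the one type here with a single known truthy value as well.
assert [b for b in (False, True) if b] == [True]
```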

This could be used in a few ways (that I can think of)

Simplify int & Falsy to Literal[0] (we most probably want that)
Simplify int & Truthy | Literal[0] to int
```
# Case 1
reveal_type(int_instance() or 0)  # Should be `int != 0 | 0` → `int`
```
- Note: that's a more advanced case of X & Truthy | X & Falsy → X (which we should support)
Unify int & ~0 (where ~0 is Literal[0] in the negatives) to int & Truthy
- This is not really a "simplification" as both say the same thing, but we need consistency
x1 | x2 where x1 is X & Truthy and x2 is X & Falsy
- This is what we're doing with BooleanLiteral(true), BooleanLiteral(false) and bool
- Not coded as I can't think of any other type that has both a known set of truthy and falsy values
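These identities can be illustrated with a toy model in which a type is just a finite set of sample values. This is only a sketch of the set-theoretic reasoning, not how the builder represents types; `INTS`, `truthy` and `falsy` are hypothetical names.

```python
INTS = set(range(-3, 4))  # finite stand-in for `int`


def truthy(values):
    """Model `X & Truthy`: keep the values whose bool() is True."""
    return {v for v in values if v}


def falsy(values):
    """Model `X & Falsy`: keep the values whose bool() is False."""
    return {v for v in values if not v}


# int & Falsy simplifies to Literal[0]
assert falsy(INTS) == {0}
# int & Truthy | Literal[0] simplifies to int
assert truthy(INTS) | {0} == INTS
# int & ~Literal[0] describes the same set as int & Truthy
assert INTS - {0} == truthy(INTS)
# General case: X & Truthy | X & Falsy is X
assert truthy(INTS) | falsy(INTS) == INTS
```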

So far, all subsets I can think of have been singletons (no UnionType). This means that my code may at times be either too simple (can't handle non-singletons) or too complicated (tries to handle non-singletons).

I guess that if we want to include this knowledge in the feature, knowing if non-singleton cases should be supported (or if they even exist) would help.

Simplifying `X & Falsy` / `X & Truthy`

I think that's a fairly straightforward feature. Some types (e.g. FunctionType) have a constant Truthiness (FunctionType.bool() → Truthiness::AlwaysTrue), so intersecting them with Truthy or Falsy can be simplified.

X & Truthy → X if x.bool() == Truthiness::AlwaysTrue
X & Falsy → Never if x.bool() == Truthiness::AlwaysTrue
... (same for AlwaysFalse)
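The rules above, including the AlwaysFalse counterparts, can be sketched as a small decision function. This is a Python illustration under assumptions, not the Rust code in builder.rs: types are plain strings, and the `AMBIGUOUS` variant and the `simplify` helper are hypothetical names introduced for the example.

```python
from enum import Enum, auto


class Truthiness(Enum):
    ALWAYS_TRUE = auto()
    ALWAYS_FALSE = auto()
    AMBIGUOUS = auto()  # assumed variant for types that can be either


def simplify(ty: str, truthiness: Truthiness, constraint: str) -> str:
    """Simplify `ty & constraint`, where constraint is "Truthy" or "Falsy"."""
    if truthiness is Truthiness.AMBIGUOUS:
        # Nothing is known statically, so the intersection is kept as-is.
        return f"{ty} & {constraint}"
    # The constraint is redundant when it agrees with the constant
    # truthiness, and unsatisfiable (Never) when it contradicts it.
    agrees = (truthiness is Truthiness.ALWAYS_TRUE) == (constraint == "Truthy")
    return ty if agrees else "Never"


assert simplify("FunctionType", Truthiness.ALWAYS_TRUE, "Truthy") == "FunctionType"
assert simplify("FunctionType", Truthiness.ALWAYS_TRUE, "Falsy") == "Never"
assert simplify("None", Truthiness.ALWAYS_FALSE, "Falsy") == "None"
assert simplify("None", Truthiness.ALWAYS_FALSE, "Truthy") == "Never"
assert simplify("int", Truthiness.AMBIGUOUS, "Truthy") == "int & Truthy"
```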

Overall, we should probably guarantee at build time that you can't have both AlwaysTrue and AlwaysFalse elements in an intersection, because elements of this intersection would need to have both properties, and that's not possible.

Test Plan

Added unit tests in the builder.rs for intersection/union cases
- Overall, all features written about above should have one or more dedicated tests
Added some "integration tests" in infer.rs through boolean chained operations (currently the only way to create X & Truthy or X & Falsy intersections)
Edited existing tests with a more precise type inference
Solved a random test's // TODO
- Literal[0] | None | Literal[1] & None became Literal[0] | None with this PR which was the expected behaviour
- Note that this should have been solved by other features eventually

We could eventually add some additional integration tests that would be "easier to read" by adding support for the following snippet

a = A()
reveal_type(a)  # type: `A`
if a:
    reveal_type(a)  # type: `A & Truthy`

Concerns & Feedback on the current implementation

I think this feature is very exciting. It would allow us to represent types with more precision, which is always nice.

I do have some concerns with my implementation as it is now

Taken individually, most cases are easy to follow
But it feels like I "hardcoded" many corner cases in the code
- Especially the InnerIntersectionBuilder::simplify method
I think that I can't really picture the kind of intersections we'll work with in practice, so I might either not test enough or test things that can't happen
Performance might not be optimal in some areas

As always, feedback is welcome. Don't hesitate to question the entire direction I took to achieve this; I'll be happy to refine it with whatever we can learn from this first iteration.


github-actions · 2024-10-07T16:10:04Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.


carljm · 2024-10-08T05:17:35Z

Thanks for the PR, this is very cool! I've started a review, but there's a lot to consider here, and I have some other pressing things to take care of, so it may be a couple days before I can complete the review.

AlexWaygood

Thank you! I'll echo Carl that this is pretty cool!!

Carl's probably better placed to do a full review here, in particular for the logic you added in builder.rs -- but here's some things I noticed, in the meantime: