-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Preserve Python range objects in for loops
#15404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The `ForLoopOp` always preserved Python `range` objects in the structured form, and had special-purpose visualisation and OQ3 export capabilities to ensure that these were handled in a structured way end-to-end. While the initial Rust port preserved the Python type hints (`range` _is_ an `Iterable[int]`), it didn't preserve all the intended behaviour. This commit includes the setup to allow the for-loop collection to be more dynamically typed as well.
|
One or more of the following people are relevant to this code:
|
180fe7f to
0b2a36c
Compare
0b2a36c to
7d0fba2
Compare
Pull Request Test Coverage Report for Build 19829846885Details
💛 - Coveralls |
raynelfss
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't fully looked at the code but there are two questions on my mind as I'm looking.
| /// A literal Python range extracted to a Rust object. | ||
| /// | ||
| /// This is separate to `PyO3`'s `PyRange` type, since that keeps everything internally safe for | ||
| /// subclassing and modification the Python heap. | ||
| #[derive(Clone, Copy, Debug, PartialEq, Eq)] | ||
| pub struct PyRange { | ||
| pub start: isize, | ||
| pub stop: isize, | ||
| pub step: NonZero<isize>, | ||
| } | ||
| impl PyRange { | ||
| pub fn is_empty(&self) -> bool { | ||
| let step = self.step.unsigned_abs().get(); | ||
| let diff = self.start.abs_diff(self.stop); | ||
| (self.step.get() > 0 && self.start < self.stop | ||
| || (self.step.get() < 0 && self.start > self.stop)) | ||
| && diff >= step | ||
| } | ||
| pub fn len(&self) -> usize { | ||
| let step = self.step.unsigned_abs().get(); | ||
| let diff = self.start.abs_diff(self.stop); | ||
| if (self.step.get() > 0 && self.start < self.stop) | ||
| || (self.step.get() < 0 && self.start > self.stop) | ||
| { | ||
| // The `diff-1` is guaranteed safe because the `start < stop` or `start > stop` | ||
| // conditions guarantee that `diff` is at least 1. | ||
| 1 + (diff - 1) / step | ||
| } else { | ||
| 0 | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... This struct looks very similar to parts of SequenceIndex an enum struct you designed during #12669. Would you be able to repurpose that struct instead of creating a new one for this? Or if not, should you repurpose that one to use the one offered here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really - a slice and a range are different Python objects, although they share some passing similarities. Slices can have None in various places, and have different semantics on negative indices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that, in which case since SequenceIndex uses Ranges and this type consolidates them into one. Should we replace the PosRange and NegRange variants to use this instead??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't feel to me that unifying PosRange and NegRange into PyRange would actually make things easier; the reason PosRange and NegRange are already split from each other is that handling steps that can be either positive or negative is already very awkward. The semantics of the ranges within SliceIndex are slightly different too, because of how the end-points work (particularly with regard to negative steps).
Fwiw, I don't expect this PyRange object to live forever - we will almost certainly rework how for-loop collections work at some point.
| /// Possible specifications of the "collection" that a for loop iterates over. | ||
| #[derive(Clone, Debug, IntoPyObject, IntoPyObjectRef, FromPyObject, PartialEq, Eq)] | ||
| pub enum ForCollection { | ||
| /// A literal Python `range` object extracted to Rust. | ||
| PyRange(PyRange), | ||
| /// Some ordered collection of integers. | ||
| List(Vec<usize>), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This here looks very similar to PySequenceIndex which you introduced in #12669. The only difference I see here is that you're not using the int variant? Or would we be able to repurpose it to work wit this one.
|
I’m a bit confused about should |
raynelfss
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any strong feelings on this addition. Although, I believe the naming here is a bit misleading and we should consider renaming it to something more rust friendly.
| /// A literal Python range extracted to a Rust object. | ||
| /// | ||
| /// This is separate to `PyO3`'s `PyRange` type, since that keeps everything internally safe for | ||
| /// subclassing and modification the Python heap. | ||
| #[derive(Clone, Copy, Debug, PartialEq, Eq)] | ||
| pub struct PyRange { | ||
| pub start: isize, | ||
| pub stop: isize, | ||
| pub step: NonZero<isize>, | ||
| } | ||
| impl PyRange { | ||
| pub fn is_empty(&self) -> bool { | ||
| let step = self.step.unsigned_abs().get(); | ||
| let diff = self.start.abs_diff(self.stop); | ||
| (self.step.get() > 0 && self.start < self.stop | ||
| || (self.step.get() < 0 && self.start > self.stop)) | ||
| && diff >= step | ||
| } | ||
| pub fn len(&self) -> usize { | ||
| let step = self.step.unsigned_abs().get(); | ||
| let diff = self.start.abs_diff(self.stop); | ||
| if (self.step.get() > 0 && self.start < self.stop) | ||
| || (self.step.get() < 0 && self.start > self.stop) | ||
| { | ||
| // The `diff-1` is guaranteed safe because the `start < stop` or `start > stop` | ||
| // conditions guarantee that `diff` is at least 1. | ||
| 1 + (diff - 1) / step | ||
| } else { | ||
| 0 | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that, in which case since SequenceIndex uses Ranges and this type consolidates them into one. Should we replace the PosRange and NegRange variants to use this instead??
| /// This is separate to `PyO3`'s `PyRange` type, since that keeps everything internally safe for | ||
| /// subclassing and modification the Python heap. | ||
| #[derive(Clone, Copy, Debug, PartialEq, Eq)] | ||
| pub struct PyRange { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider changing the name of the struct to something that doesn't have Py in the name? even if this struct comes from Python, it doesn't keep any Python tokens or pointers. If anything we can change the name to something along the lines of:
pub struct IndexRange {Or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This object here is PyRange because it's a very literal description of a Python range object. We expect that the collection of a ForLoop will expand to involve more runtime ranges / collections, which actually would be pure Rust and unrelated to Python built-ins, including something like Range { start: Expr, stop: Expr, step: Option<Expr> }, which is why I was leaving naming space for it.
Debasmita: sorry, I didn't reply to your comment properly before. |
The
ForLoopOpalways preserved Pythonrangeobjects in the structured form, and had special-purpose visualisation and OQ3 export capabilities to ensure that these were handled in a structured way end-to-end. While the initial Rust port preserved the Python type hints (rangeis anIterable[int]), it didn't preserve all the intended behaviour.This commit includes the setup to allow the for-loop collection to be more dynamically typed as well.
Summary
Details and comments
Fix #15380.
No additional tests because the existing tests would already have caught this, except that we don't run them because of #15402. No release note because the bug isn't released.