Commit e14e0a6

RFC: Add the group_by and group_by_mut methods to slice
1 parent 74d4623 commit e14e0a6

File tree

1 file changed

+101
-0
lines changed

text/0000-group-by.md

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
1+
- Feature Name: group_by
2+
- Start Date: 2018-06-15
3+
- RFC PR:
4+
- Rust Issue:
5+
6+
# Summary
7+
[summary]: #summary
8+
9+
Provide an `Iterator` over a slice that produces non-overlapping runs of elements, delimited by a given predicate.
10+
11+
# Motivation
12+
[motivation]: #motivation
13+
14+
Adding this `Iterator` to the standard library will help people split slices using a custom predicate.
15+
This `Iterator` is implemented on generic slices for performance and flexibility. `GroupBy` implements `DoubleEndedIterator` without any overhead and does not need any allocation.
16+
17+
A similar method already exists in [the standard library, called `split`](https://doc.rust-lang.org/std/primitive.slice.html#method.split), but it removes the element that does the separation.
18+
This behavior is not always wanted, and it could be reproduced with `group_by` by skipping the first element of every group except the first.
19+
20+
In short, it is a more generic `split` method that covers more use cases.
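
To make the difference concrete, here is a small sketch of what the existing `split` does with its separators. It uses only `slice::split` from the standard library; the helper name `split_on_zero` and the sample values are assumptions made for illustration.

```rust
/// Splits `values` on zeros using the existing `slice::split`.
/// The separating zeros are dropped from the result, which is the
/// behavior the RFC contrasts with `group_by`.
/// `split_on_zero` is a hypothetical helper, not part of the RFC.
fn split_on_zero(values: &[i32]) -> Vec<&[i32]> {
    values.split(|x| *x == 0).collect()
}
```

Called on `[1, 2, 0, 3, 0, 4]`, this yields `[1, 2]`, `[3]` and `[4]`: the zeros are gone, whereas the proposed `group_by` would return every element as part of some group.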
21+
22+
Here is a loop that visits the first element of each group, based on the equality predicate:
23+
24+
```rust
let mut previous = None;
let mut iter = slice.iter();
while let Some(elem) = iter.next() {
    if previous.is_none() || previous != Some(elem) {
        previous = Some(elem);

        // do something here with `elem`: the first element of each group
    }
}
```
35+
36+
Using the `GroupBy` `Iterator` here instead yields all the elements of a group at once: each item is a slice covering one complete group, with less boilerplate:
37+
38+
```rust
for group in slice.group_by(|a, b| a == b) {
    // do something here with the `group` slice
}
```
43+
44+
# Guide-level explanation
45+
[guide-level-explanation]: #guide-level-explanation
46+
47+
If you want to split a slice into groups of elements, you can use the `GroupBy` `Iterator`. It lets you specify whether two consecutive elements belong to the same group: when the predicate you provide returns `false`, the slice is split at that point and a new group begins. A group is nothing more than a subslice of the base slice.
48+
49+
```rust
struct Human {
    age: u32,
    is_cool: bool,
}

let slice = /* a slice of humans */;

// we first group humans by coolness
for coolness_group in slice.group_by(|a, b| a.is_cool == b.is_cool) {
    // and we then group humans by age
    for age_group in coolness_group.group_by(|a, b| a.age == b.age) {
        // ...
    }
}
```
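
The nested grouping can be tried without the proposed method by using a stand-in. The sketch below assumes a free function `group_by` (hypothetical, built on `std::iter::from_fn`) in place of `slice.group_by(..)`, and a helper `summarize` that is also an assumption; it is not the implementation proposed by this RFC.

```rust
#[derive(Debug)]
struct Human {
    age: u32,
    is_cool: bool,
}

/// Hypothetical stand-in for the proposed `slice.group_by(pred)`:
/// yields consecutive runs for which `same_group` holds on every adjacent pair.
fn group_by<'a, T, P>(mut rest: &'a [T], mut same_group: P) -> impl Iterator<Item = &'a [T]>
where
    P: FnMut(&T, &T) -> bool,
{
    std::iter::from_fn(move || {
        if rest.is_empty() {
            return None;
        }
        // The run ends right after the first adjacent pair the predicate rejects.
        let len = rest
            .windows(2)
            .position(|w| !same_group(&w[0], &w[1]))
            .map_or(rest.len(), |i| i + 1);
        let (head, tail) = rest.split_at(len);
        rest = tail;
        Some(head)
    })
}

/// Illustrative helper: returns `(is_cool, age, count)` for each innermost group.
fn summarize(humans: &[Human]) -> Vec<(bool, u32, usize)> {
    let mut out = Vec::new();
    // first group humans by coolness
    for coolness_group in group_by(humans, |a, b| a.is_cool == b.is_cool) {
        // then group each coolness group by age
        for age_group in group_by(coolness_group, |a, b| a.age == b.age) {
            let first = &age_group[0]; // groups are never empty
            out.push((first.is_cool, first.age, age_group.len()));
        }
    }
    out
}
```

Only adjacent elements are compared, so humans with the same age that are not next to each other end up in different groups.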
65+
66+
# Reference-level explanation
67+
[reference-level-explanation]: #reference-level-explanation
68+
69+
[A basic implementation is available](http://github.com/Kerollmops/group-by). Note that it implements `DoubleEndedIterator`, and therefore provides the `next_back` and `rev` methods.
70+
71+
The implementation specified here is only available on slices, because doing the same on an arbitrary `Iterator` is less efficient: far fewer optimizations are available with a plain `Iterator`, and implementing `DoubleEndedIterator` on it would probably be painful.
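
As a rough illustration of why slices make this cheap, here is a minimal safe-code sketch of a slice-based `GroupBy` with both `next` and `next_back`. It is an assumption about how such an iterator could look, not the linked implementation: the field names and the `new` constructor signature are hypothetical.

```rust
/// Hypothetical sketch of a slice-based `GroupBy`; no allocation is needed,
/// each item is a subslice of the original slice.
pub struct GroupBy<'a, T, P> {
    slice: &'a [T],
    same_group: P,
}

impl<'a, T, P> GroupBy<'a, T, P>
where
    P: FnMut(&T, &T) -> bool,
{
    pub fn new(slice: &'a [T], same_group: P) -> Self {
        GroupBy { slice, same_group }
    }
}

impl<'a, T, P> Iterator for GroupBy<'a, T, P>
where
    P: FnMut(&T, &T) -> bool,
{
    type Item = &'a [T];

    fn next(&mut self) -> Option<&'a [T]> {
        if self.slice.is_empty() {
            return None;
        }
        let same_group = &mut self.same_group;
        // Scan adjacent pairs from the front; the first group ends
        // right after the first pair the predicate rejects.
        let len = self
            .slice
            .windows(2)
            .position(|w| !same_group(&w[0], &w[1]))
            .map_or(self.slice.len(), |i| i + 1);
        let (head, tail) = self.slice.split_at(len);
        self.slice = tail;
        Some(head)
    }
}

impl<'a, T, P> DoubleEndedIterator for GroupBy<'a, T, P>
where
    P: FnMut(&T, &T) -> bool,
{
    fn next_back(&mut self) -> Option<&'a [T]> {
        if self.slice.is_empty() {
            return None;
        }
        let same_group = &mut self.same_group;
        // Scan adjacent pairs from the back; the last group starts
        // right after the last pair the predicate rejects.
        let start = self
            .slice
            .windows(2)
            .rposition(|w| !same_group(&w[0], &w[1]))
            .map_or(0, |i| i + 1);
        let (head, tail) = self.slice.split_at(start);
        self.slice = head;
        Some(tail)
    }
}
```

Because a slice can be scanned from either end, `next_back` mirrors `next` almost line for line; a generic `Iterator` offers no such random access, which is the difficulty the paragraph above alludes to.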
72+
73+
# Drawbacks
74+
[drawbacks]: #drawbacks
75+
76+
It will add a new type for slices and will make the standard library grow.
77+
78+
# Rationale and alternatives
79+
[alternatives]: #alternatives
80+
81+
The current design incurs no real overhead, unlike one based only on generic `Iterator`s, and it needs no allocation at all. The `GroupBy` `Iterator` will have a companion named `GroupByMut`, and both will provide a `remainder` method ([following the same borrowing rules as `ExactChunks`/`ExactChunksMut`](https://github.com/rust-lang/rust/pull/51339)) that gives the remaining elements.
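
A mutable companion could be sketched as follows. This is only an assumption about its shape: the method is named `into_remainder` and consumes the iterator so that the returned `&'a mut [T]` cannot alias groups yielded later, and "remaining elements" is taken to mean the elements not yet yielded. The field names and the `new` constructor are hypothetical.

```rust
/// Hypothetical sketch of `GroupByMut`, written in safe code.
pub struct GroupByMut<'a, T, P> {
    slice: &'a mut [T],
    same_group: P,
}

impl<'a, T, P> GroupByMut<'a, T, P>
where
    P: FnMut(&T, &T) -> bool,
{
    pub fn new(slice: &'a mut [T], same_group: P) -> Self {
        GroupByMut { slice, same_group }
    }

    /// Consumes the iterator and returns the elements not yet yielded
    /// (assumed semantics of the "remainder").
    pub fn into_remainder(self) -> &'a mut [T] {
        self.slice
    }
}

impl<'a, T, P> Iterator for GroupByMut<'a, T, P>
where
    P: FnMut(&T, &T) -> bool,
{
    type Item = &'a mut [T];

    fn next(&mut self) -> Option<&'a mut [T]> {
        if self.slice.is_empty() {
            return None;
        }
        let same_group = &mut self.same_group;
        let len = self
            .slice
            .windows(2)
            .position(|w| !same_group(&w[0], &w[1]))
            .map_or(self.slice.len(), |i| i + 1);
        // Move the slice out of `self` so the yielded group keeps the
        // full `'a` lifetime instead of borrowing from the iterator.
        let slice = std::mem::take(&mut self.slice);
        let (head, tail) = slice.split_at_mut(len);
        self.slice = tail;
        Some(head)
    }
}
```

Each yielded group is a disjoint mutable subslice, so groups obtained earlier stay usable while iteration continues or after the remainder is taken.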
82+
83+
[The generic implementation on `Iterator` has been tested](https://git.phaazon.net/phaazon/group-by-rs/src/commit/3d3c6d80c02f1813ecc001b110a90392899d0f68), and its performance falls short of the slice-based one.
84+
85+
# Prior art
86+
[prior-art]: #prior-art
87+
88+
This is a useful function that is already present in the libraries of most other languages (e.g. [Haskell has `groupBy`](http://hackage.haskell.org/package/base-4.11.1.0/docs/Data-List.html#v:groupBy)).
89+
90+
Alongside `groupBy`, Haskell conveniently provides a `group` function for elements that implement `Eq`. The same behavior can be achieved here:
91+
92+
```rust
fn group_by_eq<T: Eq>(slice: &[T]) -> impl Iterator<Item=&[T]> {
    GroupBy::new(slice, PartialEq::eq)
}
```
97+
98+
# Unresolved questions
99+
[unresolved]: #unresolved-questions
100+
101+
In the standard library, when two implementations are nearly identical, macros are used to remove code duplication. We will need to declare a macro for `GroupBy` and `GroupByMut` that is generic over the pointer type used (e.g. `*const T` and `*mut T`).
