@@ -350,11 +350,23 @@ should move backward by 2 bytes before continuing. This ensure searching for `\u
350350
351351## Pattern API
352352
353- This RFC assumes a generalized pattern API which supports more than strings. If the pattern API is
354- not available, the new functions can take `&OsStr` instead of `impl Pattern<&OsStr>`, but this may
355- hurt future compatibility due to inference breakage.
356-
357- Assuming we do want to generalize the Pattern API, the implementor should note the issue of
353+ As of Rust 1.25, we can search a `&str` using a character, a character set or another string,
354+ powered by [RFC #528](https://github.com/rust-lang/rfcs/pull/528) a.k.a. “Pattern API 1.0”.
355+
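For reference, a minimal sketch of the three kinds of `&str` patterns mentioned above, using only the stable `str::find` method. The helper name `find_examples` is hypothetical and exists only for illustration.

```rust
/// Hypothetical helper: searches `haystack` with each of the three pattern
/// kinds accepted by the existing `&str` Pattern API and returns the byte
/// offset of the first match for each.
fn find_examples(haystack: &str) -> (Option<usize>, Option<usize>, Option<usize>) {
    // A single character as the pattern.
    let by_char = haystack.find('o');
    // A character set, written as a `&[char]` slice.
    let by_char_set = haystack.find(&['w', 'r'][..]);
    // Another string as the pattern.
    let by_str = haystack.find("wor");
    (by_char, by_char_set, by_str)
}
```

Calling `find_examples("hello world")` exercises all three pattern kinds against the same haystack.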
356+ There are some drafts to generalize this so that we could retain mutability and search in more types
357+ such as `&[T]` and `&OsStr`, as described in various comments
358+ (“[v1.5](https://github.com/rust-lang/rust/issues/27721#issuecomment-185405392)” and
359+ “[v2.0](https://github.com/rust-lang/rfcs/pull/1309#issuecomment-214030263)”). A proper RFC has not
360+ been proposed so far.
361+
362+ This RFC assumes the goal of generalizing the Pattern API beyond `&str` is accepted, enabling us
363+ to provide a uniform search API across different types of haystacks and needles. However, this RFC
364+ does not rely on a generalized Pattern API. If this RFC is stabilized without a generalized Pattern
365+ API, the new methods described in the [Guide-level explanation][guide-level-explanation] section can
366+ take `&OsStr` instead of `impl Pattern<&OsStr>`, but this may hurt future compatibility due to
367+ inference breakage if a generalized Pattern API is indeed implemented.
368+
369+ Assuming we do have a generalized Pattern API, the implementor should note the issue of
358370splitting a surrogate pair:
359371
3603721. A match which starts with a low surrogate will point to byte 1 of the 4-byte sequence
@@ -363,9 +375,14 @@ splitting a surrogate pair:
363375
364376Implementation should note these different offsets when converting between different kinds of
365377cursors. In the [`omgwtf8::pattern` module](https://docs.rs/omgwtf8/*/omgwtf8/pattern/index.html),
366- this behavior is enforced by using distinct types for the start and end cursors.
378+ based on the “v1.5” draft, this behavior is enforced by using distinct types for the start and end
379+ cursors.
380+
381+ The following outlines the generalized Pattern API which could work for `&OsStr`:
367382
368383```rust
384+ // in module `core::pattern`:
385+
369386pub trait Pattern<H: Haystack>: Sized {
370387 type Searcher: Searcher<H>;
371388 fn into_searcher(self, haystack: H) -> Self::Searcher;
@@ -380,6 +397,13 @@ pub trait Searcher<H: Haystack> {
380397 fn next_reject(&mut self) -> Option<(H::StartCursor, H::EndCursor)>;
381398}
382399
400+ pub trait ReverseSearcher<H: Haystack>: Searcher<H> {
401+ fn next_match_back(&mut self) -> Option<(H::StartCursor, H::EndCursor)>;
402+ fn next_reject_back(&mut self) -> Option<(H::StartCursor, H::EndCursor)>;
403+ }
404+
405+ pub trait DoubleEndedSearcher<H: Haystack>: ReverseSearcher<H> {}
406+
383407// equivalent to SearchPtrs in "Pattern API 1.5"
384408// and PatternHaystack in "Pattern API 2.0"
385409pub trait Haystack: Sized {
@@ -403,7 +427,7 @@ pub trait Haystack: Sized {
403427}
404428```
405429
406- For `&OsStr`, we define both `StartCursor` and `EndCursor` as `*const u8`.
430+ For the `&OsStr` haystack, we define both `StartCursor` and `EndCursor` as `*const u8`.
407431
408432The `start_to_end_cursor` function will return `cur + 2` if we find that `cur` points to the middle
409433of a 4-byte sequence.
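
The cursor adjustment described above can be sketched as follows. This is a simplified illustration, not the actual implementation: it uses byte indices into a slice instead of `*const u8`, assumes the input is well-formed, and assumes that a start cursor "in the middle" means byte 1 of the 4-byte sequence (as in item 1 of the list above). The helper name `is_byte_1_of_four_byte_seq` is hypothetical.

```rust
/// Hypothetical helper: returns true if `index` points to byte 1 of a
/// 4-byte sequence, i.e. the previous byte is a 4-byte lead byte
/// (0xF0..=0xF4) and the byte at `index` is a continuation byte.
fn is_byte_1_of_four_byte_seq(bytes: &[u8], index: usize) -> bool {
    index >= 1
        && index < bytes.len()
        && (0xF0..=0xF4).contains(&bytes[index - 1])
        && (bytes[index] & 0xC0) == 0x80
}

/// Index-based sketch of `start_to_end_cursor`: a start cursor at byte 1
/// of a 4-byte sequence corresponds to an end cursor at byte 3, so the
/// cursor moves forward by 2. All other cursors are returned unchanged.
fn start_to_end_cursor(bytes: &[u8], cur: usize) -> usize {
    if is_byte_1_of_four_byte_seq(bytes, cur) {
        cur + 2
    } else {
        cur
    }
}
```

For the bytes of `"a\u{10000}b"`, the 4-byte sequence occupies indices 1 to 4, so only a start cursor of 2 is shifted (to 4). Cursors on character boundaries are left as they are.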
@@ -452,7 +476,7 @@ match self.matcher.next_match() {
452476 ```
453477
454478As a workaround, we introduced `find_range` and `match_ranges`. Note that this is already a
455- problem to solve if we want to make `Regex` a pattern.
479+ problem to solve if we want to make `Regex` a pattern of strings.
456480
457481# Rationale and alternatives
458482[alternatives]: #alternatives