Add Iterators.first and Iterators.last which don't throw for empty collections#37119
Add Iterators.first and Iterators.last which don't throw for empty collections#37119non-Jedi wants to merge 5 commits into
Conversation
On zulip, there was a thread about the fact that there's not a good
way of finding the first item matching a predicate in a non-indexable
iterable without manually writing out the for-loop. One possibility is
using the existing first function composed with Iterators.filter:
first(Iterators.filter(>(5), 1:10))
The problem with this approach is that it requires wrapping in a try
block if the predicate not being satisfied is a
possibility (Iterators.filter returns an empty collection). There
should be a way of expressing this without using exception-handling as
control flow or writing the for-loop manually.
Ideally, I think first and last should just return
Union{Some{T},Nothing} always, but that would be a breaking
change. So I see two obvious alternatives:
1. This commit adds an optional predicate argument to first and last
which is mapped over the collection and the first/last value
satisfying predicate is returned.
2. Add a Iterators.first/Iterators.last which are identical to the
versions in Base except that when successful they return Some{T}
and when unsuccesful return Nothing
ce131c1 to
39a39e9
Compare
Co-authored-by: Kristoffer Carlsson <kristoffer.carlsson@chalmers.se>
From code review. - Add ref links. - Avoid using |>. - Use , instead of ; between array elements. Co-authored-by: Alex Arslan <ararslan@comcast.net>
There was a problem hiding this comment.
Should the docstrings instead be combined with the existing docstring for first(coll) as e.g.:
"""
first([predicate=_ -> true,] coll)
...
"""Also, I really appreciate the detailed code review. It would be great to get some feedback on the overall concept as well.
mbauman
left a comment
There was a problem hiding this comment.
I'm not fully convinced about the functionality, but could be persuaded.
|
|
||
| Get the first element of `coll` satisfying `predicate` wrapped in [`Some`](@ref). | ||
|
|
||
| If no element of `coll` satisfies `predicate`, return `nothing`. |
There was a problem hiding this comment.
This is interesting — and deviates from the existing first which would throw an error. This gives me a bit of pause, but at a minimum it means that we cannot combine these docstrings:
julia> first(_->true, [])
julia> first([])
ERROR: BoundsError: attempt to access 0-element Array{Any,1} at index [1]There was a problem hiding this comment.
Good point on the docstrings.
| """ | ||
| function first(predicate, itr) | ||
| for x in itr | ||
| predicate(x) && return Some(x) |
There was a problem hiding this comment.
I'm not thrilled about Some here — I don't think we have a single API in base that returns Some. That said, this is unlike the other find/search APIs in that they return indices or RegexMatches or values from some other limited type set that does not include Nothing.
There was a problem hiding this comment.
I think I've convinced myself that at the very least this should be a separate function from first/last. Does Iterators.first (as suggested in OP) sound like a good name to you? In that case, we wouldn't have to create a predicate accepting version as it would compose with Iterators.filter. EDIT: this PR now implements Iterators.first instead.
In reality, I think all APIs that generically return an element of a collection but throw if the element isn't present should instead return Union{Some{T},Nothing}. This includes first, last, getindex, reduce, foldl, etc. This is obviously breaking; I should probably create a "Taking something seriously" issue to get feedback on that idea for julia 2.0.
Instead of creating a version of Base.first and Base.last accepting a predicate, create functions in the Iterators module. This allows easy composition with Iterators.filter without having to accept a predicate argument.
f34bdda to
16a19f9
Compare
|
Could someone educate me on the history of |
|
I'd like to bump this. The core functionality I'm trying to get into If the decision is that a function for getting the first element of a lazily filtered collection isn't necessary in |
|
I like trying to solve the stated problem
|
Not all iterables have meaningful indices (e.g.
I agree on the unsuitability of the |
|
So revisiting this in 2023, I think what I really want is for Maybe.jl (which didn't have If there's not appetite for a change like this, I'm happy to update this PR to rename the function |
|
This PR has had no activity for two years+, so I'll close it as stale. @non-Jedi If you're still interested in working on this PR, let me know, and I can re-open it. |
EDIT: This PR now implements option 2 discussed at the bottom of this post rather than option 1 which it originally implemented.
On zulip, there was a thread about the fact that there's not a good
way of finding the first item matching a predicate in a non-indexable iterable without
manually writing out the
for-loop(https://julialang.zulipchat.com/#narrow/stream/225540-gripes/topic/.60find.60.20in.20iterables). One
possibility is using the existing
firstfunction composed withIterators.filter:The problem with this approach is that it requires wrapping in a
tryblock if the predicate not being satisfied is a
possibility (
Iterators.filterreturns an empty collection). Thereshould be a way of expressing this without using exception-handling as
control flow or writing the
for-loop manually.Ideally, I think first and last should just return
Union{Some{T},Nothing}always, but that would be a breakingchange. So I see two obvious alternatives:
firstandlastwhich is mapped over the collection and the first/last value
satisfying predicate is returned wrapped in
Some.Iterators.first/Iterators.lastwhich are identical to theversions in Base except that when successful they return
Some{T}and when the collection is empty return
Nothing