Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimal IntervalSet type #193

Merged
merged 31 commits into from
Jun 30, 2022

Conversation

rofinn
Copy link
Member

@rofinn rofinn commented Jun 2, 2022

Currently added a simple IntervalSet type and most tests pass now. I've included example tests with both Vector{<:AbstractInterval} and IntervalSets to identify where they differ as union still works with both.

@codecov
Copy link

codecov bot commented Jun 2, 2022

Codecov Report

Merging #193 (e891f30) into revert-191-revert-179-dfl/set-operations (b6272ec) will decrease coverage by 0.46%.
The diff coverage is 86.07%.

@@                             Coverage Diff                              @@
##           revert-191-revert-179-dfl/set-operations     #193      +/-   ##
============================================================================
- Coverage                                     84.63%   84.16%   -0.47%     
============================================================================
  Files                                            12       12              
  Lines                                           794      840      +46     
============================================================================
+ Hits                                            672      707      +35     
- Misses                                          122      133      +11     
Impacted Files Coverage Δ
src/Intervals.jl 100.00% <ø> (ø)
src/deprecated.jl 14.87% <ø> (ø)
src/interval_sets.jl 91.91% <86.07%> (-3.86%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b6272ec...e891f30. Read the comment docs.

@rofinn rofinn marked this pull request as ready for review June 2, 2022 06:40
@rofinn rofinn requested a review from omus as a code owner June 2, 2022 06:40
@rofinn
Copy link
Member Author

rofinn commented Jun 2, 2022

GitHub actions seems to be borked.

Copy link
Collaborator

@omus omus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass review

src/interval_sets.jl Outdated Show resolved Hide resolved
src/interval_sets.jl Outdated Show resolved Hide resolved
src/interval_sets.jl Show resolved Hide resolved

# TODO: These functions should be compatible with unbounded intervals
if isbounded(earlier) && isbounded(later)
@test union([earlier], [later]) == [expected_superset]
@test intersect([earlier], [later]) == [expected_overlap]
@test union([earlier], [later]) != [expected_superset]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be updated with what is expected

Copy link
Member Author

@rofinn rofinn Jun 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're just gonna update all these tests then? I'll also note that the union([a], [b]) method was never something that we supported prior to 1.7. I feel like the fact that the majority of this PR involves updating tests suggests they may be too verbose.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've included both the previous expected values (which we reverted) and the current behaviour (ie: base methods) on one line to make reviewing a bit easier.

Copy link
Contributor

@haberdashPI haberdashPI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be good to add some tests in here for your use case as well

Yeah, this is a bit of a weird edge case where we'd need to test for a behaviour that wasn't defined here. I've become rather skittish of extending functions if there's concern that the new method doesn't align with the original concept/definition in Base.

I'm in agreement with @omus on this: I do see why this behavior is not something we would necessarily want to keep (it isn't really by design and I certainly didn't predict this use), and at some point this behavior should probably be deprecated. But it is the whole reason we have a breaking change that requires a revert of 1.7.0. Might as well be explicit about that in the tests, and then other folks can understand why things are set up this way for now. These tests can be marked as broken / deprecated when the change is actually made.

src/interval_sets.jl Outdated Show resolved Hide resolved
@rofinn
Copy link
Member Author

rofinn commented Jun 2, 2022

Thanks for the review! Folks are welcome to add commits to this branch while I tests these changes within our ecosystem to make sure we aren't missing any other breaking changes.

@omus
Copy link
Collaborator

omus commented Jun 2, 2022

Thanks for the review! Folks are welcome to add commits to this branch while I tests these changes within our ecosystem to make sure we aren't missing any other breaking changes.

Sounds good. I suspect I'll be making some additions next week

@haberdashPI
Copy link
Contributor

I'd be happy to add some more tests to this branch: though I don't think I have the appropriate permissions to push here.

@rofinn
Copy link
Member Author

rofinn commented Jun 2, 2022

Good news! Looks like these changes are no longer breaking for us, at least not in our production codebase. I'm not sure what extra tests we'd want to add as this is now often testing the base fallback behaviour we want to keep, but feel free to make a PR to this branch.

1. Made IntervalSet a subtype of AbstractSet
  - Note about changing internally representation once `union!(Vector{AbstractInterval})` is deprecated
2. Add an extra empty constructor
3. `==` falls back to issetequal
4. Fixed a bug where `union` didn't correctly copy data in the non-concrete case.
5. Updated a bunch of tests to include both the previous (reverted) expectation and the current base fallback behaviour.
Copy link
Member Author

@rofinn rofinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave this up to other folks to decide if they want add more tests.

https://en.wikipedia.org/wiki/Interval_arithmetic#Interval_operators
"""
struct IntervalSet{T<:AbstractInterval} <: AbstractSet{T}
# TODO: Use Dict internally once union!(Vector{AbstractInterval}) is fully deprecated
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In relation to a point made by @omus, we need to wrap a vector without mutating or copying in order to properly deprecated the old union! behaviour. Once that's gone we should be good to use a more appropriate internal type like a Dict.

src/interval_sets.jl Show resolved Hide resolved
test/comparisons.jl Show resolved Hide resolved
@haberdashPI
Copy link
Contributor

haberdashPI commented Jun 17, 2022

I'm not sure I agree with the TODO about using a Dict. I think ultimately, on construction of an IntervalSet, unbunch should be called to immediately represent sets in the way that is most efficient for downstream set operations, which all use mergesets. (At this point find_intersections would probably need refactoring).

@haberdashPI
Copy link
Contributor

I've added some additional proposed changes here:

#196

@haberdashPI
Copy link
Contributor

I made some further changes over in #196, because the internal function unbunch needs to support vectors as well as sets: find_intersections should not accept IntervalSet objects, because the input is ordered, and no intervals should ever be merged, regardless of future optimizations of this kind to IntervalSet.

@@ -427,16 +434,17 @@ Base.intersect(x::IntervalSet, y::IntervalSet) = mergesets((inx, iny) -> inx &&
Base.union(x::IntervalSet, y::IntervalSet) = mergesets((inx, iny) -> inx || iny, x, y)
Base.setdiff(x::IntervalSet, y::IntervalSet) = mergesets((inx, iny) -> inx && !iny, x, y)
Base.symdiff(x::IntervalSet, y::IntervalSet) = mergesets((inx, iny) -> inx ⊻ iny, x, y)
Base.issubset(x::T, y::IntervalSet{<:AbstractInterval{T}}) where {T} = any(i -> x in i, y.items)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this mean that if x doesn't have the same concrete type as the element type of y it will throw a method error? I think issubset(2, [1.0..3.0, 4.0..5.0]) should be true not a method error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In either case we should probably add a test for this case (around line 50 of tests/sets.jl), either to verify the method error, or check that it works.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. One thing I also noticed is that we aren't testing things like issubset(interval, intervals) and issubset(intervals, interval) which would fallback to calling the base setdiff and erroring. I'll go through and add more tests for the other set operations in case there are other incorrect fallback conditions.

Copy link
Contributor

@haberdashPI haberdashPI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@rofinn
Copy link
Member Author

rofinn commented Jun 27, 2022

@omus Are you good with these changes?

src/interval_sets.jl Outdated Show resolved Hide resolved
println(io, ":")
nrows = displaysize(io)[1]
half = fld(nrows, 2) - 2
if nrows ≥ n && half > 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't looked through the package, dunno if this is standardized in it? If not,

Suggested change
if nrows n && half > 1
if nrows >= n && half > 1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think is pretty common in this package as it uses a lot of unicode characters.

https://github.com/invenia/Intervals.jl/blob/master/src/interval.jl#L76

src/interval_sets.jl Outdated Show resolved Hide resolved
Co-authored-by: mattBrzezinski <matt.brzezinski@invenia.ca>
@rofinn
Copy link
Member Author

rofinn commented Jun 30, 2022

Alright, I think this has been through a pretty thorough review now. I'm gonna merge and tag this, so folks can start using the new type and give feedback. I think the separate type and explicit deprecations should make it pretty unlikely to be breaking.

@rofinn rofinn merged commit 718ed5a into revert-191-revert-179-dfl/set-operations Jun 30, 2022
@rofinn rofinn deleted the rf/intervalset-type branch June 30, 2022 18:51
@rofinn rofinn mentioned this pull request Jul 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants