Skip to content

Support deleteat! #12

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Feb 7, 2021
Merged

Support deleteat! #12

merged 10 commits into from
Feb 7, 2021

Conversation

MasonProtter
Copy link
Contributor

@MasonProtter MasonProtter commented Feb 6, 2021

With this change

julia> a = CircularArray([1,2,3]);

julia> deleteat!(a, 5)
2-element CircularVector{Int64, Vector{Int64}}:
 1
 3

julia> b = CircularArray([1,2,3,4]);

julia> deleteat!(b, 1:5:10)
2-element CircularVector{Int64, Vector{Int64}}:
 3
 4

Closes #11

@codecov
Copy link

codecov bot commented Feb 6, 2021

Codecov Report

Merging #12 (04a0781) into master (9192767) will increase coverage by 1.44%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #12      +/-   ##
==========================================
+ Coverage   92.30%   93.75%   +1.44%     
==========================================
  Files           1        1              
  Lines          26       32       +6     
==========================================
+ Hits           24       30       +6     
  Misses          2        2              
Flag Coverage Δ
unittests 93.75% <100.00%> (+1.44%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/CircularArrays.jl 93.75% <100.00%> (+1.44%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9192767...04a0781. Read the comment docs.

@MasonProtter
Copy link
Contributor Author

MasonProtter commented Feb 6, 2021

Some thoughts:

(unique  sort  map)(i -> mod(i, eachindex(a.data)), inds)

can be made much faster for a mutable type inds suchas a vector by writing

(unique!  sort!  map)(i -> mod(i, eachindex(a.data)), inds)

but it'll fail on say a Tuple of inds. If a dependancy on BangBang.jl is permissable, we could write this like

(unique!!  sort!!  map)(i -> mod(i, eachindex(a.data)), inds)

and have the best of both worlds quite easily.

An even heavier dependancy, Trandsducers.jl might be able to fuse these operations in a more efficient manner too.

@Vexatos
Copy link
Owner

Vexatos commented Feb 6, 2021

I was actually planning on implementing that this weekend, thank you very much!

@Vexatos
Copy link
Owner

Vexatos commented Feb 6, 2021

I am not a fan of depending on random packages for functionality like this, especially not with a package this small. Ideally, this package would have no dependencies, the dependency on OffsetArrays.jl currently is only there because there is no other way in the current package system to be incompatible with older versions of it.

@MasonProtter
Copy link
Contributor Author

MasonProtter commented Feb 7, 2021

While I definitely disagree with this attitude, that's fully your prerogative and it's why I wrote this PR with the naive, out-of-place implementation initially.

I can add some methods for collection types that are known to be mutable which take advantage of mutation if you'd like, or can just leave it as is.

end

function Base.deleteat!(a::CircularVector, i::Integer)
deleteat!(a.data, mod(i, eachindex(a.data)))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to do this? Would be more consistent with other methods.

Suggested change
deleteat!(a.data, mod(i, eachindex(a.data)))
deleteat!(a.data, mod1(i, length(a.data)))

Copy link
Contributor Author

@MasonProtter MasonProtter Feb 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mine should work regardless of the index scheme a uses, whereas yours would assume a is always indexed from 1:N. Since CircularArray can store any abstract vector, I think it's good to be generic.

I would have included a OffsetArrays.jl testcase for this, but offsetarrays doesn't actually support deleteat! :(

@@ -65,4 +65,14 @@ function Base.showarg(io::IO, arr::CircularArray, toplevel)
# toplevel && print(io, " with eltype ", eltype(arr))
end

function Base.deleteat!(a::CircularVector, inds)
deleteat!(a.data, (unique ∘ sort ∘ map)(i -> mod(i, eachindex(a.data)), inds))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sadly fails if inds is a Tuple since sort isn't implemented for Tuples. It should be sort ∘ unique ∘ map, probably.

Copy link
Owner

@Vexatos Vexatos Feb 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I further managed to slightly improve performance by making this line more consistent with the rest, probably because of the removal of eachindex.

Suggested change
deleteat!(a.data, (unique sort map)(i -> mod(i, eachindex(a.data)), inds))
deleteat!(a.data, sort(unique(mod1.(inds, length(a.data)))))

Copy link
Contributor Author

@MasonProtter MasonProtter Feb 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhm, that's unfortunate that sort isn't supported for Tuple but makes sense. We could just handle Tuple separately I suppose.

Using unique before sort has it's own problems because we'd end up allocating an array...

julia> unique((1,2,3))
3-element Vector{Int64}:
 1
 2
 3

Could just not do the unique or sort and say it's the user's fault if they give indices that cause an error (this is what Base does, but it's easier to screw up the ordering here due to circularity)

Copy link
Owner

@Vexatos Vexatos Feb 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we do sort! ∘ unique? If unique always returns a mutable array, this should have the same number of allocations as before, or fewer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

@MasonProtter
Copy link
Contributor Author

MasonProtter commented Feb 7, 2021

Okay, so now we have an escape hatch for 1 based vectors that'll use mod1(i, length(a)) rather than mod(i, eachindex(a)). This should be totally decidable at compile time and optimized away for one based arrays:

julia> f(v) = firstindex(v) === 1 ? 1 : 2
f (generic function with 1 method)

julia> code_typed(f, (Vector{Int},))
1-element Vector{Any}:
 CodeInfo(
1 ─     Base.arraysize(v, 1)::Int64
└──     return 1
) => Int64

I also switched the order of the unique and sort for now so that Tuples work (added a test case). Allocating the temp is probably fine, since if one is really performance sensitive, they can always bypass it by operating on a.data directly.

@Vexatos
Copy link
Owner

Vexatos commented Feb 7, 2021

Makes me wonder if the shortcut to length is even necessary. Looking at the IR for an example

code_typed(deleteat!, (CircularArray{Int64,1,Array{Int64,1}}, Tuple{Int64,Int64,Int64}))

for two different implementations

function Base.deleteat!(a::CircularVector, inds)
    jnds = firstindex(a) === 1 ? mod1.(inds, length(a)) : map(i -> mod(i, eachindex(a.data)), inds)
    deleteat!(a.data, sort!(unique(jnds)))
    a
end

# ====

function Base.deleteat!(a::CircularVector, inds)
    deleteat!(a.data, sort!(unique(map(i -> mod(i, eachindex(a.data)), inds))))
    a
end

BenchmarkTools produces the same number of allocations and speed for both implementations, so everything appears to get optimized away anyway.

@Vexatos
Copy link
Owner

Vexatos commented Feb 7, 2021

The problem with using eachindex is that it's not guaranteed to work for all array types since, I think, eachindex might not be guaranteed to return an AbstractUnitRange, which mod requires. length is currently used by other parts of this package as well.

eachindex does seem to default to Base.axes1 which is what this package used to use before #10, but only for arrays where IndexStyle(arr) == IndexLinear(). For IndexCartesian(), eachindex appears to produce a CartesianIndices which won't work with mod. This can happen even with one-dimensional arrays.

@Vexatos
Copy link
Owner

Vexatos commented Feb 7, 2021

Testing further, forcing linear indexing with eachindex(IndexLinear(), a) might work. It's still not guaranteed to produce an AbstractUnitRange, though. However, it defaults to axes1 for AbstractVector and Base.OneTo(length(a)) for AbstractArray, and might be better than length.

@MasonProtter
Copy link
Contributor Author

The problem with using eachindex is that it's not guaranteed to work for all array types since, I think, eachindex might not be guaranteed to return an AbstractUnitRange, which mod requires. length is currently used by other parts of this package as well.

This is technically true, but in practice, an T <: AbstractArray for which eachindex(::T) :: StepRange will not be properly supported in a ton of generic code at this point, whereas most people have come to expect offset, UnitRange indexing to be generic.

One thing we could do is just put a note in a docstring or something and say that if eachindex of your array does not give a AbstractUnitRange, you will need to either define mod(x, ::YourRangeType), or define your own deleteat!.

Testing further, forcing linear indexing with eachindex(IndexLinear(), a) might work. It's still not guaranteed to produce an AbstractUnitRange, though. However, it defaults to axes1 for AbstractVector and Base.OneTo(length(a)) for AbstractArray, and might be better than length.

I suspect that if some wacky type is a AbstractVector, but uses IndexCartesian, then there's probably going to be other issues, but yeah, doing eachindex(IndexLinear(), a) sounds like a good idea.

@Vexatos
Copy link
Owner

Vexatos commented Feb 7, 2021

This looks good to me now. I'll merge this and add the IndexLinear myself. Thank you once again for working on it.

@Vexatos Vexatos merged commit fb02f80 into Vexatos:master Feb 7, 2021
@MasonProtter MasonProtter deleted the patch-1 branch February 7, 2021 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement deleteat!
2 participants