-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Add sortperm with dims arg for AbstractArray, fixes #16273 #45211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add sortperm with dims arg for AbstractArray, fixes #16273 #45211
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contributions! It will be nice to close this hole in the sorting interface.
While we no longer have sortrows and sortcolumns, we do have sortslices, so to close #9258, we would need sortperm for slices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added several small style suggestions, but I also have a larger question of whether we can utilize sort!(ix, order=..., dims=dims)
without sacrificing efficiency?
Thanks for catching all those!
Yep! it seems if we use your change to benchmarksfunction sortperm_new(A::AbstractArray; dims::Integer)
Rpre = CartesianIndices(size(A)[1:dims-1])
Rpost = CartesianIndices(size(A)[dims+1:end])
ix = Base.copymutable(LinearIndices(A))
ord = Base.Perm(
Base.ord(Base.isless, identity, nothing, Base.Order.Forward),
vec(reshape(A, (:, 1)))
)
_sortperm_new!(ix, A, Rpre, dims, Rpost, ord)
return ix
end
@inline function _sortperm_new!(ix::AbstractArray, A::AbstractArray, Rpre, dims, Rpost, order)
for Ipost in Rpost, Ipre in Rpre
sort!(view(ix, Ipre, :, Ipost); order)
end
end
function sortperm_no_loop(A::AbstractArray; dims::Integer)
ix = Base.copymutable(LinearIndices(A))
order = Base.Perm(
Base.ord(Base.isless, identity, nothing, Base.Order.Forward),
vec(reshape(A, (:, 1)))
)
sort!_new(ix; dims, order)
return ix
end
function sort!_new(A::AbstractArray;
dims::Integer,
alg::Base.Algorithm=Base.Sort.defalg(A),
lt=isless,
by=identity,
rev::Union{Bool,Nothing}=nothing,
order::Base.Ordering=Base.Forward) where {K}
_sort!(A, Val(dims), alg, Base.ord(lt, by, rev, order))
end
function _sort!(A::AbstractArray, ::Val{K}, alg::Base.Algorithm, order::Base.Ordering) where {K}
nd = ndims(A)
1 <= K <= nd || throw(ArgumentError("dimension out of range"))
remdims = ntuple(i -> i == K ? 1 : axes(A, i), nd)
for idx in CartesianIndices(remdims)
Av = view(A, ntuple(i -> i == K ? Colon() : idx[i], nd)...)
sort!(Av, alg, order)
end
A
end
using BenchmarkTools
function test()
for dims in 1:3
a = rand(10, 10, 10)
@info "dims = $dims"
@info "PR"
@btime sortperm_new($a; dims=$dims)
@info "w/ sort!"
@btime sortperm_no_loop($a; dims=$dims)
perm_1 = sortperm_new(a; dims=dims)
perm_2 = sortperm_no_loop(a; dims=dims)
@assert perm_1 == perm_2
end
end
test() |
Thanks for catching all those!
Yes it seems if we use your change to benchmarksfunction sortperm_new(A::AbstractArray; dims::Integer)
Rpre = CartesianIndices(size(A)[1:dims-1])
Rpost = CartesianIndices(size(A)[dims+1:end])
ix = Base.copymutable(LinearIndices(A))
ord = Base.Perm(
Base.ord(Base.isless, identity, nothing, Base.Order.Forward),
vec(reshape(A, (:, 1)))
)
_sortperm_new!(ix, A, Rpre, dims, Rpost, ord)
return ix
end
@inline function _sortperm_new!(ix::AbstractArray, A::AbstractArray, Rpre, dims, Rpost, order)
for Ipost in Rpost, Ipre in Rpre
sort!(view(ix, Ipre, :, Ipost); order)
end
end
function sortperm_no_loop(A::AbstractArray; dims::Integer)
ix = Base.copymutable(LinearIndices(A))
order = Base.Perm(
Base.ord(Base.isless, identity, nothing, Base.Order.Forward),
vec(reshape(A, (:, 1)))
)
sort!_new(ix; dims, order)
return ix
end
function sort!_new(A::AbstractArray;
dims::Integer,
alg::Base.Algorithm=Base.Sort.defalg(A),
lt=isless,
by=identity,
rev::Union{Bool,Nothing}=nothing,
order::Base.Ordering=Base.Forward) where {K}
_sort!(A, Val(dims), alg, Base.ord(lt, by, rev, order))
end
function _sort!(A::AbstractArray, ::Val{K}, alg::Base.Algorithm, order::Base.Ordering) where {K}
nd = ndims(A)
1 <= K <= nd || throw(ArgumentError("dimension out of range"))
remdims = ntuple(i -> i == K ? 1 : axes(A, i), nd)
for idx in CartesianIndices(remdims)
Av = view(A, ntuple(i -> i == K ? Colon() : idx[i], nd)...)
sort!(Av, alg, order)
end
A
end
using BenchmarkTools
function test()
for dims in 1:3
a = rand(10, 10, 10)
@info "dims = $dims"
@info "PR"
@btime sortperm_new($a; dims=$dims)
@info "w/ sort!"
@btime sortperm_no_loop($a; dims=$dims)
perm_1 = sortperm_new(a; dims=dims)
perm_2 = sortperm_no_loop(a; dims=dims)
@assert perm_1 == perm_2
end
end
test() |
Weird,
Can you reproduce that problem by any chance? |
93920e1
to
1ed2373
Compare
Moving the function _sort!(A::AbstractArray, alg::Algorithm, order::Ordering, ::Val{K}) where K fixes the issue, and it builds now. I'll open an issue about that when I get the chance. |
Exports from Base aren't automatically available in Sort.jl due to bootstrapping. You need to add |
Ah I did not know that. Thanks. It's odd that it worked with |
I think the ci build still failed. |
fe6c480
to
ed09ca8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks nice!
And very general. I wonder if it would be possible to merge this function with the sortperm(v::AbstractVector; kw...)
function and handle the AbstractVector
case and the AbstractArray
case simultaneously.
Would it also be convenient to add sortperm!
with dims
arg here or is that out of scope?
I think we would have to refactor
Yep added. Thanks for the other comments as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise we need to have two cases which each call sort! with and without dims depending on the type of the input.
Perhaps we could accept dims
as a vararg kw...
, error if (A <: AbstractVector) == (dims in keys(kw))
, and pass kw...
on to sort!
Hi @pcjentsch, I'm going away for the next couple of weeks so I won't be able to continue with this until early June. Feel free to ping oscardssmith, nalimilan, vtjnash, or mbauman if you want someone to review it before then. |
thanks for letting me know Lilith, @oscardssmith I would appreciate a review if you have the chance! |
The implementation looks good. There is some formatting changes that I'm not sure how I feel about, but I don't really care. |
what else is needed for merge here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for your perseverance, @pcjentsch!
Sorry I was away from this PR for so long, thanks for the reminders, @pcjentsch and @oscardssmith. The functionality looks good to me too!
I'm not a fan of some of the formatting changes, but also don't really care.
One thing that should be changed before merging are the docstrings: I recommend revising the existing docstrings for sortperm[!](v::AbstractVector...)
rather than replacing them with the docstrings for sortperm[!](A::AbstractArray...)
.
You will also need to resolve the merge conflicts since these files have changed since you opened this PR. Let me know if you have any confusion about that, or feel free to just guess and I can review the merge.
b710728
to
8cae75a
Compare
8cae75a
to
f8966c8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Just some minor formatting issues now.
base/sort.jl
Outdated
|
||
# Examples | ||
```jldoctest | ||
julia> v = [3, 1, 2]; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should keep these newlines to match REPL formatting. I care about style here because it is public facing documentation.
Co-authored-by: Lilith Orion Hafner <60898866+LilithHafner@users.noreply.github.com>
Aside from the new merge conflicts (which I believe should be straightforward), some style changes I'm mixed about but that I also don't care enough about to object to, and the freebsd64 CI fail which looks unrelated, this looks good to me! |
Alright, merge conflicts sorted. I guess "style changes" refers to the spacing between operators? I reverted those. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are primarily what I meant by "style changes", but I now realize that we also need to pass through workspace
which is more than a style change.
Co-authored-by: Lilith Orion Hafner <60898866+LilithHafner@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thanks, @pcjentsch!
The apple test received a cancelation signal and the win32 test got connection reset by peer. Rerunning failed builds just to be sure. |
Oops! Sorry about that. |
Update: I can't figure out how to rerun the builds/get CI to pass cleanly, but I'm pretty sure the failures do not have to do with this PR. @oscardssmith, can you confirm? |
the CI failures do look unrelated. I think we can merge this. |
I agree that we can merge this. |
Thanks, @pcjentsch! |
cool, thank you @oscardssmith and especially @LilithHafner for your thorough reviews! |
This PR adds a
dims
argument tosortperm
, mirroring the API forsort
, and closing issue #16273.Given an
A::AbstractArray
, the function satisfiesA[sortperm(A, dims = dim)] == sort(A, dims = dim)
.Credit to @timholy, I used his idea and blog post in this implementation. I replaced the now-deprecated
sub2ind
withLinearIndices
, which is just as fast (benchmarked againstBase._sub2ind
).In that issue, Steven G. Johnson also suggests a special case for
dims=1
, but my benchmarks suggest the CartesianIndices approach is now faster.The related issue #9258 can also be closed I think, as
sortrows
andsortcols
are no longer in Base.