-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move sorting from base to stdlib #46679
Conversation
5dff493
to
d56e839
Compare
Regarding the bootstrapping sort, could you clarify the claim of comb sort having O(n log n) complexity? Pretty sure you're wrong, assuming you meant worst-case complexity, it seems that your variant (with a geometric sequence of increments) has a quadratic lower bound on the worst case complexity, although better worst-case complexity is possible with other sequences of increments (p. 16): https://homepages.cwi.nl/~paulv/papers/sorting.pdf Perhaps you should fix the commit message and doc-comment? On the other hand, why not just use the standard insertion sort or some Shell sort? |
If the decision is based around ease of implementation, would a heap sort be good for the compiler? I'm pretty sure we already have some stuff in the compiler for manipulating heaps. |
Thanks for sharing the interesting reference. It's indeed not n log n. Please note that the final stage of the algorithm is insertion sort itself, the first stage supposedly improves on it. The main advantage of this algorithm is that it enables vectorization optimizations by the compiler. Complexity might not matter so much in the stated context of small lists. On the other hand, maybe the lists are so small that straight insertion sort might be good enough? I'm not sure @LilithHafner tried that. |
Heap sort nicely guarantees n log n complexity, but the main thing about comb sort is the vectorization, like I said before. Heapsort might not be the best for small lists as well. Also, might not be as easy to implement, but if there are functions available to do it in-place, maybe... Otherwise I'd say either stick to a simpler algorithm or just go ahead with a full-on smoothsort! |
Would someone please help me with getting the docs to build? I'm having trouble figuring it out. |
Can you rebase this branch to master and resolve the conflict? Happy to try help on the doc failure (but maybe it just goes away)? |
The documentation issues are now resolved, it's possible there is a more elegant way, but the current situation is possible and does not require making any changes to I believe all prior comments have been addressed, but this PR has yet to receive a comprehensive review. |
Bump <3 |
imo this looks very good. would it be possible to separate this info 2 PRs, one with all the changes to Core and the other which moves to stdlib? since there are non-trivial algorithm changes there, it would be good to get those looked at by someone familiar with those parts of the compiler. |
What is the motivation for this? #45222 doesn't really explain why this is needed. Will you not be able to sort with vanilla Julia now? I imaging sorting is way more common than taking a mean, and mean is moving back in because the people want it. Also, this branch doesn't build without Sort in the sysimg. |
Sort will stay in the sysimage and The primary motivation from #45222 is to let Sort depend on Random. Another perk of dropping the compiler's dependence on Sort is to let Sort use |
What conceptually is the line between what sorting functionality is available in Base and what would require loading the |
Not sure how different this would be compared to turning something from a stdlib to a normal package. Specially if the API remains the same and no extra imports/usings are needed. |
That is not the case right now though, Julia doesn't even compile ( |
It's quite different since you can access |
I'm not sure why moving it to a stdlib is so necessary (outside of the Random dep). If the issue is bootstrap. Could we move the new sorting code to late in the bootstrap after the other things have been defined and keep the minimal sorting things in the compiler? |
I'm in favor it in principle just because I think that any complex functionality that can be moved out should be moved out. However, it remains unclear if we can actually meaningfully move much sorting functionality out. |
Everything can be done without loading the @fredrikekre, could you explain what you mean by "Julia doesn't even compile"? CI builds fine, and when I run git clone --depth=1 --branch=remove-compiler-sort https://github.com/JuliaLang/julia.git
cd julia
make
./julia
sort(rand(10))
include("stdlib/Sort/test/runtests.jl") everything works fine. |
|
Sorry, I still don't understand. I'm not aware of any plans to remove sorting from the sysimage. |
In general, the expectation is that unless you depend on a stdlib, your package should be usable without that stdlib being present (even in the sysimage). There are currently a few exceptions to that for stdlibs that do type piracy:
It would IMO be good not to introduce more cases like that since it makes filtering out stdlibs unsafe (e.g. via PackageCompiler). |
@LilithHafner Note that currently the compiler sort is broken for large sizes. I'm going to steel the part of this PR that will fix this (i.e. the part that uses heapsort) to make into a separate PR to fix the regression. |
Once sorting and sorting optimizations are modular and extensible, it should be possible to move short integer range optimizations, the Sort.Float submodule, and radix sort out of base. If we are willing to let Should we defer this PR until sorting has gone through that refactoring? |
The only way I can think of to do this is to have a default implementation of Seems like a whole lot of complexity in a roundabout way for not much gain. |
This PR eliminates all three current uses of
sort
in the compiler by merging PRs for the first two and inlining the third use.Then it eliminates all but one use of
sort!
in base and replaces that final use with a use of the new special purposeCore.Compiler.sort!
which is CombSort (thanks, @nlw0), a small, efficient, unstable, n log n sorting algorithm. And finally, it moves base/sort.jl to stdlibs/Sort/src/Sort.jl and the sorting code from base/range.jl into stdlibs/Sort/src/ranges.jl.Finally, it takes advantage of the Sort not participating in bootstrapping and removes some hacks
unsigned
This will allow us to avoid many of the gymnastics in #45222 needed to accommodate bootstrapping. It might also reduce the sysimage/compilecache size a wee bit, but I haven't checked.
Supersedes #45584
Closes #46586
Closes #46587
Closes #46507
Is needed by #45222