Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default sorting algorithm should be a stable sort #433

Open
rakudrama opened this issue Nov 12, 2011 · 17 comments
Open

Default sorting algorithm should be a stable sort #433

rakudrama opened this issue Nov 12, 2011 · 17 comments
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-collection type-enhancement A request for a change that isn't a bug

Comments

@rakudrama
Copy link
Member

The documentation on List.sort does not specify if the sort is stable.
The implementation appears to be a variant of quicksort, which is not stable.
In my experience you want a stable sort. Either you can't tell the difference (e.g. sorting numbers) or you want a consistent result (e.g. sorting records in displayed table).

We should be making it easy for developers to build correct apps.
If people really want the extra few % in speed from a non-stable (unstable?) sort, make that the explicit choice.

@rakudrama
Copy link
Member Author

Removed Type-Defect label.
Added Type-Enhancement label.

@DartBot
Copy link

DartBot commented Nov 13, 2011

This comment was originally written by @ahmetaa


hint: Timsort

@DartBot
Copy link

DartBot commented Nov 1, 2013

This comment was originally written by Samuel.H...@gmail.com


This issue is open for a long time, will Dart provide stable sort or not?

@floitschG
Copy link
Contributor

We would like Dart to have a stable sort. The reason it doesn't have high priority is that we also want to use reduce code size when compiling to JavaScript and reuse JavaScript's sorting algorithm.
Unfortunately JS' algorithm is not specified to be stable (and afaik is not in practice).
If we changed Dart's algorithm to be stable there would be a discrepancy between Dart and Dart-Js code. It's not as if we didn't already have some of them, but in general we try to avoid them.
That said: there is nothing preventing us from changing Dart's algorithm to become stable at a later point (with or without a complying Dart-Js version). This is, why I expect this bug to stay open for some time.

@DartBot
Copy link

DartBot commented Nov 2, 2013

This comment was originally written by Samuel.Hap...@gmail.com


Thank you for the clarification!

Wouldn't it be possible to add another method called stableSort, that
would be stable in both Dart and Js?

It's implementation could be very simple:

List.stableSort(cmp) {
  for (var i = 0; i < length; i++) {
    this[i] = [this[i], i];
  }
  this.sort((a, b) {
    var result = cmp(a[0], b[0]);
    if (result != 0) {
      return result;
    } else {
      return a[1] - b[1];
    }
  });
  for (var i = 0; i < length; i++) {
    this[i] = this[i][0];
  }
}

Yes, it would be little bit slower, but not much and this can be
mentioned in documentation.

@floitschG
Copy link
Contributor

Eventually we want to change the standard sort to a stablesort. At that point it wouldn't make sense to have a second stableSort method anymore.

Until then it is simpler to just use a stable sort from a package.

@DartBot
Copy link

DartBot commented Nov 7, 2013

This comment was originally written by Samuel.Hapak...@gmail.com


But if I understand it correctly, you are not going to change it to stablesort, because javascript's sort isn't stable. Or am I missing something?

@floitschG
Copy link
Contributor

It depends on the number of affected users. Currently every user is shipped a dart2js version of the Dart program. Once/If that number drops low enough we can make the default stable and ship the sorting algorithm with the JS output. Only few users would be affected, then.

@DartBot
Copy link

DartBot commented Nov 8, 2013

This comment was originally written by Samuel.Ha...@gmail.com


I understand. However, the moment when the Dart will be supported in all
major browsers is still few years ahead of us.

Why won't support stableSort now and deprecate it few years later?

@floitschG
Copy link
Contributor

Because the work-around isn't that scarring: import stableSort from a package and use the static function: stableSort(list).

@DartBot
Copy link

DartBot commented Nov 8, 2013

This comment was originally written by Samuel....@gmail.com


Ok, and which package do you recommend?

@floitschG
Copy link
Contributor

@lrhn
Copy link
Member

lrhn commented Jan 22, 2014

The merge sort implementation has been moved to package:collection (i.e., import "package:collection/algorithms.dart" for mergeSort).

We have no current plans to change the default sorting algorithm, or to require it to be stable. One reason is indeed that it would preclude compiling to JavaScript and using a non-stable built-in sort in JS.


Added NotPlanned label.

@rakudrama
Copy link
Member Author

Eleven years later I think it is time to reconsider.

The fact that the default sort is not stable is an ongoing productivity tax.
I just spent two days debugging an issue that was, at its core, a failure to correctly compensate for an unstable sort.

The default sort should be stable because that improves developer productivity:

  • Naive users don't need to understand sorting stability.
  • Sophisticated users know that stable and unstable comparison sorts can both be executed in time O(N log N) and the constant factor is not all that different. The burden should be on the sophisticated user to choose an unstable sort to eke out a 20% gain rather than on everybody to worry about whether the sort compare-function makes the sort stable.
  • Test driven development rarely picks up on whether there is a sort stability issue since many comparison sorts use a stable sort for small inputs (In our case when N <= 32 a stable insertion-sort is used).

In the many years since our original discussion, JavaScript has standardized sort as stable, so that is no longer a reason to keep the Dart sort unstable (although and Dart on the web doesn't use the JavaScript version).

In other parts of our ecosystem we worry about sorting being stable (e.g. #42137).
Sort stability is a persistent problem and we should lift that worry from our user's shoulders.

@gvankeerberghen
Copy link

In case it helps people who are not sure of the implications or people who want to import and use mergeSort today, here is quick and telling example IMHO

import 'package:collection/collection.dart';

void main() {
  int n = 35;

  List<Offer> offers1 = List.generate(
    n,
    (index) => Offer(index % 2, index + 1),
  );

  List<Offer> offers2 = List.generate(
    n,
    (index) => Offer(index % 2, index + 1),
  );

  print('Original list:');
  for (var offer in offers1) {
    print('{ statusOrder: ${offer.statusOrder} - original order: ${offer.order} }');
  }
  print('---------');

  mergeSort(offers1, compare: (a, b) => a.statusOrder.compareTo(b.statusOrder));
  print('Sorted by status with mergeSort:');
  for (var offer in offers1) {
    print('{ statusOrder: ${offer.statusOrder} - original order: ${offer.order} }');
  }
  print('---------');

  offers2.sort((a, b) => a.statusOrder.compareTo(b.statusOrder));

  print('Sorted by status with List.sort:');
  for (var offer in offers2) {
    print('{ statusOrder: ${offer.statusOrder} - original order: ${offer.order} }');
  }
  print('---------');
}

class Offer {
  final int statusOrder;
  final int order;

  Offer(this.statusOrder, this.order);
}
# pubspec.yaml
name: sort_stability_check
description: A sample command-line application.
version: 1.0.0

environment:
  sdk: '>=2.19.2 <3.0.0'

dependencies:
  collection: ^1.17.1

dev_dependencies:
  lints: ^2.0.0
  test: ^1.21.0
$ dart sort_stability.dart 
Original list:
{ statusOrder: 0 - original order: 1 }
{ statusOrder: 1 - original order: 2 }
{ statusOrder: 0 - original order: 3 }
{ statusOrder: 1 - original order: 4 }
{ statusOrder: 0 - original order: 5 }
{ statusOrder: 1 - original order: 6 }
{ statusOrder: 0 - original order: 7 }
{ statusOrder: 1 - original order: 8 }
{ statusOrder: 0 - original order: 9 }
{ statusOrder: 1 - original order: 10 }
{ statusOrder: 0 - original order: 11 }
{ statusOrder: 1 - original order: 12 }
{ statusOrder: 0 - original order: 13 }
{ statusOrder: 1 - original order: 14 }
{ statusOrder: 0 - original order: 15 }
{ statusOrder: 1 - original order: 16 }
{ statusOrder: 0 - original order: 17 }
{ statusOrder: 1 - original order: 18 }
{ statusOrder: 0 - original order: 19 }
{ statusOrder: 1 - original order: 20 }
{ statusOrder: 0 - original order: 21 }
{ statusOrder: 1 - original order: 22 }
{ statusOrder: 0 - original order: 23 }
{ statusOrder: 1 - original order: 24 }
{ statusOrder: 0 - original order: 25 }
{ statusOrder: 1 - original order: 26 }
{ statusOrder: 0 - original order: 27 }
{ statusOrder: 1 - original order: 28 }
{ statusOrder: 0 - original order: 29 }
{ statusOrder: 1 - original order: 30 }
{ statusOrder: 0 - original order: 31 }
{ statusOrder: 1 - original order: 32 }
{ statusOrder: 0 - original order: 33 }
{ statusOrder: 1 - original order: 34 }
{ statusOrder: 0 - original order: 35 }
---------
Sorted by status with mergeSort:
{ statusOrder: 0 - original order: 1 }
{ statusOrder: 0 - original order: 3 }
{ statusOrder: 0 - original order: 5 }
{ statusOrder: 0 - original order: 7 }
{ statusOrder: 0 - original order: 9 }
{ statusOrder: 0 - original order: 11 }
{ statusOrder: 0 - original order: 13 }
{ statusOrder: 0 - original order: 15 }
{ statusOrder: 0 - original order: 17 }
{ statusOrder: 0 - original order: 19 }
{ statusOrder: 0 - original order: 21 }
{ statusOrder: 0 - original order: 23 }
{ statusOrder: 0 - original order: 25 }
{ statusOrder: 0 - original order: 27 }
{ statusOrder: 0 - original order: 29 }
{ statusOrder: 0 - original order: 31 }
{ statusOrder: 0 - original order: 33 }
{ statusOrder: 0 - original order: 35 }
{ statusOrder: 1 - original order: 2 }
{ statusOrder: 1 - original order: 4 }
{ statusOrder: 1 - original order: 6 }
{ statusOrder: 1 - original order: 8 }
{ statusOrder: 1 - original order: 10 }
{ statusOrder: 1 - original order: 12 }
{ statusOrder: 1 - original order: 14 }
{ statusOrder: 1 - original order: 16 }
{ statusOrder: 1 - original order: 18 }
{ statusOrder: 1 - original order: 20 }
{ statusOrder: 1 - original order: 22 }
{ statusOrder: 1 - original order: 24 }
{ statusOrder: 1 - original order: 26 }
{ statusOrder: 1 - original order: 28 }
{ statusOrder: 1 - original order: 30 }
{ statusOrder: 1 - original order: 32 }
{ statusOrder: 1 - original order: 34 }
---------
Sorted by status with List.sort:
{ statusOrder: 0 - original order: 23 }
{ statusOrder: 0 - original order: 3 }
{ statusOrder: 0 - original order: 5 }
{ statusOrder: 0 - original order: 13 }
{ statusOrder: 0 - original order: 7 }
{ statusOrder: 0 - original order: 9 }
{ statusOrder: 0 - original order: 11 }
{ statusOrder: 0 - original order: 1 }
{ statusOrder: 0 - original order: 15 }
{ statusOrder: 0 - original order: 17 }
{ statusOrder: 0 - original order: 33 }
{ statusOrder: 0 - original order: 25 }
{ statusOrder: 0 - original order: 29 }
{ statusOrder: 0 - original order: 35 }
{ statusOrder: 0 - original order: 31 }
{ statusOrder: 0 - original order: 21 }
{ statusOrder: 0 - original order: 27 }
{ statusOrder: 0 - original order: 19 }
{ statusOrder: 1 - original order: 6 }
{ statusOrder: 1 - original order: 20 }
{ statusOrder: 1 - original order: 16 }
{ statusOrder: 1 - original order: 22 }
{ statusOrder: 1 - original order: 14 }
{ statusOrder: 1 - original order: 24 }
{ statusOrder: 1 - original order: 12 }
{ statusOrder: 1 - original order: 26 }
{ statusOrder: 1 - original order: 10 }
{ statusOrder: 1 - original order: 28 }
{ statusOrder: 1 - original order: 8 }
{ statusOrder: 1 - original order: 30 }
{ statusOrder: 1 - original order: 4 }
{ statusOrder: 1 - original order: 32 }
{ statusOrder: 1 - original order: 2 }
{ statusOrder: 1 - original order: 34 }
{ statusOrder: 1 - original order: 18 }
---------

@lrhn
Copy link
Member

lrhn commented Nov 17, 2023

Merge Sort is the stable sort we're providing, in package:collection since it didn't need to be directly in the platform libraries.

We would probably not have made sort a member of the List interface if we added it today, because it's precisely not something that each list should need to implement itself. They all use the same implementation inherited from ListBase, so it could just have been an extension method.

Or, better, you'd choose your sorting algorithm depending on the characteristics of the problem. Sometimes you should minimize comparisons, other times moves.
Sometimes being stable is important, sometimes it is not.
One algorithm doesn't fit all.

But we do have a default sort, and it's not stable. It's fast and memory efficient, which is also great, just not of you want stable for some reason.

We could change the default algorithm.
But it'll have other tradeoffs then.

Or we could add a sortStable extension method. Should be discoverable, and you can choose it if you want it.

@rakudrama
Copy link
Member Author

Thought experiment: If the default sort was stable but used some extra storage, how many developers would import an in-place sort that was, say, 30% faster, but not stable?

(I would expect very few, and much later in the development cycle, when they have identified a performance bottleneck rather than up-front when working on correctness.)

Merge Sort is the stable sort we're providing, in package:collection since it didn't need to be directly in the platform libraries.

We would probably not have made sort a member of the List interface if we added it today, because it's precisely not something that each list should need to implement itself. They all use the same implementation inherited from ListBase, so it could just have been an extension method.

Binding to the specific List class has benefits.

It allowed dart2js to not have all Lists all use the same implementation. The dart2js version uses JavaScript's sort when the input is an ordinary List, which turned out to be 2x-3x faster than the version written in Dart (and as a bonus is stable!).

Or, better, you'd choose your sorting algorithm depending on the characteristics of the problem. Sometimes you should minimize comparisons, other times moves.
Sometimes being stable is important, sometimes it is not.
One algorithm doesn't fit all.

I think about stability an order of magnitude more often than comparisons vs moves.

I have inspected the compiled output of many large apps (Flutter and ACX). I see:

  • mergeSort and the default sort are both present in all apps.
  • They are called with the same API (i.e. mergeSort's nice facility to use a key selector and to sort a subrange go unused).

This indicates to me that if the default was stable, we would have one implementation of sorting in every app instead of two, improving the footprint of Dart apps. If every app needs a stable sort somewhere, that should be reused where any sort will do.

But we do have a default sort, and it's not stable. It's fast and memory efficient, which is also great, just not of you want stable for some reason.

It is not all that fast in practice. Indexing operations in the inner loops can't be devirtualized. Our mergeSort is similarly hobbled, so the default sort is probably faster than mergeSort. But I think it might be possible to write a stable sort that is faster in practice than the current default sort.

We could change the default algorithm. But it'll have other tradeoffs then.

Or we could add a sortStable extension method. Should be discoverable, and you can choose it if you want it.

Or make sort stable, and then you usually don't have to make a decision at all.

@lrhn lrhn added library-collection and removed closed-not-planned Closed as we don't intend to take action on the reported issue labels Nov 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-collection type-enhancement A request for a change that isn't a bug
Projects
None yet
Development

No branches or pull requests

6 participants