Closed
As requested in #185 (comment)
I've previously avoided this for performance reasons: sort-by-column on a column without an index is likely to perform badly for hundreds of thousands of rows.
That's not a good enough reason to avoid the feature entirely though. A few options:
- Allow sort-by-column by default, give users the option to disable it for specific tables/columns
- Disallow sort-by-column by default, give users option (probably in metadata.json) to enable it for specific tables/columns
- Automatically detect if a column either has an index on it OR a table has less than X rows in it
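A rough sketch of how the automatic detection option might work, using only the standard library `sqlite3` module. The function name `sortable_columns` and the `max_rows` threshold are hypothetical, not anything that exists in datasette today. It treats a column as sortable if the table is small or if the column is the leading column of some index:

```python
import sqlite3


def sortable_columns(conn, table, max_rows=10000):
    """Return the set of column names it seems safe to offer a sort option for.

    Hypothetical helper: max_rows stands in for the "X rows" threshold.
    """
    quoted = '"' + table.replace('"', '""') + '"'
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")]

    # Count at most max_rows rows so the check itself stays cheap on big tables.
    count = conn.execute(
        f"SELECT count(*) FROM (SELECT 1 FROM {quoted} LIMIT ?)", (max_rows,)
    ).fetchone()[0]
    if count < max_rows:
        return set(columns)

    # Large table: only keep columns that lead an index.
    indexed = set()
    for index_row in conn.execute(f"PRAGMA index_list({quoted})").fetchall():
        index_name = '"' + index_row[1].replace('"', '""') + '"'
        info = conn.execute(f"PRAGMA index_info({index_name})").fetchall()
        if info:
            # Rows are (seqno, cid, name); the lowest seqno is the leading column.
            leading = min(info, key=lambda r: r[0])
            if leading[2] is not None:  # name is NULL for expression indexes
                indexed.add(leading[2])
    return indexed
```

One known gap in this sketch: an `INTEGER PRIMARY KEY` column is an alias for the rowid and sorts cheaply, but it does not show up in `PRAGMA index_list`, so it would need special handling.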
We already have the mechanism in place to cut off SQL queries that take more than X seconds, so if someone DOES try to sort by a column that's too expensive it won't actually hurt anything - but it would be nice to not show people a "sort" option which is guaranteed to throw a timeout error.
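For context, here is a minimal sketch of one way a query cutoff like this can be built on top of `sqlite3`'s progress handler. The `time_limit` name and the 1000-instruction interval are illustrative assumptions, not necessarily how the existing mechanism is implemented:

```python
import sqlite3
import time
from contextlib import contextmanager


@contextmanager
def time_limit(conn, ms):
    """Abort any query on conn that runs longer than ms milliseconds."""
    deadline = time.monotonic() + ms / 1000

    def handler():
        # A non-zero return value tells SQLite to interrupt the running query.
        return 1 if time.monotonic() >= deadline else 0

    # Call the handler every 1000 SQLite virtual machine instructions.
    conn.set_progress_handler(handler, 1000)
    try:
        yield
    finally:
        # Remove the handler so later queries are not affected.
        conn.set_progress_handler(None, 1000)
```

An expensive sort wrapped in `with time_limit(conn, 1000):` would then raise `sqlite3.OperationalError` instead of running indefinitely, which is the timeout error the sort UI should ideally avoid leading people into.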
The vast majority of datasette usage that I've seen so far is on smaller datasets where the performance penalties of sort-by-column are extremely unlikely to show up.
Still left to do:
- UI that shows which sort order is currently being applied (in HTML and in JSON)
- UI for applying a sort order (with rel=nofollow to avoid Google crawling it)
- Sort column names should be escaped correctly in generated SQL
- Validation that the selected sort order is a valid column
- Throw error if user attempts to apply _sort AND _sort_desc at the same time
- Ability to disable sorting (or sort only for specific columns) in metadata.json
- Fix "201 rows where sorted by sortable_with_nulls " bug
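The escaping, validation and `_sort`/`_sort_desc` conflict items above could be handled together in one place. A minimal sketch, assuming the list of valid column names for the table is already known; the `order_by_clause` function and its error messages are hypothetical:

```python
def order_by_clause(columns, sort=None, sort_desc=None):
    """Build an ORDER BY clause from the _sort / _sort_desc arguments.

    columns is the list of valid column names for the table being viewed.
    """
    if sort is not None and sort_desc is not None:
        raise ValueError("Cannot use _sort and _sort_desc at the same time")
    column = sort if sort is not None else sort_desc
    if column is None:
        return ""
    if column not in columns:
        raise ValueError("Cannot sort table by {}".format(column))
    # Quote as an SQL identifier, doubling any embedded double quotes.
    quoted = '"' + column.replace('"', '""') + '"'
    return "order by {}{}".format(quoted, " desc" if sort_desc is not None else "")
```

Checking against the known column list first means the quoting step is only a second line of defence, but it is still needed for legitimate column names that contain spaces or quotes.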