-
Notifications
You must be signed in to change notification settings - Fork 1.1k
fix(search_family): Fix SORTBY option in FT.SEARCH for non-sortable fields and KNN search #4942
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(search_family): Fix SORTBY option in FT.SEARCH for non-sortable fields and KNN search #4942
Conversation
…ields and KNN search fixes dragonflydb#4939, dragonflydb#4934 Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
lgtm |
it's quite a big change, while it's clear what it fixes, it's not clear what was missing or faulty and how it was fixed. |
@@ -50,6 +50,68 @@ void TraverseAllMatching(const DocIndex& index, const OpArgs& op_args, F&& f) { | |||
} while (cursor); | |||
} | |||
|
|||
bool IsSortableField(std::string_view field_identifier, const search::Schema& schema) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romange All the code here was simply moved. Previously, we had non-class functions defined in a non-anonymous namespace, so I moved them into an anonymous namespace.
using SortIndiciesFieldsList = | ||
std::vector<std::pair<string_view /*identifier*/, string_view /*alias*/>>; | ||
|
||
std::pair<SearchFieldsList, SortIndiciesFieldsList> PreprocessAggregateFields( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was moved. But I can revert this
if (params.sort_option) { | ||
auto field_alias = params.sort_option->field.NameView(); | ||
|
||
auto comparator = [&](const SerializedSearchDoc* l_doc, const SerializedSearchDoc* r_doc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All other updates are related to the issue with sorting the non-sortable fields. Maybe I can remove changes to the SearchField and use just string for SortParams. But it is not a big change
Here's what's happening: previously, sorting was performed inside the search logic itself, using a AST node — AstSortNode. This meant that sorting happened locally on each shard. And this will not work as needed. Actually, since we are sorting only inside shards I think we can find a test case where the final result is not globally sorted. I've now moved the sorting logic out of the shards into the coordinator thread — similar to how it's handled in the FT.AGGREGATE command |
For future, it is best to separate it into two PRs : the one that moves the code and does not change functionality and the one that changes functionality. otherwise it's a nightmare for a reviewer to review the code. |
comment: while I understand the bug, it is possible that sorting locally on each shard might still be beneficial in the future - because then you can merge-sort already sorted results coming from each shard in more efficient manner in |
The issue here is that we also support sorting for KNN search. In this case, we first perform KNN, then apply LIMIT and OFFSET (i.e., take a subarray of the sorted KNN results), and then sort that subarray by a field value. So, the possible solution here is to have separate logic for knn results and sort them in the coordinator thread. For basic sort we can still sort in the shard. |
auto l_it = l.find(field_alias); | ||
auto r_it = r.find(field_alias); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks really slow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix it for the next version.
And it is not very slow. Since we will need to allocate more memory to fix this
…ields and KNN search (#4942) * fix(search_family): Fix SORTBY option in FT.SEARCH for non-sortable fields and KNN search fixes #4939, #4934 Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io> * fix compilation error Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io> --------- Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io> Co-authored-by: Roman Gershman <roman@dragonflydb.io>
fixes #4939, #4934
Fixes two issues. They are good described in the tickets above.