Reindex: Remove ability to sort #47567

Open
@henningandersen

Description

A reindex request can specify a sort order. The documentation describes using this in combination with max_docs to extract either a specific or a random subset of data.

However, sorting is not compatible with the upcoming resilient reindex mechanism, which relies on sorting by seq_no. Any reindex request that sorts by anything other than seq_no first will not be resilient.
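To illustrate why ordering by seq_no matters for resuming, here is a toy model in Python. It is only a sketch under assumptions: the `copy_batch` and `copy_all` helpers, the in-memory document list, and the idea of persisting a single checkpoint number are all hypothetical and are not taken from the Elasticsearch implementation. The point is that if documents are always processed in seq_no order, one number is enough to know where to pick up after an interruption.

```python
def copy_batch(source_docs, dest, checkpoint, batch_size):
    """Copy up to batch_size docs whose seq_no is above the checkpoint.

    Docs are processed in ascending seq_no order, so the highest seq_no
    copied so far fully describes the progress made. Returns the new
    checkpoint (unchanged if nothing was left to copy).
    """
    pending = sorted(
        (doc for doc in source_docs if doc["seq_no"] > checkpoint),
        key=lambda doc: doc["seq_no"],
    )
    batch = pending[:batch_size]
    for doc in batch:
        dest[doc["id"]] = doc
    return batch[-1]["seq_no"] if batch else checkpoint


def copy_all(source_docs, dest, checkpoint=-1, batch_size=2):
    """Run batches from the given checkpoint until no progress is made."""
    while True:
        new_checkpoint = copy_batch(source_docs, dest, checkpoint, batch_size)
        if new_checkpoint == checkpoint:
            return checkpoint
        checkpoint = new_checkpoint


if __name__ == "__main__":
    # Hypothetical source data, deliberately stored out of seq_no order.
    docs = [{"id": f"doc{n}", "seq_no": n} for n in (3, 0, 4, 1, 2)]
    dest = {}

    # First run is interrupted after a single batch.
    saved = copy_batch(docs, dest, checkpoint=-1, batch_size=2)

    # A restarted run only needs the saved checkpoint to continue.
    copy_all(docs, dest, checkpoint=saved)
    print(sorted(dest))
```

With an arbitrary user-supplied sort, a single seq_no checkpoint would no longer describe which documents had already been copied, which is the incompatibility in this toy model.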

When copying the full data set, sorting makes no real difference: the end result is the same. Extracting subsets of data can likely be done by adding queries instead. To avoid cases where reindex is not resilient, I propose to deprecate sorting in reindex in 7.x and remove it in 8.0.
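As a rough sketch of what the change could look like for a user, the Python snippet below builds two reindex request bodies: one that uses sort with max_docs, and one that selects a subset with a range query instead. The index names (`logs`, `logs-recent`), the `timestamp` field, and the cutoff date are hypothetical. The two requests are not strictly equivalent: the first takes the top N documents by sort order, while the second takes every document matching the condition, so a suitable query has to be chosen per use case.

```python
import json


def reindex_with_sort(source, dest, sort_field, max_docs):
    """Request body that relies on sorting (the usage proposed for deprecation)."""
    return {
        "max_docs": max_docs,
        "source": {"index": source, "sort": {sort_field: "desc"}},
        "dest": {"index": dest},
    }


def reindex_with_query(source, dest, field, gte):
    """Request body that selects a subset with a range query and no sort."""
    return {
        "source": {"index": source, "query": {"range": {field: {"gte": gte}}}},
        "dest": {"index": dest},
    }


if __name__ == "__main__":
    # Hypothetical index and field names, for illustration only.
    sorted_body = reindex_with_sort("logs", "logs-recent", "timestamp", 1000)
    query_body = reindex_with_query("logs", "logs-recent", "timestamp", "2019-10-01")

    # Either body would be sent as the payload of POST _reindex.
    print(json.dumps(sorted_body, indent=2))
    print(json.dumps(query_body, indent=2))
```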

This issue is created to gather feedback on this proposal. If you rely on being able to sort while reindexing, please let us know here.

Labels

:Distributed Indexing/Reindex: Issues relating to reindex that are not caused by issues further down
>deprecation
Team:Distributed (Obsolete): Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
