Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Improve Doctest framework to sort result rows consistently #1096

Open
dai-chen opened this issue Nov 23, 2022 · 1 comment
Open
Labels
enhancement New feature or request infrastructure Changes to infrastructure, testing, CI/CD, pipelines, etc.

Comments

@dai-chen
Copy link
Collaborator

Is your feature request related to a problem?

Currently, Doctest module depends on backend and use the result set for comparison directly. Specifically, whenever there is filtering operation (SQL WHERE clause or PPL where command), sorting by _doc field is added automatically. This is supposed to be special case and moved to client side.

What solution would you like?

Sort the result rows in consistent way before formatting and comparison. This is similar as what our comparison test framework does. This approach may be favored over the alternative as follows.

What alternatives have you considered?

  1. Add ORDER BY clause or sort command to each SQL/PPL query to make sure consistent order in result set. This may complicate the sample query and thus make it unclear to users.
  2. Remove the default sorting and fix all doctest manually. However, the result maybe still not deterministic and the test may fail if the result rows is returned in different order.

Do you have any additional context?

Code: https://github.com/opensearch-project/sql/blob/2.x/opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java#L138

  public void pushDown(QueryBuilder query) {
    ...
    if (sourceBuilder.sorts() == null) {
      sourceBuilder.sort(DOC_FIELD_NAME, ASC); // Make sure consistent order
    }
  }

Doctest depends on the logic above:

    ppl_cmd.process('source=accounts | where account_number=1 or gender="F" | fields account_number, gender')Expected:
    fetched rows / total rows = 2/2
    +------------------+----------+
    | account_number   | gender   |
    |------------------+----------|
    | 1                | M        |
    | 13               | F        |
    +------------------+----------+
Got:
    fetched rows / total rows = 2/2
    +------------------+----------+
    | account_number   | gender   |
    |------------------+----------|
    | 13               | F        |
    | 1                | M        |
    +------------------+----------+
@dai-chen dai-chen added enhancement New feature or request infrastructure Changes to infrastructure, testing, CI/CD, pipelines, etc. labels Nov 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request infrastructure Changes to infrastructure, testing, CI/CD, pipelines, etc.
Projects
None yet
Development

No branches or pull requests

1 participant