Skip to content

Add a new row count estimation mechanism for CalciteIndexScan #3605

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

qianheng-aws
Copy link
Collaborator

@qianheng-aws qianheng-aws commented May 6, 2025

Description

Add a new row count mechanism for CalciteIndexScan, which is a new TableScan implemented by us for opensearch index.

Before this PR, this operator reused the same cost estimation logic as its parent class TableScan, which provided a constant value for its cost (i.e., 100 rows). However, since we also support pushing down operators like Project, Filter, Aggregation, etc. into this scan operator, and some operators actually affect the output rows of scan, we need a new implementation for this scan operator. Otherwise, in some cases, the Calcite optimizer will pick incorrect plans that don't actually have the lowest cost.

To estimate the cost of CalciteIndexScan more precisely, we need to imitate the row estimation logic implemented in RelMdRowCount. To make the Calcite optimizer prefer plans with push down rather than without, we add a factor (0.9) multiplied to the original row count of scan.

Background

  • In Calcite, by default, the costs of an operator are estimated completely based on their row count, and it will never let the result of row count goes below 1.

Related Issues

Resolves
#3557
#3556

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
@qianheng-aws qianheng-aws changed the title Add a new row count mechanism for CalciteIndexScan Add a new row count estimation mechanism for CalciteIndexScan May 6, 2025
Signed-off-by: Heng Qian <qianheng@amazon.com>
@noCharger noCharger added the calcite calcite migration releated label May 6, 2025
* The estimated row count of this index scan operator. Initialed with the value of setting {@link
* QUERY_SIZE_LIMIT}, and will be updated by the push down actions when estimating.
*/
private Double rowCount;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For scan operator, It should be index docs count? the query.size_limit configuration sets the maximum number of rows that a query can return, but not the index scan.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, to be more precisely, it should be docs count.

But it still works fine here to give a constant value here, because:

  1. By default behavior without this PR, Calcite will also provide a constant value 100
  2. We only need a relative value instead of a precise value for the cost computing, especially for the single table scanning. In the cases of multi-table joining, it should make difference if we introduce optimization like join-reorder, map-join, but seems those optimization only make sense in distribution environment.
  3. Using the real doc count will introduce one more request but won't have much worth as said in 2.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a fixed value (e.g., 10K) or a different setting? Using query.size_limit, which is unrelated to cost estimation, can be misleading. We don't expect changes to query.size_limit to affect the query pushdown logic, correct?

private Double rowCount;

/** The status to indicate whether the row count has been estimated. */
private boolean rowCountEstimated = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[question] why need this flag? what case will the scan be estimated multiple times? and how many times? Is it possible to remove the member variable rowCount in class? The method estimateRowCount should work independently.

Copy link
Collaborator Author

@qianheng-aws qianheng-aws May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The estimateRowCount will be called when:

  1. Computing the cost of this operator itself
  2. Computing the cost of its parent operator and the computing logic of that is based on the input row count. The times depends on how many parent operators will be generated during the optimization process. For example, when having plan project-agg(avg)->scan, the parent operators of scan will at least includes agg(avg), agg(sum/count), EnumAgg and may be project if have optimization rule of pushing project through agg.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, keep this flag would be fine. But is it possible to remove the member variable rowCount in class? The method estimateRowCount() should work independently.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rowCount and rowCountEstimated is viewed as kind of cache for the final estimated row count.

It's OK to remove this cache since other operators don't have such cache.

Signed-off-by: Heng Qian <qianheng@amazon.com>
LantaoJin
LantaoJin previously approved these changes May 7, 2025
Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
Copy link
Collaborator

@penghuo penghuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
calcite calcite migration releated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants