-
Notifications
You must be signed in to change notification settings - Fork 154
Add a new row count estimation mechanism for CalciteIndexScan #3605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
* The estimated row count of this index scan operator. Initialed with the value of setting {@link | ||
* QUERY_SIZE_LIMIT}, and will be updated by the push down actions when estimating. | ||
*/ | ||
private Double rowCount; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For scan operator, It should be index docs count? the query.size_limit configuration sets the maximum number of rows that a query can return, but not the index scan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, to be more precisely, it should be docs count.
But it still works fine here to give a constant value here, because:
- By default behavior without this PR, Calcite will also provide a constant value 100
- We only need a relative value instead of a precise value for the cost computing, especially for the single table scanning. In the cases of multi-table joining, it should make difference if we introduce optimization like join-reorder, map-join, but seems those optimization only make sense in distribution environment.
- Using the real doc count will introduce one more request but won't have much worth as said in 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a fixed value (e.g., 10K) or a different setting? Using query.size_limit, which is unrelated to cost estimation, can be misleading. We don't expect changes to query.size_limit to affect the query pushdown logic, correct?
private Double rowCount; | ||
|
||
/** The status to indicate whether the row count has been estimated. */ | ||
private boolean rowCountEstimated = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[question] why need this flag? what case will the scan be estimated multiple times? and how many times? Is it possible to remove the member variable rowCount
in class? The method estimateRowCount
should work independently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The estimateRowCount
will be called when:
- Computing the cost of this operator itself
- Computing the cost of its parent operator and the computing logic of that is based on the input row count. The times depends on how many parent operators will be generated during the optimization process. For example, when having plan
project-agg(avg)->scan
, the parent operators ofscan
will at least includesagg(avg)
,agg(sum/count)
,EnumAgg
and may beproject
if have optimization rule of pushingproject
throughagg
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, keep this flag would be fine. But is it possible to remove the member variable rowCount
in class? The method estimateRowCount()
should work independently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rowCount
and rowCountEstimated
is viewed as kind of cache for the final estimated row count.
It's OK to remove this cache since other operators don't have such cache.
Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Description
Add a new row count mechanism for
CalciteIndexScan
, which is a new TableScan implemented by us for opensearch index.Before this PR, this operator reused the same cost estimation logic as its parent class TableScan, which provided a constant value for its cost (i.e., 100 rows). However, since we also support pushing down operators like Project, Filter, Aggregation, etc. into this scan operator, and some operators actually affect the output rows of scan, we need a new implementation for this scan operator. Otherwise, in some cases, the Calcite optimizer will pick incorrect plans that don't actually have the lowest cost.
To estimate the cost of
CalciteIndexScan
more precisely, we need to imitate the row estimation logic implemented inRelMdRowCount
. To make the Calcite optimizer prefer plans with push down rather than without, we add a factor (0.9) multiplied to the original row count of scan.Background
Related Issues
Resolves
#3557
#3556
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.