[BugFix] Fix the bug when boolean comparison condition is simplified to field #5071
Signed-off-by: Songkan Tang <songkant@amazon.com>
Walkthrough (summary by CodeRabbit)
Convert boolean field predicates earlier in Calcite traversal and predicate analysis to emit exact boolean term or negated-term queries; add unit and integration tests, REST YAML test, and expected explain-plan YAMLs covering boolean pushdown cases.
Sequence Diagram(s)
sequenceDiagram
participant Client as Client
participant Planner as CalcitePlanner
participant Rex as CalciteRexNodeVisitor
participant Analyzer as PredicateAnalyzer
participant QExpr as QueryExpression
participant DSL as OpenSearch DSL
Client->>Planner: submit SQL with boolean predicate
Planner->>Rex: translate Rex nodes (compare / NOT)
Rex->>Planner: rewrite != / NOT -> IS_NOT_* when applicable
Planner->>Analyzer: analyzeExpression(filter)
Analyzer->>Analyzer: detect NamedFieldExpression.isBooleanType()
Analyzer->>QExpr: convert boolean field -> isTrue()/isFalse()/isNotTrue()/isNotFalse()
QExpr->>DSL: emit TermQuery or must_not TermQuery (combined with query_string)
DSL-->>Planner: return pushed-down DSL
Planner-->>Client: explain/execute with pushed-down boolean term
Estimated code review effort: 4 (Complex), ~45 minutes
Content-Type: 'application/json'
ppl:
  body:
    query: source=test-boolean | where is_internal=true | fields name
Use the failing query from #5054 here: source=test url=http | where is_internal=true
// Handle NOT(IS_TRUE(boolean_field)) - convert to term query with false value
// This covers cases where IS_TRUE was explicitly applied
if (expr instanceof SimpleQueryExpression simpleExpr && simpleExpr.isBooleanFieldIsTrue()) {
  return QueryExpression.create(simpleExpr.rel).isFalse();
}
- NOT (boolean_field = true) returns documents where the field is false, null, or missing,
- but boolean_field = false returns only documents where the field is false.
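To make the distinction in this comment concrete, here is a minimal, self-contained sketch. It does not use the OpenSearch client; `Doc`, `term`, and `mustNot` are hypothetical stand-ins that only model how a term query and a must_not clause select documents, with `null` standing for a null or missing field.

```java
import java.util.List;
import java.util.function.Predicate;

public class BooleanPushdownSemantics {
  // Hypothetical document model: null stands for a null or missing boolean field.
  public record Doc(String name, Boolean isInternal) {}

  // Models a term query: matches only documents whose field holds exactly `value`.
  public static Predicate<Doc> term(boolean value) {
    return d -> d.isInternal() != null && d.isInternal() == value;
  }

  // Models a bool query with one must_not clause: matches everything the inner query does not.
  public static Predicate<Doc> mustNot(Predicate<Doc> inner) {
    return inner.negate();
  }

  // Returns the names of the documents selected by the given query model.
  public static List<String> names(List<Doc> docs, Predicate<Doc> query) {
    return docs.stream().filter(query).map(Doc::name).toList();
  }

  public static void main(String[] args) {
    List<Doc> docs =
        List.of(new Doc("a", true), new Doc("b", false), new Doc("c", null));

    // boolean_field = false: only the document holding false.
    System.out.println(names(docs, term(false))); // [b]

    // NOT (boolean_field = true) as must_not: false plus null/missing.
    System.out.println(names(docs, mustNot(term(true)))); // [b, c]
  }
}
```

Under this model, rewriting NOT(IS_TRUE(field)) to a term query on false drops the null/missing documents, which is the behavior difference the comment points out.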
// generate a term query with value true.
// When called on an already-evaluated predicate (builder already set),
// return as-is.
if (builder == null) {
Is it possible to override the isTrue and not APIs for NamedFieldExpression instead of changing SimpleQueryExpression?
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5054.yml`:
- Around line 1-15: The test uses an index named "test" and currently doesn't
clean it up; update the YAML to ensure index isolation by adding explicit delete
steps for the "test" index in both the setup and teardown blocks (or replace
"test" with a generated unique name), e.g., add a do: delete index action before
the test runs and another delete after the test completes so the index cannot
leak state or conflict with other tests; reference the existing setup/teardown
blocks and the index name "test" when making these changes.
setup:
  - do:
      query.settings:
        body:
          transient:
            plugins.calcite.enabled: true

---
teardown:
  - do:
      query.settings:
        body:
          transient:
            plugins.calcite.enabled: false
Ensure index isolation by cleaning up test before/after use.
Right now the test can fail or leak state if an index named test already exists or is reused. Add a cleanup step (or use a unique index name) to keep this test independent.
🧹 Suggested cleanup (align with existing YAML REST test patterns)
 setup:
   - do:
       query.settings:
         body:
           transient:
             plugins.calcite.enabled: true
+  - do:
+      indices.delete:
+        index: test
+        ignore: 404
 ---
 teardown:
   - do:
       query.settings:
         body:
           transient:
             plugins.calcite.enabled: false
+  - do:
+      indices.delete:
+        index: test
+        ignore: 404
As per coding guidelines: Tests must not rely on execution order; ensure test independence.
Also applies to: 23-34
if (operand instanceof NamedFieldExpression namedField && namedField.isBooleanType()) {
  return booleanOp.apply(QueryExpression.create(namedField));
}
// IS_TRUE on a predicate (already evaluated QueryExpression) is allowed
Will other operations such as IS_FALSE, IS_NOT_TRUE, and IS_NOT_FALSE also apply to QueryExpression?
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java`:
- Around line 603-626: The boolean postfix handling currently calls
QueryExpression.isFalse()/isNotFalse()/isNotTrue() which overwrite any existing
builder; change the branch that handles operand instanceof QueryExpression qe to
avoid calling those mutators and instead preserve/wrap the existing builder: for
IS_TRUE/IS_NOT_FALSE return qe as-is, and for IS_FALSE/IS_NOT_TRUE return a
negated form of qe (implement QueryExpression.negate() or wrap qe.getBuilder()
into a BoolQuery with mustNot) so predicates like (age > 30) IS FALSE are
expressed by negating the existing predicate builder rather than replacing it
with a term query; keep the existing boolean-field handling
(NamedFieldExpression) unchanged.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java`:
- Around line 603-625: The boolean operator handling in PredicateAnalyzer
currently calls booleanOp.apply(qe) for any QueryExpression operand, but
CompoundQueryExpression does not implement isTrue/isNotTrue and thus calling
those methods throws; update the branch that handles operand instanceof
QueryExpression to detect CompoundQueryExpression (or other predicate
QueryExpression subclasses) and directly route IS_TRUE and IS_NOT_TRUE to the
predicate-handling path instead of invoking qe.isTrue/ qe.isNotTrue;
specifically, inside the if (operand instanceof QueryExpression qe) block check
if qe is a CompoundQueryExpression (or predicate-type) and for call.getKind() ==
IS_TRUE / IS_NOT_TRUE return the appropriate predicate query (the same output
produced for NamedFieldExpression boolean predicates) or otherwise fall back to
booleanOp.apply(qe) for supported QueryExpression implementations.
🧹 Nitpick comments (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java (1)
229-235: Consider recording analyzed nodes for the top-level boolean-field shortcut.
This keeps analyzedNodes consistent with the tryAnalyzeOperand path and improves downstream partial-pushdown bookkeeping.
🔧 Suggested tweak
-    if (result instanceof NamedFieldExpression namedField && namedField.isBooleanType()) {
-      return QueryExpression.create(namedField).isTrue();
-    }
+    if (result instanceof NamedFieldExpression namedField && namedField.isBooleanType()) {
+      QueryExpression qe = QueryExpression.create(namedField).isTrue();
+      qe.updateAnalyzedNodes(expression);
+      return qe;
+    }
// Handle boolean field operators: IS_TRUE, IS_FALSE, IS_NOT_TRUE, IS_NOT_FALSE
// These generate term queries for exact boolean value matching or mustNot queries
// for negated matching (which includes null/missing documents).
Function<QueryExpression, QueryExpression> booleanOp =
    switch (call.getKind()) {
      case IS_TRUE -> QueryExpression::isTrue;
      case IS_FALSE -> QueryExpression::isFalse;
      case IS_NOT_TRUE -> QueryExpression::isNotTrue;
      case IS_NOT_FALSE -> QueryExpression::isNotFalse;
      default -> null;
    };

if (booleanOp != null) {
  Expression operand = call.getOperands().get(0).accept(this);
  if (operand instanceof NamedFieldExpression namedField && namedField.isBooleanType()) {
    return booleanOp.apply(QueryExpression.create(namedField));
  }
  // Boolean operators on a predicate (already evaluated QueryExpression) are allowed
  if (operand instanceof QueryExpression qe) {
    return booleanOp.apply(qe);
  }
  throw new PredicateAnalyzerException(
      call.getKind() + " can only be applied to boolean fields or predicates");
Handle IS_TRUE / IS_NOT_TRUE on compound predicates to avoid unexpected exceptions.
CompoundQueryExpression doesn’t override isTrue/isNotTrue, so booleanOp.apply(qe) will throw and fall back to scripts even though predicates are “allowed” here. Consider routing these two operators directly for predicate operands.
🛠️ Suggested adjustment
 if (booleanOp != null) {
   Expression operand = call.getOperands().get(0).accept(this);
   if (operand instanceof NamedFieldExpression namedField && namedField.isBooleanType()) {
     return booleanOp.apply(QueryExpression.create(namedField));
   }
-  // Boolean operators on a predicate (already evaluated QueryExpression) are allowed
-  if (operand instanceof QueryExpression qe) {
-    return booleanOp.apply(qe);
-  }
+  if (operand instanceof QueryExpression qe) {
+    return switch (call.getKind()) {
+      case IS_TRUE -> qe;
+      case IS_NOT_TRUE -> qe.not();
+      case IS_FALSE, IS_NOT_FALSE ->
+          throw new PredicateAnalyzerException(
+              call.getKind() + " can only be applied to boolean fields");
+      default -> booleanOp.apply(qe);
+    };
+  }
   throw new PredicateAnalyzerException(
       call.getKind() + " can only be applied to boolean fields or predicates");
 }
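The routing proposed in this comment can be sketched in isolation. The following is only an illustration under stated assumptions: `Kind` is a hypothetical stand-in for the Calcite operator kinds named in the review, and a plain `java.util.function.Predicate` stands in for an already-built QueryExpression, so none of this is the plugin's actual API.

```java
import java.util.function.Predicate;

public class PostfixOnPredicate {
  // Hypothetical stand-in for the four postfix operator kinds discussed in the review.
  public enum Kind { IS_TRUE, IS_NOT_TRUE, IS_FALSE, IS_NOT_FALSE }

  // Applies a postfix boolean operator to an already-built predicate without
  // replacing it: IS_TRUE keeps it, IS_NOT_TRUE wraps it in a negation, and the
  // remaining two kinds are rejected because they need a concrete boolean field.
  public static <T> Predicate<T> apply(Kind kind, Predicate<T> pushed) {
    return switch (kind) {
      case IS_TRUE -> pushed;
      case IS_NOT_TRUE -> pushed.negate();
      case IS_FALSE, IS_NOT_FALSE ->
          throw new UnsupportedOperationException(
              kind + " can only be applied to boolean fields");
    };
  }

  public static void main(String[] args) {
    // Models (age > 30) where a null age represents a missing value.
    Predicate<Integer> over30 = age -> age != null && age > 30;

    System.out.println(apply(Kind.IS_TRUE, over30).test(40)); // true
    System.out.println(apply(Kind.IS_NOT_TRUE, over30).test(null)); // true
  }
}
```

Rejecting IS_FALSE and IS_NOT_FALSE here mirrors the suggested adjustment: a plain negation would also match documents where the predicate is unknown, so only the two operators whose meaning survives that treatment are routed through.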
Description
Fix the bug discovered in #5054. See root cause description in #5054 (comment)
Related Issues
Resolves #5054
Check List
--signoff or -s.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.