Skip to content

Conversation

@yuancu
Copy link
Owner

@yuancu yuancu commented Jul 16, 2025

Description

Filter script pushdown with RelJson serialization in Calcite

Mirror of sql/pull/3859 to let copilot generate reviews

qianheng-aws and others added 16 commits June 23, 2025 18:23
Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
…string

Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
@yuancu yuancu requested a review from Copilot July 16, 2025 04:57
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds filter script pushdown using Calcite’s RelJson serialization, replaces the legacy expression engine with a compounded script engine, and updates both core and integration tests to reflect the new opensearch_compounded_script language.

  • Introduce RelJsonSerializer and OpenSearchRelInputTranslator for serializing/deserializing RexNodes and field types
  • Replace ExpressionScriptEngine with CompoundedScriptEngine and wire it up in SQLPlugin
  • Enhance PredicateAnalyzer to emit ScriptQueryExpression and mark pushdown contexts as SCRIPT
  • Update integration tests and expected outputs to use opensearch_compounded_script and handle timestamp wildcards

Reviewed Changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
plugin/src/main/java/org/opensearch/sql/plugin/SQLPlugin.java Switch to CompoundedScriptEngine using Calcite
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serialization/RelJsonSerializer.java New RelJson-based serializer for RexNode payloads
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/CompoundedScriptEngine.java Introduce new script engine delegating to V2 or Calcite
opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java Emit ScriptQueryExpression when analyzer falls back
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java Classify filter pushdown as SCRIPT when needed
integ-test/.../expectedOutput/* Updated JSON expectations for opensearch_compounded_script
Comments suppressed due to low confidence (2)

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serialization/RelJsonSerializer.java:33

  • There are no unit tests for RelJsonSerializer. Add tests that serialize and then deserialize representative RexNode trees and field type maps to verify round-trip fidelity.
public class RelJsonSerializer {

benchmarks/src/jmh/java/org/opensearch/sql/expression/operator/predicate/MergeArrayAndObjectMapBenchmark.java:21

  • [nitpick] The method name testMerge no longer reflects the use of MergeRuleHelper. Consider renaming it to benchmarkMergeRuleHelper for clarity.
  public void testMerge() {

return new ScriptQueryBuilder(
new Script(
DEFAULT_SCRIPT_TYPE, EXPRESSION_LANG_NAME, serializer.serialize(node), emptyMap()));
DEFAULT_SCRIPT_TYPE, COMPOUNDED_LANG_NAME, serializer.serialize(node), emptyMap()));
Copy link

Copilot AI Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pushed-down Script is created with an empty options map, so it defaults to the V2 engine instead of explicitly using Calcite. Include ENGINE_TYPE=CALCITE_ENGINE_TYPE in the options map to ensure the Calcite engine is used.

Suggested change
DEFAULT_SCRIPT_TYPE, COMPOUNDED_LANG_NAME, serializer.serialize(node), emptyMap()));
DEFAULT_SCRIPT_TYPE, COMPOUNDED_LANG_NAME, serializer.serialize(node),
ImmutableMap.of("ENGINE_TYPE", "CALCITE_ENGINE_TYPE")));

Copilot uses AI. Check for mistakes.

// Write bytes of all serializable contents
ByteArrayOutputStream output = new ByteArrayOutputStream();
ObjectOutputStream objectOutput = new ObjectOutputStream(output);
Copy link

Copilot AI Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Java ObjectOutputStream for envelope serialization can be slow and carries security risks. Consider switching to a pure JSON-based approach (e.g., Jackson) for the entire envelope and use try-with-resources to manage streams.

Copilot uses AI. Check for mistakes.
final RelDataTypeField field = rowType.getFieldList().get(input);
return rexBuilder.makeInputRef(field.getType(), input);
}
throw new RuntimeException("input field " + input + " is out of range");
Copy link

Copilot AI Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Throwing a raw RuntimeException is not descriptive. Consider using IndexOutOfBoundsException or IllegalArgumentException to better convey the nature of the error.

Suggested change
throw new RuntimeException("input field " + input + " is out of range");
throw new IndexOutOfBoundsException("input field " + input + " is out of range");

Copilot uses AI. Check for mistakes.
@yuancu yuancu closed this Jul 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants