Skip to content

Conversation

@yuancu
Copy link
Collaborator

@yuancu yuancu commented Nov 19, 2025

Description

Because nested fields are indexed as hidden documents, we cannot query them directly. Instead, we have to use the nested query to access them.

For example, for the following mapping:

{
  "mappings": {
    "properties": {
      "id": {"type": "keyword"},
      "items": {
        "type": "nested",
        "properties": {
          "name": {"type": "keyword"}
        }
      }
    }
  }
}

if we want to use term query to match those items with name banana, instead of using:

{
  "query": {
    "term": {
      "items.name": {
        "value": "banana",
        "boost": 1
      }
    }
  }
}

We should use

{
  "query": {
    "nested": {
      "path": "items",
      "query": {
        "term": {
          "items.name": {
            "value": "banana",
            "boost": 1
          }
        }
      }
    }
  }
}

Here, the nested clause steps down into the nested items field. It no longer has access to fields in the root document, nor fields in any other nested document.

Work items

  • Fix the case in the issue [BUG] PPL filter on computed fields from nested paths returns zero results #4508: script's access to a single nested field.
  • Support querying deep nested objects: if a field of a nested object is another nested object, we should construct a nested query using the deepest level
  • Support nested query in other types of filter.
    • Term query
    • Range query
    • Multi-terms query
  • Support filters on both nested and root objects, where they can be separated. E.g. | where items.name = 'banana' and id = 2 can be separated to two filters: items.name = 'banana' stays under a nested query, while id = 2 stays on a normal term query, combined with a bool query. Already implemented.
  • Support filters on nested and root objects, where they can not be separated

Related Issues

Resolves #4508

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

nestedPaths));
}
if (!nestedPaths.isEmpty()) {
return nestedQuery(nestedPaths.get(0), scriptQuery, ScoreMode.None);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have supported Agg script and Sort script now, I think we need to identify whether it's a filter script before wrapping it with nested query. And please add a test on agg script with nested fields.

Copy link
Collaborator Author

@yuancu yuancu Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, they still need nested queries to access nested fields. As a result, the following queries does not work (author is a nested field)

  • script sort: source=opensearch-sql_test_index_cascaded_nested | eval lower_name = lower(author.name) | sort lower_name
  • field sort: source=opensearch-sql_test_index_cascaded_nested | sort author.name

Should I fix them in this PR or another one?

For agg script, I added a test case. Yet I doubt I still miss many cases? Originally I thought they were only handled in AggregateAnalyzer, so I could raise another PR for nested fields in aggregations.

@opensearch-trigger-bot
Copy link
Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

Copy link
Member

@LantaoJin LantaoJin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR doesn't count in the case that the filter is not pushed down.

source=t | head 10000 | where items.name = "xx"

Above query will return incorrect results.

This issue is similar to the #3696 which I am working on refactor to avoid the correctness issue in non-pushdown case.

@opensearch-trigger-bot
Copy link
Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

# Conflicts:
#	integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
#	opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
…scripts

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java`:
- Around line 1525-1532: Add a unit test in PredicateAnalyzerTest.java that
constructs a predicate/script expression which references multiple nested fields
under different hierarchies, invokes the PredicateAnalyzer analysis path that
triggers the nestedPaths logic in PredicateAnalyzer, and asserts that an
UnsupportedScriptException is thrown; also assert the exception message contains
the nestedPaths information (or substring like "Accessing multiple nested
fields") to validate the exact error scenario handled by the nestedPaths.size()
> 1 branch. Ensure the test uses the same helper/setup used by other
PredicateAnalyzer tests so it isolates the error case and runs fast.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 41567e2 and a8826cf.

📒 Files selected for processing (1)
  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java
🧰 Additional context used
📓 Path-based instructions (1)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration

Applied to files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java
🔇 Additional comments (6)
opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java (6)

48-99: LGTM on new imports.

The added imports are necessary for the nested query functionality and follow standard conventions.


1209-1218: LGTM on nested query wrapping in SimpleQueryExpression.

The logic correctly wraps the builder with a nested query only when a non-empty nested path exists, using appropriate null-safety checks.


1473-1474: LGTM on new fields in ScriptQueryExpression.

The fieldTypes and referredFields fields properly support nested path resolution for script queries.


1516-1537: LGTM on ScriptQueryExpression.builder() nested handling.

The implementation correctly:

  1. Identifies nested paths from referred fields
  2. Throws a clear exception when multiple nested hierarchies are accessed (unsupported)
  3. Wraps with a nested query when exactly one nested path exists
  4. Returns a plain script query otherwise

This aligns with OpenSearch's requirement that nested fields must be accessed via nested queries.


1612-1639: LGTM on NamedFieldExpression changes.

The changes properly:

  • Add Lombok annotations for cleaner code
  • Consistently resolve nestedPath using Utils.resolveNestedPath where field type information is available
  • Use null as the default for nestedPath (aligning with the commit message intent and Strings.isNullOrEmpty() checks)
  • Mark nestedPath as @Nullable for documentation

1498-1503: This concern is misdirected. If rowType were null, the NPE would occur at line 1490 (rowType.getFieldList()) before reaching lines 1498–1503. Additionally, in the Calcite framework, getRowType() is a fundamental contract method on RelNode that must never return null. No evidence shows that analyzeExpression receives null for the rowType parameter in any actual code path.

Likely an incorrect or invalid review comment.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +1525 to +1532
if (nestedPaths.size() > 1) {
throw new UnsupportedScriptException(
String.format(
Locale.ROOT,
"Accessing multiple nested fields under different hierarchies in script is not"
+ " supported: %s",
nestedPaths));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for existing tests covering the multi-nested-path error scenario
rg -n "multiple nested" --type java

echo "---"

rg -n "different hierarchies" --type java

echo "---"

# Search for test files related to PredicateAnalyzer
find . -type f -name "*PredicateAnalyzerTest*" -o -name "*PredicateAnalyzer*Test*"

echo "---"

# Search for UnsupportedScriptException usage in test files
rg -n "UnsupportedScriptException" --type java | grep -i test

Repository: opensearch-project/sql

Length of output: 1432


🏁 Script executed:

#!/bin/bash
# Check PredicateAnalyzerTest.java for nested path tests
wc -l ./opensearch/src/test/java/org/opensearch/sql/opensearch/request/PredicateAnalyzerTest.java

echo "---"

# Search within PredicateAnalyzerTest.java for nested-related tests
rg -n "nested|UnsupportedScriptException" ./opensearch/src/test/java/org/opensearch/sql/opensearch/request/PredicateAnalyzerTest.java

echo "---"

# Check CalciteWhereCommandIT.java around line 141 for the test
sed -n '130,155p' ./integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteWhereCommandIT.java

Repository: opensearch-project/sql

Length of output: 1144


🏁 Script executed:

#!/bin/bash
# Check what test methods exist in PredicateAnalyzerTest.java related to script/filter
rg -n "void test|void \w+\(" ./opensearch/src/test/java/org/opensearch/sql/opensearch/request/PredicateAnalyzerTest.java | head -30

echo "---"

# Search for any mentions of script or multiple nested in PredicateAnalyzerTest
rg -n "script|Script" ./opensearch/src/test/java/org/opensearch/sql/opensearch/request/PredicateAnalyzerTest.java | head -20

Repository: opensearch-project/sql

Length of output: 3501


Add unit test for multi-nested-path error scenario to PredicateAnalyzerTest.java.

The exception for accessing multiple nested fields under different hierarchies is currently validated only by integration tests. Add a dedicated unit test in PredicateAnalyzerTest.java to isolate and verify this error case, improving test execution speed and maintainability.

🤖 Prompt for AI Agents
In
`@opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java`
around lines 1525 - 1532, Add a unit test in PredicateAnalyzerTest.java that
constructs a predicate/script expression which references multiple nested fields
under different hierarchies, invokes the PredicateAnalyzer analysis path that
triggers the nestedPaths logic in PredicateAnalyzer, and asserts that an
UnsupportedScriptException is thrown; also assert the exception message contains
the nestedPaths information (or substring like "Accessing multiple nested
fields") to validate the exact error scenario handled by the nestedPaths.size()
> 1 branch. Ensure the test uses the same helper/setup used by other
PredicateAnalyzer tests so it isolates the error case and runs fast.

.filter(Predicate.not(Strings::isNullOrEmpty))
.distinct()
.collect(Collectors.toUnmodifiableList());
if (nestedPaths.size() > 1) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have test for this case?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a very corner case.

Copy link
Collaborator Author

@yuancu yuancu Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, testScriptFilterOnDifferentNestedHierarchyShouldThrow tests this case. It happens when users try to operate on two nested fields of different levels.

@qianheng-aws qianheng-aws merged commit ff82c67 into opensearch-project:main Jan 15, 2026
38 of 40 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 15, 2026
* lewSupport basic nested queries when applying filter on a nested filed

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

# Conflicts:
#	integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
#	opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java

* Test filter on both nested and root level fields

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Support nested queries for simple queries: range, term, terms, etc

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Add integration tests for filtering on nested

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Add a error case for accessing different levels of nested objects in scripts

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Add a test for accessing nested filter in filter in aggregation

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Chores: remove unnecessary comments

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Update plans

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Remove unused plans

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Use null instead of empty string as a default to nest path attribute

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

---------

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
(cherry picked from commit ff82c67)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@yuancu yuancu deleted the issues/4508 branch January 15, 2026 06:46
yuancu added a commit that referenced this pull request Jan 15, 2026
…ies (#5051)

* Push down filters on nested fields as nested queries (#4825)

* lewSupport basic nested queries when applying filter on a nested filed

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

# Conflicts:
#	integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
#	opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java

* Test filter on both nested and root level fields

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Support nested queries for simple queries: range, term, terms, etc

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Add integration tests for filtering on nested

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Add a error case for accessing different levels of nested objects in scripts

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Add a test for accessing nested filter in filter in aggregation

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Chores: remove unnecessary comments

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Update plans

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Remove unused plans

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Use null instead of empty string as a default to nest path attribute

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

---------

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
(cherry picked from commit ff82c67)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Downgrade to java 11

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

---------

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yuanchun Shen <yuanchu@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] PPL filter on computed fields from nested paths returns zero results

3 participants