MINOR - Aggregation incident reindex fix #25741

Open
TeddyCr wants to merge 3 commits into main from FIX-Aggregation-incident-reindex

TeddyCr (Collaborator) commented Feb 6, 2026

MINOR - Aggregation incident reindex fix

Describe your changes:

Bug Fixes

  • OpenSearch/Elasticsearch aggregation builders not supporting sub-aggregations natively
  • Nested aggregations incorrectly consuming dimensions during traversal
  • NPE during test case resolution reindex
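
For readers unfamiliar with the first item, here is a minimal sketch of what attaching sub-aggregations natively can look like. It assumes a hypothetical `Agg` class, not the real OpenMetadata builder interfaces, which this page does not show. Only the method name `supportsSubAggregationsNatively()` comes from this PR. The aggregation names, field names, and the choice to throw when a builder cannot hold children are illustrative assumptions. The output shape follows the Elasticsearch/OpenSearch query DSL, where children sit under an `aggs` key next to the aggregation type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SubAggregationSketch {
  /** Hypothetical stand-in for an aggregation builder (not the OpenMetadata class). */
  static final class Agg {
    final String name;
    final String type;
    final Map<String, ?> body;
    final boolean nativeSubAggs;
    final Map<String, Agg> children = new LinkedHashMap<>();

    Agg(String name, String type, Map<String, ?> body, boolean nativeSubAggs) {
      this.name = name;
      this.type = type;
      this.body = body;
      this.nativeSubAggs = nativeSubAggs;
    }

    /** Method name taken from the PR; the flag-based implementation is an assumption. */
    boolean supportsSubAggregationsNatively() {
      return nativeSubAggs;
    }

    /** Attaches a child directly; failing otherwise is this sketch's choice only. */
    Agg subAggregation(Agg child) {
      if (!supportsSubAggregationsNatively()) {
        throw new IllegalStateException(type + " cannot hold sub-aggregations in this sketch");
      }
      children.put(child.name, child);
      return this;
    }

    /** Renders the DSL fragment: {type: body, "aggs": {childName: childDsl}}. */
    Map<String, Object> toDsl() {
      Map<String, Object> node = new LinkedHashMap<>();
      node.put(type, body);
      if (!children.isEmpty()) {
        Map<String, Object> aggs = new LinkedHashMap<>();
        for (Agg child : children.values()) {
          aggs.put(child.name, child.toDsl());
        }
        node.put("aggs", aggs);
      }
      return node;
    }
  }

  public static void main(String[] args) {
    // Metric aggregations such as max cannot have children, so the flag is false.
    Agg latest = new Agg("latest", "max", Map.of("field", "timestamp"), false);
    Agg byDay =
        new Agg(
                "by_day",
                "date_histogram",
                Map.of("field", "timestamp", "calendar_interval", "day"),
                true)
            .subAggregation(latest);
    Agg nested =
        new Agg("incidents", "nested", Map.of("path", "incidents"), true).subAggregation(byDay);
    System.out.println(Map.of(nested.name, nested.toDsl()));
  }
}
```

The point of the flag is that the caller assembling the tree can ask each builder whether it accepts children before attaching them, instead of assuming every aggregation type does.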

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Bug fix

  • I have added a test that covers the exact scenario we are fixing. For complex issues, comment the issue number in the test for future reference.

Summary by Gitar

  • Aggregation traversal fix:
    • Modified SearchIndexUtils.traverseAggregationResults() to preserve dimensions through nested aggregations (structural wrappers don't produce bucket keys)
  • Native sub-aggregation support:
    • Implemented supportsSubAggregationsNatively() in ElasticNestedAggregations, OpenNestedAggregations, and OpenDateHistogramAggregations to attach sub-aggregations directly
  • New test coverage:
    • Added 8 integration tests in OpenSearchAggregationManagerIntegrationTest.java validating nested aggregations with date histograms and metrics
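
The traversal fix described above can be illustrated with a small, self-contained sketch. This is not the code of `SearchIndexUtils.traverseAggregationResults()`. The `Node` type, the `flatten` method, and the dimension names are hypothetical and exist only to show the idea: a bucket level consumes the next dimension name, while a structural wrapper such as a nested aggregation has no bucket key and leaves the dimension index unchanged.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NestedTraversalSketch {
  /** Hypothetical result node: buckets carry a key, wrappers do not, leaves carry a metric. */
  static final class Node {
    final String bucketKey;
    final String metricName;
    final Double metricValue;
    final List<Node> children = new ArrayList<>();

    private Node(String bucketKey, String metricName, Double metricValue) {
      this.bucketKey = bucketKey;
      this.metricName = metricName;
      this.metricValue = metricValue;
    }

    static Node wrapper() {
      return new Node(null, null, null);
    }

    static Node bucket(String key) {
      return new Node(key, null, null);
    }

    static Node metric(String name, double value) {
      return new Node(null, name, value);
    }

    Node add(Node child) {
      children.add(child);
      return this;
    }
  }

  /** Flattens the tree into one row per leaf metric, keyed by dimension name. */
  static List<Map<String, Object>> flatten(Node root, List<String> dimensions) {
    List<Map<String, Object>> rows = new ArrayList<>();
    walk(root, dimensions, 0, new LinkedHashMap<>(), rows);
    return rows;
  }

  private static void walk(
      Node node,
      List<String> dims,
      int dimIndex,
      Map<String, Object> row,
      List<Map<String, Object>> rows) {
    if (node.metricName != null) {
      // Leaf metric: emit a row with the dimensions collected so far.
      Map<String, Object> out = new LinkedHashMap<>(row);
      out.put(node.metricName, node.metricValue);
      rows.add(out);
      return;
    }
    Map<String, Object> current = row;
    int next = dimIndex;
    if (node.bucketKey != null) {
      // Only real buckets consume a dimension.
      if (dimIndex >= dims.size()) {
        throw new IllegalStateException("more bucket levels than dimensions");
      }
      current = new LinkedHashMap<>(row);
      current.put(dims.get(dimIndex), node.bucketKey);
      next = dimIndex + 1;
    }
    // Wrappers fall through with dimIndex unchanged.
    for (Node child : node.children) {
      walk(child, dims, next, current, rows);
    }
  }

  public static void main(String[] args) {
    // A wrapper sits between two bucket levels, as a nested aggregation would.
    Node root =
        Node.wrapper()
            .add(
                Node.bucket("tc1")
                    .add(
                        Node.wrapper()
                            .add(Node.bucket("2026-02-06").add(Node.metric("count", 3.0)))));
    System.out.println(flatten(root, List.of("testCase", "day")));
  }
}
```

If the wrapper also advanced `dimIndex`, the inner bucket in this example would find no dimension left to consume, which is the kind of misalignment the summary describes.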


TeddyCr requested a review from a team as a code owner February 6, 2026 23:26
TeddyCr added the "To release" label (Will cherry-pick this PR into the release branch) Feb 6, 2026
github-actions bot added the "Ingestion" and "safe to test" (Add this label to run secure Github workflows on PRs) labels Feb 6, 2026
harshach previously approved these changes Feb 6, 2026

gitar-bot bot commented Feb 7, 2026

🔍 CI failure analysis for 6c28f0b: All CI failures are unrelated to this PR's Java aggregation changes: Maven tests fail in glossary workflow and AWS areas, Playwright tests have 31+ flaky tests, Python tests have Trino errors, and integration tests have disk space issues.

Issue

New Failures (Maven SonarCloud CI & Test Report):

  • maven-sonarcloud-ci (job 62815959793, run 21770231853)
  • Test Report (job 62825340778, run 21770230944) - reporting job

Previous Failures:

  • Maven PostgreSQL CI: AWS and Role test failures
  • Playwright E2E tests: 31+ flaky tests across 3 shards
  • Python tests (3.10 & 3.11): Trino connectivity errors
  • Integration tests: Disk space exhaustion

Root Cause

Maven SonarCloud CI Failures: Glossary Workflow and AWS Test Issues

Maven SonarCloud CI run:

  • 4 test failures total: 1 failure + 3 errors
  • Tests run: 7,918 | Failures: 1 | Errors: 3 | Skipped: 701

Specific failures:

  1. GlossaryTermResourceTest - test_GlossaryTermWorkflow_TermHasReviewers_UpdatedByNotReviewer_WorkflowTriggered
    • Expected: <Approved> but was: <In Review>
    • Glossary term approval workflow validation failure
    • Related ERROR logs: "Failed to publish event to GovernanceWorkflowChangeEvent due to Could not handle signal: process definition is suspended"

  2-4. AwsCredentialsUtilTest (3 errors), same as maven-postgresql-ci:
    • testBuildCredentialsProviderWithEmptyCredentials
    • testBuildCredentialsProviderWithNoCredentials
    • testBuildCredentialsProviderWithOnlyAccessKey
    • Error: IllegalArgumentException: AWS credentials not configured

These failures are unrelated to this PR because:

  1. This PR only modifies Java backend aggregation code (5 files):
    • SearchIndexUtils.java (search aggregation traversal logic)
    • 3 aggregation builder classes (nested/date_histogram native support)
    • 1 test file (OpenSearchAggregationManagerIntegrationTest.java)
  2. Zero changes to:
    • Glossary resource code or glossary term management
    • Governance workflow engine or workflow event consumers
    • AWS credentials utilities
    • Activity feed publishing
    • Any glossary-related code paths
  3. Failures are in completely separate functional areas - Glossary governance workflows and AWS credential configuration have no relation to search aggregation logic
  4. Workflow issues suggest environment problems - "process definition is suspended" indicates workflow engine state issues

Maven PostgreSQL CI Failures (Previously Analyzed)

  • 6 test failures: Role management + AWS credentials
  • Same AWS credential errors as SonarCloud CI
  • Unrelated to PR: No Role or AWS code modified

Playwright E2E Test Failures (Previously Analyzed)

  • 31+ flaky tests across 3 shards
  • Multiple unrelated UI features failing
  • Unrelated to PR: No UI/frontend code modified

Python Test Failures (Previously Analyzed)

  • Both Python 3.10 and 3.11: Identical 7 Trino connectivity errors
  • Unrelated to PR: No Python code modified

Integration Test Failures (Previously Analyzed)

  • Persistent disk space exhaustion
  • Unrelated to PR: No infrastructure changes

Details

PR Changes (Backend Aggregation Only):

  1. Fixed dimension preservation in nested aggregation traversal (SearchIndexUtils.java)
  2. Added native sub-aggregation support for nested/date_histogram aggregation types
  3. Added 8 Java integration tests validating backend aggregation fixes

All failures are in unrelated functional areas or environmental issues:

  • Maven tests: Glossary workflow, Role management, AWS credentials (no related code in PR)
  • Playwright tests: Severe E2E test infrastructure instability (no UI code in PR)
  • Python tests: Trino connectivity failures (no Python code in PR)
  • Integration tests: Persistent disk space exhaustion

Conclusion: The Maven test failures span multiple unrelated areas (Glossary governance workflows, Role management, AWS utilities) that are completely separate from this PR's search aggregation logic changes. The glossary workflow failure appears to be an environmental/workflow engine state issue ("process definition is suspended"). Combined with all other environmental failures, all CI failures remain unrelated to this PR's focused Java backend aggregation improvements.

Code Review ✅ Approved

Clean, well-tested bug fix that correctly addresses three issues: nested aggregation dimension consumption, unconditional leaf metric handling, and missing native sub-aggregation support. The 8 new tests provide good coverage of the fixed scenarios.

