@varunbharadwaj varunbharadwaj commented Oct 25, 2025

Description

This PR refactors the pull-based indexing flow to support message mappers. A default message mapper is created to retain the current behavior. In addition, a raw payload mapper is added to support ingesting from any given streaming source.

In raw payload mode, the Kafka offset or Kinesis sequence number is used as the document ID. This ensures that duplicate documents are not created on rewind or replay. Document versioning is not supported, and only an eventually consistent view of documents can be expected during message replays, because an older message can overwrite a newer one until the lag is caught up. This is an append-only indexing mode.
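For reviewers, here is a minimal sketch of the idea behind raw payload mode. It is not the code in this PR: `MessageMapper`, `RawPayloadMapper`, `MappedDocument`, and `ingest` are hypothetical names chosen for illustration, the stream position is passed as a plain string, and a `Map` stands in for the index. It only shows why using the offset or sequence number as the document ID keeps a replay from creating duplicates.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class RawPayloadMapperSketch {

    /** Hypothetical result of mapping one stream message to an index request. */
    public static final class MappedDocument {
        public final String id;
        public final String source;

        public MappedDocument(String id, String source) {
            this.id = id;
            this.source = source;
        }
    }

    /** Hypothetical mapper contract; the interface in the PR may look different. */
    public interface MessageMapper {
        MappedDocument map(String pointer, byte[] payload);
    }

    /** Raw payload mode: the stream position (offset or sequence number) becomes the document ID. */
    public static final class RawPayloadMapper implements MessageMapper {
        @Override
        public MappedDocument map(String pointer, byte[] payload) {
            return new MappedDocument(pointer, new String(payload, StandardCharsets.UTF_8));
        }
    }

    /** Applies messages (pointer -> payload) to a toy in-memory index keyed by document ID. */
    public static void ingest(Map<String, String> index, MessageMapper mapper, Map<String, String> messages) {
        for (Map.Entry<String, String> message : messages.entrySet()) {
            byte[] payload = message.getValue().getBytes(StandardCharsets.UTF_8);
            MappedDocument doc = mapper.map(message.getKey(), payload);
            index.put(doc.id, doc.source); // the same ID overwrites instead of duplicating
        }
    }

    public static void main(String[] args) {
        Map<String, String> messages = new LinkedHashMap<>();
        messages.put("0", "{\"service\":\"checkout\"}");
        messages.put("1", "{\"service\":\"cart\"}");

        Map<String, String> index = new LinkedHashMap<>();
        MessageMapper mapper = new RawPayloadMapper();
        ingest(index, mapper, messages); // first pass
        ingest(index, mapper, messages); // replay after a rewind

        System.out.println(index.size()); // prints 2
    }
}
```

Because the ID is derived only from the stream position, the sketch carries no version information, which matches the note above that document versioning is not supported in this mode.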

This model should allow the flexibility to support other formats in the future, when needed.

Related Issues

Resolves #19548

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added the enhancement (Enhancement or improvement to existing feature or request) and Indexing (Indexing, Bulk Indexing and anything related to indexing) labels Oct 25, 2025
@varunbharadwaj varunbharadwaj force-pushed the vb/mappersupport branch 2 times, most recently from 3fab3ae to 72773f3 October 25, 2025 05:10
@varunbharadwaj varunbharadwaj changed the title from "[Pull-based Ingestion] Support message mappers to support different input formats" to "[Pull-based Ingestion] Support message mappers to support different input formats and raw payloads" Oct 25, 2025
@github-actions

❌ Gradle check result for 72773f3: TIMEOUT

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions

❌ Gradle check result for 72773f3: FAILURE


@github-actions

✅ Gradle check result for 85dae2e: SUCCESS

Signed-off-by: Varun Bharadwaj <varunbharadwaj1995@gmail.com>
@github-actions

❌ Gradle check result for e9e9f3f: FAILURE


@github-actions

✅ Gradle check result for e9e9f3f: SUCCESS


codecov bot commented Oct 27, 2025

Codecov Report

❌ Patch coverage is 92.72727% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.10%. Comparing base (753c135) to head (e9e9f3f).
⚠️ Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...g/opensearch/cluster/metadata/IngestionSource.java | 75.00% | 0 Missing and 2 partials ⚠️ |
| .../pollingingest/mappers/IngestionMessageMapper.java | 88.23% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #19765      +/-   ##
============================================
- Coverage     73.10%   73.10%   -0.01%     
+ Complexity    70959    70932      -27     
============================================
  Files          5737     5740       +3     
  Lines        324766   324816      +50     
  Branches      46981    46986       +5     
============================================
+ Hits         237425   237460      +35     
- Misses        68226    68245      +19     
+ Partials      19115    19111       -4     



Development

Successfully merging this pull request may close these issues.

[Feature Request] Make pull-based ingestion work with OTel collector out of the box
