-
Couldn't load subscription status.
- Fork 2.3k
[Pull-based Ingestion] Add Kafka offset based consumer lag in pull-based ingestion #19635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4382598 to
11aedfc
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #19635 +/- ##
============================================
+ Coverage 73.08% 73.10% +0.01%
+ Complexity 70746 70737 -9
============================================
Files 5725 5725
Lines 323793 323845 +52
Branches 46882 46888 +6
============================================
+ Hits 236659 236733 +74
+ Misses 68083 67988 -95
- Partials 19051 19124 +73 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
36a5c9f to
0bbd545
Compare
plugins/ingestion-kafka/src/main/java/org/opensearch/plugin/kafka/KafkaPartitionConsumer.java
Outdated
Show resolved
Hide resolved
|
❌ Gradle check result for 05e29a6: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 05e29a6: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
05e29a6 to
f41b4f9
Compare
|
❌ Gradle check result for f41b4f9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for f41b4f9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
plugins/ingestion-kinesis/src/main/java/org/opensearch/plugin/kinesis/KinesisShardConsumer.java
Show resolved
Hide resolved
server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Varun Bharadwaj <varunbharadwaj1995@gmail.com>
Signed-off-by: Varun Bharadwaj <varunbharadwaj1995@gmail.com>
Signed-off-by: Varun Bharadwaj <varunbharadwaj1995@gmail.com>
5498f7f to
cc42f52
Compare
|
❌ Gradle check result for cc42f52: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Varun Bharadwaj <varunbharadwaj1995@gmail.com>
cc42f52 to
3180493
Compare
…sed ingestion (opensearch-project#19635) * include pointer based lag metric for pull-based ingestion * set -1 kafka lag if errors occur during offset lag computation * add setting for pointer based lag interval * update the lag interval setting to timeValue type --------- Signed-off-by: Varun Bharadwaj <varunbharadwaj1995@gmail.com>
…sed ingestion (opensearch-project#19635) * include pointer based lag metric for pull-based ingestion * set -1 kafka lag if errors occur during offset lag computation * add setting for pointer based lag interval * update the lag interval setting to timeValue type --------- Signed-off-by: Varun Bharadwaj <varunbharadwaj1995@gmail.com>
Description
Pull-based ingestion today only has stats for time based lag. This is computed based on incoming message timestamp. The time based lag will be 0 if ingestion is paused or there is a write block. This PR adds offset (pointer) based lag which gives back real lag metric irrespective of ingestion status. Note that this metric is only supported for Kafka. Kinesis implementation does not emit this metric.
The offset based lag looks up the latest offset available in the respective Kafka partition and computes lag as the difference between the Kafka end offset and last consumed offset. This is computed periodically once every 10 seconds by default, but is configurable. Setting the interval to 0 disables this metric.
For example, if we have a large batch of messages written in a short window, the offset based lag gives a better view of ingestion catching up. The following example below shows the lag when ingestion is slow.
Related Issues
Follow up for #19607. This metric should help detect growing lag when ingestion is paused.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.