Use constant value supplier to measure ingestion delay metrics #12957
base: master
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #12957      +/-   ##
============================================
+ Coverage     61.75%   62.16%   +0.41%
+ Complexity      207      198       -9
============================================
  Files          2436     2502      +66
  Lines        133233   136468    +3235
  Branches      20636    21120     +484
============================================
+ Hits          82274    84840    +2566
- Misses        44911    45344     +433
- Partials       6048     6284     +236
I don't think this is the correct fix. We rely on the callback to track the ingestion delay.
I think the real issue is that this metric is not removed after the consumer is destroyed.
There are two issues:
IMO, we need to fix both issues.
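To make the "remove the metric when the consumer is destroyed" suggestion concrete, here is a minimal sketch. It is not Pinot's code: the class `PartitionGaugeRegistrySketch` and its methods are hypothetical names, and a plain map stands in for whatever metrics registry the server actually uses.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical stand-in for a per-partition gauge registry (not a Pinot class).
final class PartitionGaugeRegistrySketch {
  private final Map<Integer, LongSupplier> gauges = new ConcurrentHashMap<>();

  // Register the delay gauge when a consumer for the partition starts.
  void onConsumerStarted(int partitionId, LongSupplier delaySupplier) {
    gauges.put(partitionId, delaySupplier);
  }

  // Drop the gauge when the consumer is destroyed, so nothing keeps reporting for it.
  void onConsumerDestroyed(int partitionId) {
    gauges.remove(partitionId);
  }

  // Read the gauge; empty means no metric is currently published for the partition.
  OptionalLong read(int partitionId) {
    LongSupplier supplier = gauges.get(partitionId);
    return supplier == null ? OptionalLong.empty() : OptionalLong.of(supplier.getAsLong());
  }
}
```

With this shape, a partition whose consumer has gone away reports nothing at all rather than a stale or growing value.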
It is under the |
Problem
In multiple Pinot deployments, we have seen the REALTIME_INGESTION_DELAY_MS and END_TO_END_REALTIME_INGESTION_DELAY_MS metrics grow monotonically even though there were no active Kafka consumers.
How did I figure out that there are no active Kafka consumers?
The consumingSegmentsInfo API returned the following response:
The above response shows that consumers are not connected. I also checked the thread dump, and there are no Kafka consumer threads.
Image showing REALTIME_INGESTION_DELAY_MS grow monotonically
Zoomed out image showing REALTIME_INGESTION_DELAY_MS grow monotonically every minute
Solution
Changed the supplier to return a constant value when publishing the REALTIME_INGESTION_DELAY_MS and END_TO_END_REALTIME_INGESTION_DELAY_MS metrics.
Ingestion delay metrics are published for every batch of messages fetched from Kafka (HERE), so they will continue to update while Kafka consumers are active.
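The difference between the two supplier styles can be sketched as follows. This is an illustration only, not the code in this PR: the class and method names are hypothetical, and it assumes the previous supplier recomputed the delay against the current clock on every gauge read, which would explain the monotonic growth when no batches arrive.

```java
import java.util.function.LongSupplier;

// Hypothetical illustration (not Pinot code) of a clock-based vs. constant gauge supplier.
final class IngestionDelayGaugeSketch {
  // Delay is recomputed on every read, so it grows with the clock if no new batch arrives.
  static LongSupplier clockBasedDelay(long lastIngestionTimeMs, LongSupplier clock) {
    return () -> clock.getAsLong() - lastIngestionTimeMs;
  }

  // Delay is computed once, when the batch is processed; later reads return the same value.
  static LongSupplier constantDelay(long lastIngestionTimeMs, long nowMs) {
    final long delayMs = nowMs - lastIngestionTimeMs;
    return () -> delayMs;
  }

  public static void main(String[] args) {
    long[] now = {1_000L};               // fake clock so the example is deterministic
    LongSupplier clock = () -> now[0];
    long lastIngestionTimeMs = 900L;

    LongSupplier live = clockBasedDelay(lastIngestionTimeMs, clock);
    LongSupplier fixed = constantDelay(lastIngestionTimeMs, clock.getAsLong());
    System.out.println(live.getAsLong() + " " + fixed.getAsLong());   // prints "100 100"

    now[0] += 60_000L;                   // one minute passes with no new batch
    System.out.println(live.getAsLong() + " " + fixed.getAsLong());   // prints "60100 100"
  }
}
```

The constant supplier only changes when a new batch republishes the metric, so an idle or disconnected consumer leaves the last reported value in place instead of letting it climb.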
bugfix