Add caching for the topics metadata #75
Conversation
Sorry for the big PR. I upgraded sarama to v1.19.0 because the client metadata was being cleared.
@jorgelbg do you think this can fix the problem of gaps in the data series? I am now testing kafka-exporter in our staging environment and it is happening all over again. What I have found is that these gaps occur in similar timeframes to when the metadata gets refreshed (note that these I/O timeouts also happen occasionally):
For me this is a major issue that prevents me from using this exporter in our alerting chain. I've posted similar comments but have seen no activity for a long time.
Looks good to me. Sorry for the delayed response.
@danielqsj don't worry, thanks for taking the time! @bojleros this could help (I think) because it will reduce the additional time that refreshing the metadata takes, especially if you have a big cluster. Increasing the scrape interval in Prometheus could also help. Reading the lag/offset data and sending the response back can also be time-consuming if you have a lot of topics.
I am going to start my tests tomorrow morning. Our clusters are rather small, with tightly monitored performance. I am sure we have enough resources and are not hitting the fd limit. We also run our own tool (https://github.com/msales/kage) next to the kafka_exporters; these gaps are what held us up before using kafka_exporter in production. I am not a kage developer, but maybe you would be interested in comparing the code of both projects in search of possible improvements. Apart from that, I saw that kafka_exporter produces new values every minute. Is it feasible to have it refresh values every 15 seconds in order to fit our default scrape interval? Kind regards,
This implements the feature mentioned in #74: adding a configurable caching interval for the topics metadata. Only the metadata is cached; the offset/lag is requested/calculated every time, so this should not affect the metrics.
Depending on the configured interval (30s by default), it could be the case that a new topic is added and is not picked up immediately.