Generic configuration of Prometheus plugin and OpenTSDB plugin
monitor_kubernetes_pods is enabled in the Prometheus plugin
Expected behavior:
Metrics are retrieved successfully from every pod that carries Prometheus annotations. When I add or delete a pod, Telegraf picks up on the change and starts or stops retrieving metrics from that pod.
Actual behavior:
Metrics are retrieved successfully from all pods that exist at the moment I deploy Telegraf. If I add a pod afterwards, Telegraf doesn't detect it. If I delete a pod, Telegraf starts showing errors because it keeps trying to connect to the pod's IP, which no longer exists.
I found that part of the problem was OpenShift roles, so I created a service account with the watch verb. Telegraf now detects both the deletion and the creation of pods, but it still doesn't scrape the added pods. Looking at the plugin, it seems to read the pod's IP, find that the string is empty, and therefore fail to scrape metrics.
I modified the Prometheus plugin to handle k8s.EventModified: the pod is now registered again after a modification event, which covers the case where the IP is assigned only after the pod has been added. I suggest that you do not leave the k8s.EventModified case empty, and that you register the pod again whenever something significant changes (like the IP :) ).
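To make the suggestion concrete, here is a rough sketch of the idea. It is not the plugin's actual code: `Pod`, `Registry`, `HandleEvent`, the event constants, and the scrape port are all hypothetical stand-ins I made up for illustration, and the real plugin uses the Kubernetes client's own types.

```go
package main

import "fmt"

// EventType is a hypothetical stand-in for the watch event kinds
// reported by a Kubernetes client.
type EventType string

const (
	EventAdded    EventType = "ADDED"
	EventModified EventType = "MODIFIED"
	EventDeleted  EventType = "DELETED"
)

// Pod holds only the fields this sketch needs (hypothetical type).
type Pod struct {
	Name string
	IP   string
}

// scrapePort is an illustrative value, not taken from the issue.
const scrapePort = 9102

// Registry maps pod names to the URLs that should be scraped.
type Registry struct {
	Targets map[string]string
}

func NewRegistry() *Registry {
	return &Registry{Targets: make(map[string]string)}
}

// register adds or refreshes a pod's scrape URL. A pod that has no IP
// yet is skipped; a later Modified event can register it.
func (r *Registry) register(p Pod) {
	if p.IP == "" {
		return
	}
	r.Targets[p.Name] = fmt.Sprintf("http://%s:%d/metrics", p.IP, scrapePort)
}

// HandleEvent treats Modified like Added, so a pod whose IP is assigned
// after creation is still picked up. Deleted removes the target.
func (r *Registry) HandleEvent(t EventType, p Pod) {
	switch t {
	case EventAdded, EventModified:
		r.register(p)
	case EventDeleted:
		delete(r.Targets, p.Name)
	}
}

func main() {
	r := NewRegistry()
	r.HandleEvent(EventAdded, Pod{Name: "web-1"})                    // no IP yet, skipped
	r.HandleEvent(EventModified, Pod{Name: "web-1", IP: "10.0.0.5"}) // IP assigned, registered
	fmt.Println(r.Targets["web-1"])
	r.HandleEvent(EventDeleted, Pod{Name: "web-1"})
	fmt.Println(len(r.Targets))
}
```

Registering on both Added and Modified is idempotent here because the map is keyed by pod name, so repeated Modified events just overwrite the same entry.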
Relevant telegraf.conf:
Input plugin: Prometheus
Output plugin: OpenTSDB
System info:
Telegraf version: 1.9.1
Platform: OpenShift 3.7
Steps to reproduce: