Closed
Enhancement Description
- One-line enhancement description (can be used as a release note): Overhaul Kubernetes' metrics instrumentation to align Kubernetes' metrics with the Kubernetes Instrumentation Guidelines.
- Kubernetes Enhancement Proposal: https://github.com/kubernetes/enhancements/blob/master/keps/sig-instrumentation/20181106-kubernetes-metrics-overhaul.md
- Primary contact (assignee): @brancz
- Responsible SIGs: sig-instrumentation
- Enhancement target (which target equals to which milestone):
This is a cleanup, so there are no stability milestones involved. However, to avoid breaking users abruptly, SIG Instrumentation is making a best effort to communicate these changes as follows:
- Metrics that violated the Kubernetes instrumentation guidelines have been marked deprecated in Kubernetes 1.14, and where possible new metrics were added that did follow the guidelines. [DONE]
- Rename cadvisor metric labels to match instrumentation guidelines kubernetes#69099
- Change latency bucket size for API server metrics kubernetes#67476
- Change docker metrics to conform metrics guidelines kubernetes#72323
- Change kubelet metrics to conform metrics guidelines kubernetes#72470
- Fit RuntimeClass metrics to prometheus conventions kubernetes#73820
- Prevent apiserver's metrics from accidental registration. kubernetes#63924
- Expose kubelet health checks using new prometheus endpoint kubernetes#61369
- Change scheduler metrics to conform metrics guidelines kubernetes#72332
- Change proxy metrics to conform metrics guidelines kubernetes#72334
- Fix admission metrics in true units kubernetes#72343
- remove the deprecated admission metrics kubernetes#75279
- Use prometheus conventions for workqueue metrics kubernetes#71300
- convert latency/latencies in metrics name to duration kubernetes#74418
- Kubernetes 1.16 will remove the duplicate `pod_name` and `container_name` metric labels from cAdvisor metrics. For the 1.14 and 1.15 releases, all of `pod`, `pod_name`, `container`, and `container_name` were available as a grace period.
- Kubernetes 1.17 will remove the metrics that were marked as deprecated in 1.14. As a stretch goal, if the metrics stability framework is in place, then in Kubernetes 1.17 the metrics will only be turned off by default through the stability framework. Should this not be available, the metrics will be removed.
- Currently being discussed with each SIG whose metrics are being removed.
- Alpha release target 1.16
- Stability framework is in place with metric verification/validation running in CI.
- Metrics which are deprecated in the metrics overhaul are marked as deprecated, which can be overridden in a binary through a command line flag.
- No metrics can be marked as stable.
- Beta release target 1.17
- All previously marked deprecated metrics will be removed from the codebase.
- Metrics can be marked as stable.
- Stable release target 1.18
- First release cycle in which stable metrics may be deprecated as per the new stability guidelines.
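
For consumers of cAdvisor metrics, the label change described above (`pod_name` and `container_name` removed in 1.16, with both old and new names present in 1.14 and 1.15) may require adjusting any tooling that reads label sets. The following is a minimal sketch, not part of this enhancement: the `RENAMED_LABELS` mapping and the `migrate_labels` helper are hypothetical names introduced here for illustration, and the sample label values are made up.

```python
# Hypothetical mapping of the deprecated cAdvisor label names to the
# names that remain after Kubernetes 1.16 (illustration only).
RENAMED_LABELS = {"pod_name": "pod", "container_name": "container"}


def migrate_labels(labels):
    """Return a copy of `labels` that uses only the new label names."""
    migrated = {}
    for name, value in labels.items():
        new_name = RENAMED_LABELS.get(name, name)
        # During the 1.14/1.15 grace period both names are present;
        # drop the deprecated one and keep the value under the new name.
        if name in RENAMED_LABELS and new_name in labels:
            continue
        migrated[new_name] = value
    return migrated


# A label set as it might look during the grace period (made-up values).
grace_period_sample = {
    "pod": "web-0",
    "pod_name": "web-0",
    "container": "app",
    "container_name": "app",
    "namespace": "default",
}

# A label set as it might look before 1.14 (made-up values).
legacy_sample = {"pod_name": "web-0", "container_name": "app"}

print(migrate_labels(grace_period_sample))
print(migrate_labels(legacy_sample))
```

Normalizing to the new names on the consumer side means the same code handles samples from before, during, and after the grace period. Dashboards and alerting rules that reference `pod_name` or `container_name` directly would still need to be updated separately.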