✨ Adopt native histograms #3165
Conversation

Welcome @krisztianfekete!
Hi @krisztianfekete. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
```go
		1.25, 1.5, 1.75, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60},
	NativeHistogramBucketFactor:     1.1,
	NativeHistogramMaxBucketNumber:  100,
	NativeHistogramMinResetDuration: 1 * time.Hour,
```
If we do this, can we please do it for all histograms and not just one? And what was the reasoning behind choosing that NativeHistogramBucketFactor and NativeHistogramMaxBucketNumber?
Sure, let me add it for the workqueue metrics too!
The values I've set are sensible defaults that most projects start with (us included).
Please see the following references for the reasoning:
- `NativeHistogramBucketFactor`: https://github.com/prometheus/client_golang/blob/331dfab0cc853dca0242a0d96a80184087a80c1d/prometheus/histogram.go#L405-L429
- `NativeHistogramMaxBucketNumber`: I've usually seen this set to 100, unless it is desired to mimic OTel's Exponential Histograms more closely, in which case it can also be 160.
- `NativeHistogramMinResetDuration`: "If at least NativeHistogramMinResetDuration has passed since the last reset of the histogram (which includes the creation of the histogram), the whole histogram is reset, i.e. all buckets are deleted and the sum and count of observations as well as the zero bucket are set to zero. Prometheus handles this as a normal counter reset, which means that some observations will be lost between scrapes, so resetting should happen rarely compared to the scraping interval. Additionally, frequent counter resets might lead to less efficient storage in the TSDB (see the TSDB section for details). A NativeHistogramMinResetDuration of one hour is a value that should work well in most situations." Source: https://prometheus.io/docs/specs/native_histograms/#limiting-the-bucket-count
There's one more:
Hah, not sure how I could miss that. Thanks, just pushed a commit!
LGTM label has been added. Git tree hash: 980ecec5bd1e7ae43a30399ea1e0038b9037ab8b

/ok-to-test

I'll take a look

Thanks, @sbueringer, looking forward to the review!
sbueringer left a comment
Took a closer look and tested it with Prometheus / Grafana / Cluster API. Looks nice!
One small finding:
```go
		1.25, 1.5, 1.75, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60},
	NativeHistogramBucketFactor:     1.1,
	NativeHistogramMaxBucketNumber:  100,
	NativeHistogramMinResetDuration: 1 * time.Hour,
```
There's one more:
Thank you! /lgtm

LGTM label has been added. Git tree hash: a6e655220d13e99e4fbfa57657f9d60347d788fd
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, krisztianfekete, sbueringer

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing
* adopt native histograms
* add native histograms to workqueue metrics too
* adopt native histograms for admission histogram
Adopts Native Histograms in an opt-in and backward-compatible manner to overcome the limitations of traditional histograms.
This solves #3164