-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add HighCardinalityTagsDetector #3047
Add HighCardinalityTagsDetector #3047
Conversation
Just a thought/question I had while looking at your draft: There is the
Another random thought: Should we give the users the ability to implement a custom strategy? The micrometer default could be logging (like in your draft) but some users may want to be more strict and e.g. throw a runtime exception? Thoughts @jonatan-ivanov ? |
@wengertj Yes, I think it can be used for this purposes. |
Fair points 👍 Related to that: I think that it would be nice if we had an easy to discover API to enable or disable the detector (in addition to documentation). Maybe something along the lines of: Metrics.globalRegistry.withDefaultHighCardinalityTagsDetector() EDIT: You already mentioned that yourself "We might want this to be configurable from the If we stick to the polling solution we might consider implementing |
👍🏼 for |
I did some measurements with a Spring Boot based web service (resourceater) that creates meters with high cardinality tags. EnvironmentJava 18, macOS (x86_64)
CodeYou can see how meters are created in MeasurementsThe scenario was alway the same for all of the measurements:
So in the heap graphs you see some baseline, then a drop: first GC to get the real baseline, then a slope: creating meters, then another drop (second GC to see the real utilization). The amount of heap used by the meters is basically the difference between the baseline after the drops. I did not capture a heap dump or checked the exact numbers, I only wanted to know the approximate usage. 1 million CountersAs you can see creating 1 million Counters occupied a little less than 300MiB from heap. 1 million TimersAs you can see creating 1 million Timers occupied approximately 550MiB from the heap. 500k Counters plus 500k TimersAs you can see creating 500k Counters and 500k Timers occupied approximately 400MiB from heap. 2 million Timers
1.5 million TimersAs you can see creating 1.5 million Timers occupied approximately 800MiB from the heap. I think based on these measurements, setting the default threshold to 1M for the |
micrometer-core/src/main/java/io/micrometer/core/instrument/HighCardinalityTagsDetector.java
Show resolved
Hide resolved
I'm thinking if this should be opt-in or opt-out. Right now it is opt-in. |
micrometer-core/src/main/java/io/micrometer/core/instrument/HighCardinalityTagsDetector.java
Outdated
Show resolved
Hide resolved
Since long ago, I've wanted to have some warning if memory used by meters exceeds a configurable % of heap memory. We could do something like that here, but maybe it doesn't make as much sense here and we should consider that separately. It just seems like we would want to warn about high cardinality before an OOM happens. The problem with an absolute threshold is that for low memory applications it will be too high. For large, monolithic applications with few instances, maybe it will be too low. It is configurable, so it probably isn't a problem. I'm just thinking how we can have a better default. |
micrometer-core/src/main/java/io/micrometer/core/instrument/HighCardinalityTagsDetector.java
Outdated
Show resolved
Hide resolved
micrometer-core/src/main/java/io/micrometer/core/instrument/HighCardinalityTagsDetector.java
Outdated
Show resolved
Hide resolved
micrometer-core/src/main/java/io/micrometer/core/instrument/HighCardinalityTagsDetector.java
Outdated
Show resolved
Hide resolved
I think we can prepare this for that use-case too, e.g.: renaming it to something more general:
I'm going to check if I can figure out some usable heuristics and make the default a function of the max heap. |
I added a simple heuristics for the threshold based on the max heap. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. Just one comment about the default threshold calculation based on the heap.
Draft solution for detecting high cardinality tags: