-
Notifications
You must be signed in to change notification settings - Fork 40.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Actuator is using CPU while app is doing nothing #11333
Comments
@dgsqrs What version of Spring Boot are you using? Can you share a small sample that shows the issue? |
Hi @philwebb . |
I can reproduce this. After the first request, the CPU seems to be busy: This behaviour seems to be related with the new metrics support, since excluding the autoconfiguration with @jkschneider is this related to the GC pause detection or some specific instrumentation? Is the cost constant even when the application is busy or is this just something to keep the VM busy on purpose? |
Yes, I believe so (and other system-level pauses unrelated to GC).
Yes. I'm not sure if it's possible to improve this mechanism to use no CPU. We could offer a config option to tradeoff monitoring accuracy for CPU consumption, though it's unclear to me what the default should be. More research is in order. Perhaps there is a way to accomplish accurate pause detection and indiscernible CPU consumption. |
My initial feeling is that we should set the default to not consume CPU. I expect that GC pause detection is probably quite an advanced problem for most users and by the time that they need it they'll be quite a long way into their journey with metrics. Constant CPU usage, however, is probably going to get noticed. tldr: I think this should be configurable and by default off. In the docs we should call it out. |
I don't necessarily disagree with the default if the mechanism can't be improved, but this kind of reasoning really troubles me:
Unless we compensate for pauses, they are basically looking at garbage. This lures folks into a false sense of security, the "blindfold with a lollipop" state of being. I saw a study once (conveniently can't remember where) that suggested something like 20% of Java apps exhibited periodic pauses of 5 sec or greater. Even if both of these figures were far lower, it's still frightening. If nothing else, I want the default set of metrics people get to guide them to responsible operationalization, e.g. you should probably run more than one instance in prod and have some circuit breaker mechanism to control SLA boundaries. I understand the temptation to view this as a day 2 problem. It's just not clear to me that many organizations ever even realize there is a problem in the first place. |
At a cursory glance, I'm pretty confident we can implement a pause detector that trades off pause-length granularity for apparent CPU consumption. Then we can select a default that still captures well the eye-popping GC pauses. |
I found that setting the system property It switches LatencyUtils into a test mode such that |
Wow, the hacking prize goes to you today, @osialr :). Thanks for the workaround option. |
@jkschneider do you have a micrometer issue that we can track? |
Testing the fix this morning. Here is the issue: micrometer-metrics/micrometer#328 |
Summarizing the changes coming in Micrometer 1.0.0-rc.6: The pause detector is configurable to an extent via You can change the intervals on the clock-drift detector like so:
We also provide an alternate, noop implementation. Unless you are concerned about even the slightest amount of CPU use, this is not recommended:
It's possible that we'll find alternate pause detector implementations in the future. Perhaps the JDK will someday provide an event-based interface for GC events. Or we could probably provide a detector limited to GC events by scraping GC logs. |
I've tried another round of testing using the latest Spring Boot snapshots; I'm not really seeing an improvement here and even seeing conflicting results. Note that I'm looking at CPU usage on the PID line, which is the amount of CPU used by the process last time it was scheduled on a CPU. So "1%" means "1% on the CPU it was scheduled on", not "1% of the host capacity". This might be biased, but I'm trying to ignore other processes running on the host. ResultsSpring Boot and Actuator, after first request: Spring Boot and Actuator, Spring Boot and Actuator, Spring Boot with Actuator, but Metrics auto-configuration ignored: Summary
There might be something else at play here and we need to look deeper. |
Fixed a bug in |
@jkschneider Is there a SNAPSHOT available that we can try? I'd like to test this before you cut RC8 |
@philwebb Yes. Snapshots are published on every successful build to repo.spring.io |
Thanks @jkschneider. Did you have a PR lined up for this as well? I can't remember what we discussed and if the upgrade is easy or not. |
I've hacked a draft PR here. I'm not sure about a few of those changes, especially in our test suite where we were previously counting measures (if I'm not mistaken). But the test suite is green. |
OK so the update to the snapshot has fixed the issue for me. @dgsqrs if you can give |
Hi there.
If you add the spring-boot-starter-actuator as dependency, after the first http call, the app will consume 6%-7% CPU forever (on a few threads). Even when there is no active http connection nor application background thread working.
If I remove this dependency, the app is idle at 0% as expected.
Edit
Spring Boot version 2.0.0.0M7
Java 8
Archive.zip
The demo app have been generated with https://start.spring.io/
Steps to reproduce the issue
The text was updated successfully, but these errors were encountered: