Health indicators based on Service Level Objectives #21311
Open questions
We build health indicators with Some of the SLOs are a combination of two or more indicators. For example, in
"jvmTotalMemory": {
"status": "UP",
"details": {
"someTag": "someValue"
},
"components": {
"jvmGcOverhead": {
"status": "UP",
"details": {
"value": "0.01%",
"mustBe": "<20%",
"unit": "percent CPU time spent"
}
},
"jvmMemoryConsumption": {
"status": "UP",
"details": {
"value": "9.09%",
"mustBe": "<90%",
"unit": "maximum percent used in last 5 minutes"
}
}
}
}
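One way to read the composite shape above is that the parent objective is UP only when every component objective is met. The following is a minimal sketch of that aggregation rule in plain Java. The class and method names are hypothetical and are not part of Micrometer or Spring Boot, the thresholds are copied from the `mustBe` values in the JSON, and the use of `OUT_OF_SERVICE` for an unmet objective is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: these names do not come from Micrometer or Spring Boot.
final class CompositeSloSketch {

    /** Returns "UP" only if every component objective is met. */
    static String aggregate(Map<String, Boolean> componentsMet) {
        for (boolean met : componentsMet.values()) {
            if (!met) {
                // Assumption: an unmet objective is reported as OUT_OF_SERVICE.
                return "OUT_OF_SERVICE";
            }
        }
        return "UP";
    }

    /** Combines the two components shown in the JSON, using their mustBe thresholds. */
    static String jvmTotalMemory(double gcOverheadPercent, double memoryUsedPercent) {
        Map<String, Boolean> components = new LinkedHashMap<>();
        components.put("jvmGcOverhead", gcOverheadPercent < 20.0);
        components.put("jvmMemoryConsumption", memoryUsedPercent < 90.0);
        return aggregate(components);
    }

    public static void main(String[] args) {
        // Values taken from the example response: 0.01% GC overhead, 9.09% memory used.
        System.out.println("jvmTotalMemory: " + jvmTotalMemory(0.01, 9.09));
    }
}
```

Whether a composite should fail as soon as one component fails, or use some weighted rule instead, is a design choice this sketch does not settle.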
Thanks @jkschneider! I'll target this for 2.4.x so we remember to take a look as soon as the 2.3.0 release crunch is over.
We haven't had a chance to take a look at this change, or to upgrade to Micrometer 1.6.
@snicoll and I discussed this today. There are a few things that came up:
Flagging for team-meeting so that we can discuss this on the next team call.
We discussed this some more as a team today and our feeling is that we're not sure that we have a strong enough opinion to auto-configure SLOs as health indicators. We can see that it may make sense for some users but not for others. For example, in some cases, a proxy will already be aware of the error rate for requests that it routes to an application instance. In this case, exposing the information via a health endpoint that it will also be monitoring will be of minimal value, and may even be harmful depending on how things behave when the application's health changes.
For users that do want to expose SLOs as health indicators, we could provide some classes that make it easier to do so.
Since this proposal was made, we've also introduced the concept of application state. It may be that some users want to configure things such that an unmet objective results in a change to the application state to indicate that it's no longer ready, for example. We could provide some helper classes that a user can configure to connect SLOs to application state.
We discussed possibly auto-configuring the
Overall, our feeling was that we would stop short of anything that exposes the SLOs externally, instead auto-configuring the
@shakuzen @jonatan-ivanov Could we have your input here please? Are we right to be cautious and just give users the parts they need and leave them to join things together or is there some clearly established usage of
This feature adds support for a commonly requested capability: letting an application aggregate a set of metric-based key performance indicators down to a health indicator.
I fully expect some changes, probably significant changes, based on feedback iterations on this, but want to offer this up early in the 2.4.0 release iteration so we have time to iterate and also dogfood any autoconfigured service level objectives.
Some indicators are known to be broadly applicable to a wide range of Java applications, and those could be autoconfigured. An example of a set of such indicators is defined here and autoconfigured by this pull request (JvmServiceLevelObjectives.MEMORY).
In many cases, users would like to configure a load balancer to avoid instances that are failing a key performance indicator by configuring an HTTP health check on the load balancer. In fact, some applications may already be doing this for the health indicators Spring Boot or users already provide. Example platform load balancer configurations that can be pointed to /actuator/health:
See micrometer-metrics/micrometer#2055 for more detail.
The HealthMeterRegistry
As of 1.6.0, Micrometer has a new implementation: micrometer-registry-health. An autoconfiguration was added to spring-boot-actuator-autoconfigure for this new implementation.
Any @Bean ServiceLevelObjective is configured onto the HealthMeterRegistry and bound as a Spring Boot HealthIndicator.
What it looks like in /actuator/health
About ServiceLevelObjective
Service level objectives broadly have the following capabilities:
HealthMeterRegistry.
MeterBinder that contain the measurements that they need to determine availability.
Health#details map, respectively.
API error ratio property-driven configuration
The above properties result in two service level objective health indicators called apiErrorRatioApiCustomer and apiErrorRatioAdmin, which check for a SERVER_ERROR outcome to total throughput ratio of less than 1% for requests to paths starting with /api/customer and 2% for requests to paths starting with /admin, respectively.
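To make the check described above concrete, here is a minimal, dependency-free sketch of an error ratio objective: it divides a count of SERVER_ERROR outcomes by total throughput and compares the result to a threshold. It is not the Micrometer or Spring Boot API. The class, method, and field names are hypothetical, the request counts in `main` are made up for illustration, and reporting an unmet objective as `OUT_OF_SERVICE` is an assumption.

```java
import java.util.Locale;

// Hypothetical sketch: these names do not come from Micrometer or Spring Boot.
final class ErrorRatioSloSketch {

    /** Outcome of one evaluation, mirroring the status/value/mustBe fields of a health response. */
    static final class Result {
        final String status;
        final String value;
        final String mustBe;

        Result(String status, String value, String mustBe) {
            this.status = status;
            this.value = value;
            this.mustBe = mustBe;
        }

        @Override
        public String toString() {
            return status + " (value=" + value + ", mustBe=" + mustBe + ")";
        }
    }

    /**
     * Evaluates the ratio of server errors to total requests against maxRatio.
     * With no traffic at all the ratio is treated as 0, so the objective is met.
     */
    static Result evaluate(long serverErrors, long totalRequests, double maxRatio) {
        double ratio = totalRequests == 0 ? 0.0 : (double) serverErrors / totalRequests;
        // Assumption: an unmet objective is reported as OUT_OF_SERVICE.
        String status = ratio < maxRatio ? "UP" : "OUT_OF_SERVICE";
        return new Result(status,
                String.format(Locale.ROOT, "%.2f%%", ratio * 100),
                String.format(Locale.ROOT, "<%.0f%%", maxRatio * 100));
    }

    public static void main(String[] args) {
        // Illustrative counts only: 1% limit for /api/customer, 2% limit for /admin.
        System.out.println("apiErrorRatioApiCustomer: " + evaluate(5, 1_000, 0.01));
        System.out.println("apiErrorRatioAdmin: " + evaluate(30, 1_000, 0.02));
    }
}
```

A real implementation would read the counts from a meter over a time window rather than take them as arguments; the sketch only illustrates the threshold comparison.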