-
Notifications
You must be signed in to change notification settings - Fork 40.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Composite HealthIndicator that runs them in parallel #2652
Comments
Definitely in the "nice to have" bucket. I imaging that the |
I would also like to have support for this. My project got a bunch of health checks and some of them can run up to few seconds. So in total performance of /health endpoint is terrible with sequential approach. |
👍 Would love to see support for parallel health checks in Spring Boot out of the box. |
And it would be nice to include a standard detail information how long a certain health indicator took to execute. |
Real-world example here: Some services have requirements to respond within a given time limit. This time limit may be set by an external organization, not always under the control of the devs. This kind of timeout gets interpreted as a DOWN state, wakes devs up in the middle of the night, reboots service instances, and all the while provides no context as to which component was experiencing high latency (because there was no HTTP response). I have successfully added timeouts per-actuator, but with 20 checks that require timeouts of around 5 seconds (any less would cause alerts for minor network blips), we are asking our caller to wait up to 1:40 for a response. At my org, this is absurd, and nobody wants to permit this behavior. |
I second everything @erikgreif-acc said. Sometimes I run into a situation in which a time out occurs but it is unclear which of the components caused it. So I'd like to propose a timeout for the parallel execution after which every unresponsive component is considered/reported as down. To keep the existing behavior the default for this timeout should be infinite. And while working on it, exporting metrics (number of total, successful and failed executions, sum of execution time for successful executions for each component) would make alerting (and inhibiting) on health related issues easier and would help with dissecting problems faster. |
+1 for this feature, or for some sort of async checking... I am surprised that this has not yet been implemented. We're running Spring Boot Apps on Pivotal (Tanzu?) Cloud Foundry and the out of the box settings in that environment at to consider a HTTP Healthcheck failed after 1s. For all but the simplest of apps that we have deployed to PCF this is causing unnecessary restarts. |
#25459 is somewhat related to this. I've been wondering about something that runs the health indicators periodically in the background and the health endpoint would then just respond immediately with the current aggregated status. |
@wilkinsona I think providing framework support for running health indicators in the background and providing the latest aggregate state on HTTP calls would solve this problem pretty holistically. Initially blocking until there is a response from all indicators seams reasonable. As does considering an indicator as being DOWN if its invocation has taken longer than some threshold and/or older than some max age. Also, it may be best to default to the current behavior, but be able to flag HealthIndicator implementations as beaning async with their timeout.. i.e. @ BackgroundCheck(maxTime=2000, maxAge= 3000) or some other existing more suitable annotation. |
I ended up creating an implementation of async health indicators in this separate repo: https://github.com/antoinemeyer/spring-boot-async-health-indicator I considered having it only property-driven ( I would be very happy to have your feedback on the approach taken. If you think this can be integrated directly into spring-boot-actuator, I'd gladly work on a PR. |
Thanks very much for the pull-request offer @antoinemeyer. It's really great to see that you've been able to add support in an additional library. We're currently not sure if we want to offer async support in the way that you've implemented it, or as suggested in #25459. Unfortunately, we're currently investing most of our time in Spring Boot 3.0 work so it's unlikely that we'll be able to make the decision anytime soon. I think it's best we wait until Spring Boot 3.0 is out before looking at this. That probably means that a pull-request would be a bit premature, and it would be best to consider things again when we're planning Spring Boot 3.1. |
Thanks @philwebb for the feedback (once again). |
hi @antoinemeyer, Do you have a working example for the aysnc-health-proj? It would really be helpful |
hi @unnisathya92 ! Sure, here is a starter with asynchronous health indicators: spring-async-health-indicator-demo.zip |
+1 for the feature. My application has three health check components. Component 1: <100ms response time Would love to see the I am rendering the health details on react application (Home Page) which takes 100ms+400ms+400ms = 900ms right now. 900ms page load time is acceptable right now. It would be nice if the Home Page can load within 400ms. We could implement the parallel processing logic as a temporary solution. But, I am sure this feature will come out of the box some day under |
Any news here? In my app I have multiple db's, so multiple db-pools and so multiple health checks. Now I see something like this: Health contributor org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator (db/creationCacheDbDataSource) took 269391ms to respond This is really bad :) Is there something I can do to pimp my (db) health checks to be little bit more smart regarding resources/runtime etc? |
@ahoehma this issue represents the latest state. I am not sure you're looking in the right direction. Running them in parallel should improve things a bit, but certainly not to the extend of what you're experiencing now. In other words, please review your indicator and backend configuration. |
@snicoll thanks for the info. I was just thinking that parallel running (slow) health-check can improve the situation a little bit but at the end you are right. Regarding the configuration of my db-indicators is there maybe somewhere some documentation? Until yet my understanding was that springboot just wrap each running Datasource into a health-indicator. Would be maybe better to use just another datasource to the same db just to keep all configurated pooled connections ready for business or should I just increase the max number of pooled connections to be ready. Also what happens if multiple callers ask the health endpoint ... there is always an on-the-fly health-check as fas as I know and this means if aws elb + other dashboards checking health ... my connections to the db just waisted :) May my whole topic is var to much for a issue ... is there a good place for such a discussion. Many thanks! Andreas |
That, and we don't use the issue tracker for discussions. You may want to chat with the community on Gitter. You may also be interested in endpoint caching. |
Hopefully this is helpful to someone else here or in #25459. Taking a few queues from https://github.com/antoinemeyer/spring-boot-async-health-indicator (thanks @antoinemeyer), I came up with something for Spring Boot 3 that is not annotation based, but rather continues on in a manner similar to the existing AsyncHealthIndicatorimport java.time.Duration;
import org.springframework.boot.actuate.health.HealthIndicator;
/**
* This is an extension to {@link HealthIndicator}, except with the intention of providing similar
* functionality to <a
* href="/antoinemeyer/spring-boot-async-health-indicator">antoinemeyer/spring-boot-async-health-indicator</a>.
* This comes as a result of the following Spring Boot issues still being open:
*
* <ul>
* <li>https://github.com/spring-projects/spring-boot/issues/2652
* <li>https://github.com/spring-projects/spring-boot/issues/25459
* </ul>
*
* <p>This aims to offer a means of getting asynchronously provided and potentially cached {@link
* org.springframework.boot.actuate.health.Health Health} so as to allow the health actuator
* endpoint to return almost immediately.
*
* @see: https://github.com/antoinemeyer/spring-boot-async-health-indicator
*/
public interface AsyncHealthIndicator extends HealthIndicator {
public default Duration getHealthCheckFrequency() {
return Duration.ofSeconds(10);
}
} AsyncHealthIndicatorAutoConfigurationimport java.time.Duration;
import java.time.ZonedDateTime;
import javax.annotation.Nullable;
import lombok.Builder;
import lombok.Getter;
import lombok.Setter;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.actuate.autoconfigure.health.HealthEndpointAutoConfiguration;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthContributorRegistry;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.boot.actuate.health.NamedContributor;
import org.springframework.boot.autoconfigure.AutoConfigureAfter;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.TaskScheduler;
/**
* Configuration to allow replacing {@link HealthIndicator} instances in the {@link
* HealthContributorRegistry} with {@link WrappedHealthIndicator} instances. The {@link
* WrappedHealthIndicator} will be scheduled to refresh their internal state, and during {@link
* HealthIndicator#health()} calls, will return their currently cached internal state immediately.
*
* <p>This relies on the existence of a {@link TaskScheduler}, which can be enabled by an {@link
* org.springframework.scheduling.annotation.EnableScheduling @EnableScheduling} annotation
* somewhere within your application.
*
* <p>Lombok can easily be replaced if it's not desired.
*
* <p>This is aims to provide similarfunctionality to <a
* href="/antoinemeyer/spring-boot-async-health-indicator">antoinemeyer/spring-boot-async-health-indicator</a>.
*
* @see: https://github.com/antoinemeyer/spring-boot-async-health-indicator
*/
@Configuration
@AutoConfigureAfter(HealthEndpointAutoConfiguration.class)
class AsyncHealthIndicatorAutoConfiguration implements InitializingBean {
private static final Log log = LogFactory.getLog(AsyncHealthIndicatorAutoConfiguration.class);
@Autowired private HealthContributorRegistry healthContributorRegistry;
@Autowired private TaskScheduler taskScheduler;
@Override
public void afterPropertiesSet() throws Exception {
for (NamedContributor<?> namedContributor : healthContributorRegistry) {
final String name = namedContributor.getName();
final Object contributor = namedContributor.getContributor();
if (contributor instanceof AsyncHealthIndicator asyncHealthIndicator) {
healthContributorRegistry.unregisterContributor(name);
log.debug(
"Wrapping " + contributor.getClass().getSimpleName() + " for async health scheduling");
WrappedHealthIndicator wrappedHealthIndicator =
WrappedHealthIndicator.builder().wrappedHealthIndicator(asyncHealthIndicator).build();
healthContributorRegistry.registerContributor(name, wrappedHealthIndicator);
taskScheduler.scheduleWithFixedDelay(
wrappedHealthIndicator, asyncHealthIndicator.getHealthCheckFrequency());
}
}
}
@Getter
@Setter
@Builder
private static class WrappedHealthIndicator implements HealthIndicator, Runnable {
private static final String LAST_CHECKED_KEY = "lastChecked";
private static final String LAST_DURATION_KEY = "lastDuration";
private HealthIndicator wrappedHealthIndicator;
@Nullable private Health lastHealth;
@Override
public Health health() {
Health lastHealth = getLastHealth();
if (lastHealth == null) {
setLastHealth(getAndWrapHealth());
lastHealth = getLastHealth();
}
return lastHealth;
}
private Health getAndWrapHealth() {
ZonedDateTime startTime = ZonedDateTime.now();
Health baseHealth = getWrappedHealthIndicator().health();
ZonedDateTime endTime = ZonedDateTime.now();
Duration duration = Duration.between(startTime, endTime);
return Health.status(baseHealth.getStatus())
.withDetails(baseHealth.getDetails())
.withDetail(LAST_CHECKED_KEY, startTime)
.withDetail(LAST_DURATION_KEY, duration)
.build();
}
@Override
public void run() {
setLastHealth(getAndWrapHealth());
}
}
} Usage is simply replacing existing |
@tomcruise81 FYI I updated spring-boot-async-health-indicator under version |
Right now, invoking the health endpoint lead to a synchronous invocation of all underlying
HealthIndicator
instance(s) that have been configured. It would be nice to offer an option to run these in parallel instead.The text was updated successfully, but these errors were encountered: