Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Composite HealthIndicator that runs them in parallel #2652

Open
snicoll opened this issue Mar 14, 2015 · 22 comments
Open

Composite HealthIndicator that runs them in parallel #2652

snicoll opened this issue Mar 14, 2015 · 22 comments
Labels
type: enhancement A general enhancement

Comments

@snicoll
Copy link
Member

snicoll commented Mar 14, 2015

Right now, invoking the health endpoint lead to a synchronous invocation of all underlying HealthIndicator instance(s) that have been configured. It would be nice to offer an option to run these in parallel instead.

@philwebb philwebb added this to the 1.3.0 milestone Mar 16, 2015
@philwebb philwebb added the type: enhancement A general enhancement label Mar 16, 2015
@philwebb philwebb removed this from the 1.3.0 milestone Mar 16, 2015
@philwebb
Copy link
Member

Definitely in the "nice to have" bucket. I imaging that the health endpoint isn't hit all that often so performance probably isn't a big deal.

@snicoll
Copy link
Member Author

snicoll commented Mar 16, 2015

@philwebb I had the idea when I saw #2630

@fatroom
Copy link

fatroom commented Oct 22, 2015

I would also like to have support for this. My project got a bunch of health checks and some of them can run up to few seconds. So in total performance of /health endpoint is terrible with sequential approach.

@fjcano
Copy link

fjcano commented Oct 22, 2015

👍
Some of my projects have 20+ health checks.
For now we use our homebrew solution to do them in parallel.

Would love to see support for parallel health checks in Spring Boot out of the box.
Or at least a way to inject our own CompositeHealthIndicator.

@prismec
Copy link

prismec commented Sep 12, 2017

And it would be nice to include a standard detail information how long a certain health indicator took to execute.

@philwebb philwebb changed the title composite HealthIndicator that runs them in parallel Composite HealthIndicator that runs them in parallel Mar 22, 2018
@philwebb philwebb added this to the Icebox milestone Mar 22, 2018
@erikgreif-acc
Copy link

Real-world example here:

Some services have requirements to respond within a given time limit. This time limit may be set by an external organization, not always under the control of the devs. This kind of timeout gets interpreted as a DOWN state, wakes devs up in the middle of the night, reboots service instances, and all the while provides no context as to which component was experiencing high latency (because there was no HTTP response).

I have successfully added timeouts per-actuator, but with 20 checks that require timeouts of around 5 seconds (any less would cause alerts for minor network blips), we are asking our caller to wait up to 1:40 for a response. At my org, this is absurd, and nobody wants to permit this behavior.

@rmie
Copy link

rmie commented May 13, 2020

I second everything @erikgreif-acc said.

Sometimes I run into a situation in which a time out occurs but it is unclear which of the components caused it. So I'd like to propose a timeout for the parallel execution after which every unresponsive component is considered/reported as down. To keep the existing behavior the default for this timeout should be infinite.

And while working on it, exporting metrics (number of total, successful and failed executions, sum of execution time for successful executions for each component) would make alerting (and inhibiting) on health related issues easier and would help with dissecting problems faster.

@jim-mclean
Copy link

+1 for this feature, or for some sort of async checking...

I am surprised that this has not yet been implemented. We're running Spring Boot Apps on Pivotal (Tanzu?) Cloud Foundry and the out of the box settings in that environment at to consider a HTTP Healthcheck failed after 1s. For all but the simplest of apps that we have deployed to PCF this is causing unnecessary restarts.

@wilkinsona
Copy link
Member

#25459 is somewhat related to this. I've been wondering about something that runs the health indicators periodically in the background and the health endpoint would then just respond immediately with the current aggregated status.

@jim-mclean
Copy link

jim-mclean commented Mar 4, 2021

@wilkinsona I think providing framework support for running health indicators in the background and providing the latest aggregate state on HTTP calls would solve this problem pretty holistically.

Initially blocking until there is a response from all indicators seams reasonable. As does considering an indicator as being DOWN if its invocation has taken longer than some threshold and/or older than some max age.

Also, it may be best to default to the current behavior, but be able to flag HealthIndicator implementations as beaning async with their timeout.. i.e. @ BackgroundCheck(maxTime=2000, maxAge= 3000) or some other existing more suitable annotation.

@antoinemeyer
Copy link

I ended up creating an implementation of async health indicators in this separate repo: https://github.com/antoinemeyer/spring-boot-async-health-indicator
It relies on a ScheduledThreadPoolExecutor with an annotation @AsyncHealth.

I considered having it only property-driven (management.health.async.indicator.$INDICATOR-BEAN-NAME.refresh-rate) or using a different interface (AsyncHealthIndicator) but believed an annotation would be more adequate.

I would be very happy to have your feedback on the approach taken. If you think this can be integrated directly into spring-boot-actuator, I'd gladly work on a PR.

@philwebb philwebb added the for: team-meeting An issue we'd like to discuss as a team to make progress label Nov 29, 2021
@philwebb philwebb removed the for: team-meeting An issue we'd like to discuss as a team to make progress label Dec 1, 2021
@philwebb
Copy link
Member

philwebb commented Dec 1, 2021

Thanks very much for the pull-request offer @antoinemeyer. It's really great to see that you've been able to add support in an additional library. We're currently not sure if we want to offer async support in the way that you've implemented it, or as suggested in #25459. Unfortunately, we're currently investing most of our time in Spring Boot 3.0 work so it's unlikely that we'll be able to make the decision anytime soon. I think it's best we wait until Spring Boot 3.0 is out before looking at this. That probably means that a pull-request would be a bit premature, and it would be best to consider things again when we're planning Spring Boot 3.1.

@antoinemeyer
Copy link

Thanks @philwebb for the feedback (once again).
I am looking forward to see what solution you will end up adopting for this problem. In the meantime I'll rely on (and improve) that library and hope it can benefit others!

@unnisathya92
Copy link

hi @antoinemeyer, Do you have a working example for the aysnc-health-proj? It would really be helpful

@antoinemeyer
Copy link

hi @antoinemeyer, Do you have a working example for the aysnc-health-proj? It would really be helpful

hi @unnisathya92 ! Sure, here is a starter with asynchronous health indicators: spring-async-health-indicator-demo.zip

@kcshettar
Copy link

+1 for the feature. My application has three health check components.

Component 1: <100ms response time
Component 2: <400ms response time
Component 3: <400ms response time

Would love to see the /actuator/health endpoint to respond within 400ms (worst case).

I am rendering the health details on react application (Home Page) which takes 100ms+400ms+400ms = 900ms right now. 900ms page load time is acceptable right now. It would be nice if the Home Page can load within 400ms.

We could implement the parallel processing logic as a temporary solution. But, I am sure this feature will come out of the box some day under spring-boot-starter-actuator.

@ahoehma
Copy link
Contributor

ahoehma commented Aug 16, 2023

Any news here? In my app I have multiple db's, so multiple db-pools and so multiple health checks.

Now I see something like this:

Health contributor org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator (db/creationCacheDbDataSource) took 269391ms to respond

This is really bad :)

Is there something I can do to pimp my (db) health checks to be little bit more smart regarding resources/runtime etc?

@snicoll
Copy link
Member Author

snicoll commented Aug 16, 2023

@ahoehma this issue represents the latest state.

I am not sure you're looking in the right direction. Running them in parallel should improve things a bit, but certainly not to the extend of what you're experiencing now. In other words, please review your indicator and backend configuration.

@ahoehma
Copy link
Contributor

ahoehma commented Aug 17, 2023

@snicoll thanks for the info. I was just thinking that parallel running (slow) health-check can improve the situation a little bit but at the end you are right. Regarding the configuration of my db-indicators is there maybe somewhere some documentation? Until yet my understanding was that springboot just wrap each running Datasource into a health-indicator. Would be maybe better to use just another datasource to the same db just to keep all configurated pooled connections ready for business or should I just increase the max number of pooled connections to be ready. Also what happens if multiple callers ask the health endpoint ... there is always an on-the-fly health-check as fas as I know and this means if aws elb + other dashboards checking health ... my connections to the db just waisted :) May my whole topic is var to much for a issue ... is there a good place for such a discussion.

Many thanks!

Andreas

@snicoll
Copy link
Member Author

snicoll commented Aug 17, 2023

That, and we don't use the issue tracker for discussions. You may want to chat with the community on Gitter. You may also be interested in endpoint caching.

@tomcruise81
Copy link

tomcruise81 commented Oct 19, 2023

Hopefully this is helpful to someone else here or in #25459. Taking a few queues from https://github.com/antoinemeyer/spring-boot-async-health-indicator (thanks @antoinemeyer), I came up with something for Spring Boot 3 that is not annotation based, but rather continues on in a manner similar to the existing HealthIndicator approach:

AsyncHealthIndicator

import java.time.Duration;
import org.springframework.boot.actuate.health.HealthIndicator;

/**
 * This is an extension to {@link HealthIndicator}, except with the intention of providing similar
 * functionality to <a
 * href="/antoinemeyer/spring-boot-async-health-indicator">antoinemeyer/spring-boot-async-health-indicator</a>.
 * This comes as a result of the following Spring Boot issues still being open:
 *
 * <ul>
 *   <li>https://github.com/spring-projects/spring-boot/issues/2652
 *   <li>https://github.com/spring-projects/spring-boot/issues/25459
 * </ul>
 *
 * <p>This aims to offer a means of getting asynchronously provided and potentially cached {@link
 * org.springframework.boot.actuate.health.Health Health} so as to allow the health actuator
 * endpoint to return almost immediately.
 *
 * @see: https://github.com/antoinemeyer/spring-boot-async-health-indicator
 */
public interface AsyncHealthIndicator extends HealthIndicator {
  public default Duration getHealthCheckFrequency() {
    return Duration.ofSeconds(10);
  }
}

AsyncHealthIndicatorAutoConfiguration

import java.time.Duration;
import java.time.ZonedDateTime;
import javax.annotation.Nullable;
import lombok.Builder;
import lombok.Getter;
import lombok.Setter;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.actuate.autoconfigure.health.HealthEndpointAutoConfiguration;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthContributorRegistry;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.boot.actuate.health.NamedContributor;
import org.springframework.boot.autoconfigure.AutoConfigureAfter;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.TaskScheduler;

/**
 * Configuration to allow replacing {@link HealthIndicator} instances in the {@link
 * HealthContributorRegistry} with {@link WrappedHealthIndicator} instances. The {@link
 * WrappedHealthIndicator} will be scheduled to refresh their internal state, and during {@link
 * HealthIndicator#health()} calls, will return their currently cached internal state immediately.
 *
 * <p>This relies on the existence of a {@link TaskScheduler}, which can be enabled by an {@link
 * org.springframework.scheduling.annotation.EnableScheduling @EnableScheduling} annotation
 * somewhere within your application.
 *
 * <p>Lombok can easily be replaced if it's not desired.
 *
 * <p>This is aims to provide similarfunctionality to <a
 * href="/antoinemeyer/spring-boot-async-health-indicator">antoinemeyer/spring-boot-async-health-indicator</a>.
 *
 * @see: https://github.com/antoinemeyer/spring-boot-async-health-indicator
 */
@Configuration
@AutoConfigureAfter(HealthEndpointAutoConfiguration.class)
class AsyncHealthIndicatorAutoConfiguration implements InitializingBean {
  private static final Log log = LogFactory.getLog(AsyncHealthIndicatorAutoConfiguration.class);

  @Autowired private HealthContributorRegistry healthContributorRegistry;

  @Autowired private TaskScheduler taskScheduler;

  @Override
  public void afterPropertiesSet() throws Exception {
    for (NamedContributor<?> namedContributor : healthContributorRegistry) {
      final String name = namedContributor.getName();
      final Object contributor = namedContributor.getContributor();
      if (contributor instanceof AsyncHealthIndicator asyncHealthIndicator) {
        healthContributorRegistry.unregisterContributor(name);
        log.debug(
            "Wrapping " + contributor.getClass().getSimpleName() + " for async health scheduling");
        WrappedHealthIndicator wrappedHealthIndicator =
            WrappedHealthIndicator.builder().wrappedHealthIndicator(asyncHealthIndicator).build();
        healthContributorRegistry.registerContributor(name, wrappedHealthIndicator);
        taskScheduler.scheduleWithFixedDelay(
            wrappedHealthIndicator, asyncHealthIndicator.getHealthCheckFrequency());
      }
    }
  }

  @Getter
  @Setter
  @Builder
  private static class WrappedHealthIndicator implements HealthIndicator, Runnable {
    private static final String LAST_CHECKED_KEY = "lastChecked";
    private static final String LAST_DURATION_KEY = "lastDuration";

    private HealthIndicator wrappedHealthIndicator;

    @Nullable private Health lastHealth;

    @Override
    public Health health() {
      Health lastHealth = getLastHealth();
      if (lastHealth == null) {
        setLastHealth(getAndWrapHealth());
        lastHealth = getLastHealth();
      }

      return lastHealth;
    }

    private Health getAndWrapHealth() {
      ZonedDateTime startTime = ZonedDateTime.now();
      Health baseHealth = getWrappedHealthIndicator().health();
      ZonedDateTime endTime = ZonedDateTime.now();
      Duration duration = Duration.between(startTime, endTime);
      return Health.status(baseHealth.getStatus())
          .withDetails(baseHealth.getDetails())
          .withDetail(LAST_CHECKED_KEY, startTime)
          .withDetail(LAST_DURATION_KEY, duration)
          .build();
    }

    @Override
    public void run() {
      setLastHealth(getAndWrapHealth());
    }
  }
}

Usage is simply replacing existing implements HealthIndicator instances with implements AsyncHealthIndicator

@antoinemeyer
Copy link

antoinemeyer commented Mar 2, 2024

Hopefully this is helpful to someone else here or in #25459

@tomcruise81 FYI I updated spring-boot-async-health-indicator under version boot3-v1.3 for spring boot 3 support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement A general enhancement
Projects
None yet
Development

No branches or pull requests