Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gRPC health checks in kubernetes scenario - unable to execute on demand only subset of healthchecks #2141

Closed
banatm-abb opened this issue May 25, 2023 · 6 comments · Fixed by #2142
Labels
bug Something isn't working
Milestone

Comments

@banatm-abb
Copy link

What version of gRPC and what language are you using?

2.53.0

What operating system (Linux, Windows,...) and version?

macos

What runtime / compiler are you using (e.g. .NET Core SDK version dotnet --info)

6.0.16

What did you do?

Trying to configure kubernetes grpc healthchecks for my service.
I want to have different healthchecks executed on demand for startup and liveness probe.

I've tried to use the following configuration:

        services
            .AddGrpcHealthChecks(o =>
            {
                // startup probe checks dependencies
                o.Services.MapService("startup", r => r.Tags.Contains("startup"));
                // startup probe checks dependencies
                o.Services.MapService( "liveness", r => r.Tags.Contains("liveness"));
                o.Services.MapService("", r => r.Tags.Contains("liveness");
                o.UseHealthChecksCache = false;
            })
            ... //definitions here

           //auto update just liveness ones
           services.Configure<HealthCheckPublisherOptions>(options =>
           {
             options.Predicate = (r) => r.Tags.Contains("liveness");
           });

What did you expect to see?

grpc_cli call IP:PORT grpc.health.v1.Health/Check "service: 'startup'" will run only checks tagged with startup
grpc_cli call IP:PORT grpc.health.v1.Health/Check "service: 'liveness'" will run only checks tagged with liveness

What did you see instead?

grpc_cli call IP:PORT grpc.health.v1.Health/Check "service: 'startup'" doesnt not run startup tagged checks at all!

Code hint

In the newest code I've found that the CheckHealthAsync is executed with different predicate than GetStatus. Is it by design?

        HealthCheckResponse.Types.ServingStatus status;
        if (_grpcHealthCheckOptions.Services.TryGetServiceMapping(service, out var serviceMapping))
        {
            var result = await _healthCheckService.CheckHealthAsync(_healthCheckOptions.Predicate, cancellationToken);
            status = HealthChecksStatusHelpers.GetStatus(result, serviceMapping.Predicate);
        }
@banatm-abb banatm-abb added the bug Something isn't working label May 25, 2023
@JamesNK
Copy link
Member

JamesNK commented May 25, 2023

_healthCheckOptions.Predicate is for filtering the health checks that run. It's null by default so there isn't any filtering.

serviceMapping.Predicate then decides which results are used to calculate the status.

@banatm-abb
Copy link
Author

banatm-abb commented May 25, 2023

Thanks @JamesNK for explanation.

The use case I have is to setup kubernetes liveness and startup probes to use different filters (to check different things - similar expectation was stated in #1963) and have it executed on demand when grpc health-check is called (frequency set in kubernetes).

It seems I cannot configure and control which check is executed when to follow the rule to have 'clever' startup tests (e.g checking external service dependencies) and 'dumb' liveness checks (e.g. checking just grpc stack being responsive).
Some startups checks are heavy - e.g checks if external db is in a proper state / has migrations completed and it is not required to have them executed multiple times.

I suggest to use serviceMapping.Predicate as a filter for _healthCheckService.CheckHealthAsync.
It will enable use cases to have the subset of checks executed only on demand.

@JamesNK
Copy link
Member

JamesNK commented May 25, 2023

HealthCheckPublisherOptions doesn't do anything for you here. You're not calling Watch and you're not caching results.

I don't understand what the problem is. Your first issue said that a health check wasn't running when you expected it to:

grpc_cli call IP:PORT grpc.health.v1.Health/Check "service: 'startup'" doesnt not run startup tagged checks at all!

Now you're saying too many health checks are running:

It seems I cannot configure and control which check is executed when to follow the rule to have 'clever' startup tests (e.g checking external service dependencies) and 'dumb' liveness checks (e.g. checking just grpc stack being responsive).
Some startups checks are heavy - e.g checks if external db is in a proper state / has migrations completed and it is not required to have them executed multiple times.

@banatm-abb
Copy link
Author

I don't understand what the problem is.

The problem is that I cannot limit the checks that are executed in the liveness probe.
All checks runs on always, even if their result is not required to build response status.

@banatm-abb banatm-abb changed the title gRPC health checks in kubernetes scenario - unable to execute on demand only subset of healthchecs gRPC health checks in kubernetes scenario - unable to execute on demand only subset of healthchecks May 25, 2023
@banatm-abb
Copy link
Author

Potential solution could be to:

  • add new predicate in ServiceMapping Func<HealthCheckRegistration, bool> ExecutionPredicate { get; } with default value to return always true
  • use bothservice.ExecutionPredicate and _healthCheckOptions.Predicate in GetHealthCheckResponseAsync
        HealthCheckResponse.Types.ServingStatus status;
        if (_grpcHealthCheckOptions.Services.TryGetServiceMapping(service, out var serviceMapping))
        {
            var result = await _healthCheckService.CheckHealthAsync(
                (r) =>
                    (_healthCheckOptions.Predicate == null || _healthCheckOptions.Predicate(r))
                    && serviceMapping.ExecutionPredicate(r),
                cancellationToken
            );
            status = HealthChecksStatusHelpers.GetStatus(result, serviceMapping.Predicate);
        }

@JamesNK
Copy link
Member

JamesNK commented May 26, 2023

Ok, I understand.

Rather than have two predicates, what I'd like to do is:

  • Obsolete the current predicate on ServiceMapping that takes a HealthResult.
  • Add a new predicate that takes the health registration name, tags, and maybe description. Those values are common between HealthCheckRegistration and HealthResult.
  • The new predicate can be used in both places:
    • The health checks publisher to filter health results
    • CheckHealthAsync to determine what health checks should run for a service name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants