Skip to content

feat: Add graceful shutdown timer to GRPC frontend #7969

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 43 commits into from
Apr 26, 2025

Conversation

mattwittwer
Copy link
Contributor

@mattwittwer mattwittwer commented Jan 27, 2025

What does the PR do?

This PR adds a shutdown timer to the gRPC endpoint for both infer and streaming infer requests. Inflight requests will be allowed to complete before the timer expires and new requests made after shutdown has started will be rejected.

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type
box here and add the label to the github PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

N/A

Where should the reviewer start?

Start with the new functions in src/grpc/grpc_server.cc and the flow in src/main.cc
GracefulStop();
unload the models in core
Stop();

Test plan:

Review L0_lifecycle and grpc related tests.
Updated shutdown behavior changes the tested responses for some lifecycle tests

  • CI Pipeline ID: 27245783

Caveats:

The updated shutdown process includes three responses. The CANCELLED state cannot be removed as it is part of the GRPC endpoint shutdown behavior:

  • "GRPC server is shutting down and has stopped accepting new requests."
  • "CANCELLED"
  • "failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8001: Failed to connect to remote host: connect: Connection refused (111)"

Background

Shutdown behavior for the gRPC endpoint could lead to unexpected errors when unloading the models from the core

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

N/A

@mattwittwer mattwittwer added enhancement New feature or request grpc Related to the GRPC server labels Jan 27, 2025
@mattwittwer mattwittwer self-assigned this Jan 27, 2025
@statiraju statiraju requested a review from rmccorm4 January 27, 2025 20:05
@rmccorm4 rmccorm4 marked this pull request as draft January 28, 2025 00:16
@rmccorm4 rmccorm4 changed the title draft: mwittwer/grpc shutdown timer draft: feat: Add graceful shutdown timer to GRPC frontend Jan 28, 2025
@mattwittwer mattwittwer requested a review from indrajit96 March 3, 2025 17:52
@mattwittwer mattwittwer changed the title draft: feat: Add graceful shutdown timer to GRPC frontend feat: Add graceful shutdown timer to GRPC frontend Mar 3, 2025
@mattwittwer mattwittwer marked this pull request as ready for review March 3, 2025 18:48
@rmccorm4 rmccorm4 requested review from krishung5 and tanmayv25 March 4, 2025 23:02
Copy link
Contributor

@indrajit96 indrajit96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Let's create a TEP for this as discussed.
  2. Test a simple no-model perf with and without changes to see if any perf regression was introduced

@mattwittwer mattwittwer requested a review from pskiran1 March 31, 2025 18:32
Copy link
Contributor

@indrajit96 indrajit96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, make sure you attach perf numbers to the ticket and the update the latest CI pipeline on the reviews thread.
Nice work!

@mattwittwer mattwittwer merged commit 5a2aaba into main Apr 26, 2025
3 checks passed
@mattwittwer mattwittwer deleted the mwittwer/gRPC-shutdown-timer branch April 26, 2025 19:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request grpc Related to the GRPC server
Development

Successfully merging this pull request may close these issues.

5 participants