Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add graceful shutdown to wasmtime serve, fix flaky CI #10394

Merged

Conversation

alexcrichton
Copy link
Member

This commit fixes the spurious CI failure found here. The problem here is that the infrastructure for testing wasmtime serve was not properly waiting for all output in finish. There's some infrastructure in the tests for spawning a subprocess and managing it, but prior to this commit there was no way to shut down the server gracefully and thus read all pending output from the child tasks.

The specific problem is that a specific error message was expected in the logs after a request had been processed, but the finish method wasn't reading the message. The reason for this is that finish had to resort to kill -9 on the child process as there was no other means of shutting it down. This meant that the print, which happened after request completion, might be killed and never happen.

The solution ended up in this commit is to (a) add a --shutdown-addr CLI flag to wasmtime serve and (b) beef up handling of graceful shutdown. Previously ctrl-c was used to exit the server but it didn't do anything but drop all in-progress work. Instead graceful shutdown is now handled by breaking out of the accept loop and then waiting for all child tasks to complete, meaning that no http requests once received are cancelled. In addition to ctrl-c the --shutdown-addr is used to listen for a TCP connection which is a second means of cancellation. The reason for this is that sending ctrl-c to a process is not nearly as trivial on Windows as it is on Unix, so I didn't want to deal with all the console bits necessary to get that aligned.

This commit fixes the spurious CI failure found [here][fail]. The
problem here is that the infrastructure for testing `wasmtime serve` was
not properly waiting for all output in `finish`. There's some
infrastructure in the tests for spawning a subprocess and managing it,
but prior to this commit there was no way to shut down the server
gracefully and thus read all pending output from the child tasks.

The specific problem is that a specific error message was expected in
the logs after a request had been processed, but the `finish` method
wasn't reading the message. The reason for this is that `finish` had to
resort to `kill -9` on the child process as there was no other means of
shutting it down. This meant that the print, which happened after
request completion, might be killed and never happen.

The solution ended up in this commit is to (a) add a `--shutdown-addr`
CLI flag to `wasmtime serve` and (b) beef up handling of graceful
shutdown. Previously ctrl-c was used to exit the server but it didn't do
anything but drop all in-progress work. Instead graceful shutdown is now
handled by breaking out of the `accept` loop and then waiting for all
child tasks to complete, meaning that no http requests once received are
cancelled. In addition to ctrl-c the `--shutdown-addr` is used to listen
for a TCP connection which is a second means of cancellation. The reason
for this is that sending ctrl-c to a process is not nearly as trivial on
Windows as it is on Unix, so I didn't want to deal with all the console
bits necessary to get that aligned.

[fail]: https://github.com/bytecodealliance/wasmtime/actions/runs/13833291117/job/38702374924?pr=10390
@alexcrichton alexcrichton requested a review from a team as a code owner March 13, 2025 19:13
@alexcrichton alexcrichton requested review from dicej and removed request for a team March 13, 2025 19:13
@alexcrichton alexcrichton enabled auto-merge March 13, 2025 23:07
@alexcrichton alexcrichton added this pull request to the merge queue Mar 13, 2025
Merged via the queue into bytecodealliance:main with commit a3381e4 Mar 13, 2025
40 checks passed
@alexcrichton alexcrichton deleted the graceful-shutdown-serve branch March 13, 2025 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants