[Bugfix] Fix 'no event loop' RuntimeError in MPClientEngineMonitor #24422
Purpose
This PR fixes a RuntimeError raised when an engine worker process dies unexpectedly.
When that happens, the MPClientEngineMonitor thread, which has no active event loop, calls the BackgroundResources finalizer to clean up.
Inside the finalizer, self.output_socket._get_loop() raised a RuntimeError ('no event loop') that crashed the main server process.
The fix wraps the loop retrieval in a try-except-else block.
If retrieving the loop fails, the finalizer now logs a warning and performs a best-effort synchronous cleanup of the sockets, which prevents the crash and allows a controlled shutdown.
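The handling described above can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: asyncio.get_running_loop() stands in for self.output_socket._get_loop(), and DummySocket, close_sockets, and finalize are hypothetical names introduced only for illustration. The assumption is that the real finalizer schedules socket cleanup on the loop when one is available.

```python
import asyncio
import logging
import threading

logger = logging.getLogger(__name__)


class DummySocket:
    """Hypothetical stand-in for a socket; it only records that it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def close_sockets(sockets):
    """Best-effort synchronous cleanup: close every socket that exists."""
    for sock in sockets:
        if sock is not None:
            sock.close()


def finalize(sockets):
    """Close sockets on the event loop if there is one, else synchronously.

    Returns "scheduled" or "sync" so callers can see which path was taken.
    """
    try:
        # Stand-in for the loop lookup; raises RuntimeError when the
        # calling thread has no running event loop.
        loop = asyncio.get_running_loop()
    except RuntimeError:
        logger.warning("No event loop in this thread; closing sockets synchronously.")
        close_sockets(sockets)
        return "sync"
    else:
        # A loop is available: let it run the cleanup.
        loop.call_soon(close_sockets, sockets)
        return "scheduled"


if __name__ == "__main__":
    # A plain thread has no event loop, like the monitor thread in this PR.
    socks = [DummySocket(), DummySocket()]
    monitor = threading.Thread(target=finalize, args=(socks,))
    monitor.start()
    monitor.join()
    print(all(s.closed for s in socks))
```

The else branch keeps the normal path out of the try body, so a RuntimeError raised while scheduling the cleanup is not mistaken for a missing loop.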
Resolves #24230 #24305
Test Plan
Test Result
As-Is
To-Be
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.