Commit

📝 document logic for suppressing abort timeouts
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
joerunde committed Aug 22, 2024
1 parent 2f7d8a6 commit db74ce2
Showing 1 changed file with 10 additions and 2 deletions.
12 changes: 10 additions & 2 deletions vllm/entrypoints/openai/rpc/client.py
@@ -336,8 +336,16 @@ async def _is_tracing_enabled_rpc(self) -> bool:
async def abort(self, request_id: str):
"""Send an ABORT_REQUEST signal to the RPC Server"""

- # Suppress timeouts as well- if the server is busy and does not ack in
- # time we assume it got the message.
+ # Suppress timeouts as well.
+ # In cases where the server is busy processing requests and a very
+ # large volume of abort requests arrive, it is likely that the server
+ # will not be able to ack all of them in time. We have seen this when
+ # we abort 20k requests at once while another 2k are processing- many
+ # of them time out, but we see the server successfully abort all of the
+ # requests.
+ # In this case we assume that the server has received or will receive
+ # these abort requests, and ignore the timeout. This prevents a massive
+ # wall of `TimeoutError` stack traces.
with suppress(RPCClientClosedError, TimeoutError):
await self._send_one_way_rpc_request(
request=RPCAbortRequest(request_id),
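
The suppression pattern documented in this change can be illustrated with a small standalone sketch. It is not the vLLM client: `FakeClientClosedError`, `send_one_way`, and the delay and timeout values are hypothetical stand-ins chosen only to show how `contextlib.suppress` swallows a `TimeoutError` raised while waiting for an ack.

```python
import asyncio
from contextlib import suppress


class FakeClientClosedError(Exception):
    """Hypothetical stand-in for RPCClientClosedError."""


async def send_one_way(request_id: str, ack_delay: float, timeout: float) -> None:
    """Hypothetical stand-in for a one-way RPC send that waits for an ack."""
    try:
        # Simulate the server taking `ack_delay` seconds to acknowledge.
        await asyncio.wait_for(asyncio.sleep(ack_delay), timeout=timeout)
    except asyncio.TimeoutError:
        # Normalize to the builtin TimeoutError on all Python versions.
        raise TimeoutError(f"no ack for {request_id}") from None


async def abort(request_id: str, ack_delay: float, timeout: float = 0.05) -> bool:
    """Send an abort; return True if acked, False if the timeout was suppressed."""
    acked = False
    # A late ack is treated as "the server received or will receive the abort",
    # so the timeout is ignored instead of surfacing as a stack trace.
    with suppress(FakeClientClosedError, TimeoutError):
        await send_one_way(request_id, ack_delay, timeout)
        acked = True
    return acked


async def main() -> list:
    # One abort is acked immediately; the other exceeds the timeout.
    return await asyncio.gather(abort("fast", 0.0), abort("slow", 1.0))


results = asyncio.run(main())
```

In this sketch the slow abort returns normally with no exception reaching the caller, which mirrors the trade-off described in the comment: the caller cannot tell a dropped abort from a slow ack, in exchange for not logging a traceback for every late acknowledgement.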