[V1] [5/N] API Server: unify Detokenizer and EngineCore input #11545

Merged 16 commits on Dec 28, 2024
update comment
robertgshaw2-redhat committed Dec 27, 2024
commit aefeb8499754785ef818694404063b2cd9f6a90e
6 changes: 2 additions & 4 deletions vllm/v1/engine/async_llm.py
@@ -285,10 +285,8 @@ async def abort(self, request_id: str) -> None:
         await self.engine_core.abort_requests_async(request_ids)
         self.detokenizer.abort_requests(request_ids)

-        # If a request is finished while we await above,
-        # then it is possible that the request is already
-        # removed from the queues, so we do nothing if the
-        # request_id is no longer in the tracked queues.
+        # If a request finishes while we await then the request_id
+        # will be removed from the tracked queues before we get here.
         if request_id in self.rid_to_queue:
             del self.rid_to_queue[request_id]
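
The membership check matters because `abort` suspends at the `await`, and another coroutine can remove the same `request_id` before it resumes. A minimal sketch of that interleaving follows. It is not the vLLM implementation: `TinyTracker`, `finish`, and `demo` are hypothetical names, and `asyncio.sleep(0)` merely stands in for the awaited engine-core call.

```python
import asyncio


class TinyTracker:
    """Hypothetical stand-in for the per-request queue bookkeeping."""

    def __init__(self) -> None:
        self.rid_to_queue: dict[str, asyncio.Queue] = {}

    def add(self, request_id: str) -> None:
        self.rid_to_queue[request_id] = asyncio.Queue()

    async def finish(self, request_id: str) -> None:
        # Assumed normal-completion path: drops the entry when a request ends.
        self.rid_to_queue.pop(request_id, None)

    async def abort(self, request_id: str) -> None:
        # Stand-in for awaiting the engine core. Control returns to the
        # event loop here, so finish() can run before the next line.
        await asyncio.sleep(0)
        # Guarded delete: the entry may already be gone.
        if request_id in self.rid_to_queue:
            del self.rid_to_queue[request_id]


async def demo() -> dict:
    tracker = TinyTracker()
    tracker.add("req-0")
    # abort() starts first and suspends; finish() then removes the entry;
    # abort() resumes and finds nothing left to delete.
    await asyncio.gather(tracker.abort("req-0"), tracker.finish("req-0"))
    return tracker.rid_to_queue


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Without the `if` guard, the `del` in this sketch would raise `KeyError` under the same interleaving. `self.rid_to_queue.pop(request_id, None)` would be an equivalent one-line alternative.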
