[V1][Feat] Fail request if FSM fails to advance #18780

Open. Wants to merge 1 commit into base: main.
11 changes: 9 additions & 2 deletions vllm/v1/core/sched/scheduler.py
@@ -774,13 +774,20 @@
                 # the outer lists can be of length > 1.
                 new_logprobs = logprobs.slice(req_index, req_index + 1)

             if new_token_ids and self.structured_output_manager.should_advance(
                     request):
                 # NOTE: structured_output_request
                 # should not be None if use_structured_output, we have
                 # check above, so safe to ignore type warning
-                request.structured_output_request.grammar.accept_tokens(  # type: ignore[union-attr]
-                    req_id, new_token_ids)
+                if not request.structured_output_request.grammar.accept_tokens(  # type: ignore[union-attr]
Collaborator

Let's also add a note here and create a bug to track this.

Author

Here's the tracking issue: #18783

Author

How does that look, @aarnphm?

+                        req_id, new_token_ids):

Check failure on line 783 in vllm/v1/core/sched/scheduler.py

GitHub Actions / pre-commit

Ruff (SIM102)

vllm/v1/core/sched/scheduler.py:777:13: SIM102 Use a single `if` statement instead of nested `if` statements
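
For reference, the shape SIM102 flags and one way to collapse it can be sketched as below. This is a standalone illustration, not the scheduler code: `accept_tokens`, `rejected_nested`, and `rejected_collapsed` are hypothetical stand-ins. Because `and` short-circuits, the collapsed form should still call the grammar check only when the guard conditions hold.

```python
calls = []


def accept_tokens(tokens):
    """Hypothetical stand-in for a grammar check; records each call."""
    calls.append(list(tokens))
    return all(t >= 0 for t in tokens)


def rejected_nested(tokens, should_advance):
    # Shape flagged by SIM102: an `if` whose only statement is another `if`.
    if tokens and should_advance:
        if not accept_tokens(tokens):
            return True
    return False


def rejected_collapsed(tokens, should_advance):
    # Single condition; `and` short-circuits, so accept_tokens is only
    # evaluated when `tokens` is non-empty and `should_advance` is true.
    if tokens and should_advance and not accept_tokens(tokens):
        return True
    return False
```

Binding the result of the check to a variable first is another option, but it changes when the check runs unless the assignment stays inside the outer guard.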
Comment on lines +782 to +783
Member

Suggest using a variable here to make things a bit clearer:

Suggested change
-                if not request.structured_output_request.grammar.accept_tokens(  # type: ignore[union-attr]
-                        req_id, new_token_ids):
+                accepted = request.structured_output_request.grammar.accept_tokens(  # type: ignore[union-attr]
+                    req_id, new_token_ids)
+                if not accepted:

+                    # Grammar FSM failed to advance - mark request as finished with error

Check failure on line 784 in vllm/v1/core/sched/scheduler.py

GitHub Actions / pre-commit

Ruff (E501)

vllm/v1/core/sched/scheduler.py:784:81: E501 Line too long (89 > 80)
+                    logger.error(
+                        "Structured output FSM failed to advance for request %s. "

Check failure on line 786 in vllm/v1/core/sched/scheduler.py

GitHub Actions / pre-commit

Ruff (E501)

vllm/v1/core/sched/scheduler.py:786:81: E501 Line too long (82 > 80)
+                        "Terminating request.", req_id)
+                    request.status = RequestStatus.FINISHED_ABORTED
+                    stopped = True
+                    self._free_request(request)

             # Add newly generated spec token ids to the request.
             if spec_token_ids is not None:
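
The failure path this change introduces can be sketched outside vLLM roughly as follows. It is a minimal illustration under stated assumptions, not the scheduler's implementation: `Status`, `ToyGrammar`, `ToyRequest`, and `advance_or_abort` are hypothetical stand-ins, and the only things taken from the diff are the boolean result of `accept_tokens`, the `FINISHED_ABORTED` status name, and the log message.

```python
import logging
from dataclasses import dataclass
from enum import Enum, auto

logger = logging.getLogger(__name__)


class Status(Enum):
    """Hypothetical stand-in for the request status enum."""
    RUNNING = auto()
    FINISHED_ABORTED = auto()


@dataclass
class ToyGrammar:
    """Hypothetical FSM that accepts only the exact sequence `expected`."""
    expected: list
    pos: int = 0

    def accept_tokens(self, req_id, tokens):
        for tok in tokens:
            if self.pos >= len(self.expected) or self.expected[self.pos] != tok:
                return False  # the FSM cannot advance on this token
            self.pos += 1
        return True


@dataclass
class ToyRequest:
    req_id: str
    grammar: ToyGrammar
    status: Status = Status.RUNNING


def advance_or_abort(request, new_token_ids):
    """Return True if the request was stopped because the FSM rejected tokens."""
    if not new_token_ids:
        return False
    accepted = request.grammar.accept_tokens(request.req_id, new_token_ids)
    if not accepted:
        logger.error(
            "Structured output FSM failed to advance for request %s. "
            "Terminating request.", request.req_id)
        request.status = Status.FINISHED_ABORTED
        return True
    return False
```

In this sketch a request whose grammar expects `[1, 2, 3]` keeps running after `[1, 2]` and is marked aborted after `[1, 9]`. Freeing the request's resources, which the diff does via `self._free_request(request)`, is left out here.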