
[BugFix] Fix tool call finish reason in streaming case #9209

Merged: 7 commits, Oct 12, 2024

Commit 70bc2b873080467e7608fa3da715e89feb62efe5: fix formatting
maxdebayser committed Oct 10, 2024
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
vllm/entrypoints/openai/serving_chat.py: 9 changes (5 additions, 4 deletions)
```diff
@@ -539,8 +539,10 @@ async def chat_completion_stream_generator(
                 # matched by partial json parsing
                 # only happens if we are NOT using guided decoding
                 if tool_parser:
-                    tools_called = len(tool_parser.prev_tool_call_arr) > 0
-                    index = len(tool_parser.prev_tool_call_arr) - 1 if tools_called else 0
+                    tools_called = len(
+                        tool_parser.prev_tool_call_arr) > 0
+                    index = len(tool_parser.prev_tool_call_arr
+                                ) - 1 if tools_called else 0
+                    tools_called = index > 0
                 else:
                     index = 0
@@ -577,8 +579,7 @@
                         delta=delta_message,
                         logprobs=logprobs,
                         finish_reason=output.finish_reason
-                        if not tools_called
-                        else "tool_calls",
+                        if not tools_called else "tool_calls",
                         stop_reason=output.stop_reason)
                     chunk = ChatCompletionStreamResponse(
                         id=request_id,
```
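The fix above can be illustrated outside of vLLM. The following is a minimal sketch, not the actual vLLM code: `select_finish_reason` is a hypothetical standalone helper, and `prev_tool_call_arr` here stands in for the tool parser's accumulated list of tool calls recovered by partial JSON parsing during streaming. The idea is that when the stream ends and tool calls were parsed, the final chunk should report `"tool_calls"` rather than the model's own finish reason.

```python
def select_finish_reason(output_finish_reason: str,
                         prev_tool_call_arr: list) -> str:
    """Sketch of the finish_reason selection in the diff.

    prev_tool_call_arr: tool calls accumulated by the streaming tool
    parser (hypothetical stand-in for tool_parser.prev_tool_call_arr).
    """
    # If at least one tool call was parsed during streaming, override
    # the model's finish reason with "tool_calls" in the final chunk.
    tools_called = len(prev_tool_call_arr) > 0
    return "tool_calls" if tools_called else output_finish_reason


# A model that emitted a tool call should finish with "tool_calls";
# a plain completion keeps its original finish reason.
print(select_finish_reason("stop", [{"name": "get_weather"}]))  # tool_calls
print(select_finish_reason("stop", []))                         # stop
```

This mirrors the OpenAI Chat Completions convention, where a response that ends by invoking tools carries `finish_reason: "tool_calls"` instead of `"stop"`.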