-
-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BugFix]: properly deserialize tool_calls
iterator before processing by mistral-common when MistralTokenizer is used
#9951
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
🚀 |
34183bb
to
9306944
Compare
tool_calls
before processing by mistral-common when MistralTokenizer is usedtool_calls
iterator before processing by mistral-common when MistralTokenizer is used
tool_calls
iterator before processing by mistral-common when MistralTokenizer is usedtool_calls
iterator before processing by mistral-common when MistralTokenizer is used
tool_calls
iterator before processing by mistral-common when MistralTokenizer is usedtool_calls
iterator before processing by mistral-common when MistralTokenizer is used
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
1e8c3b1
to
8aa0146
Compare
Thanks a bunch for the PR! @gcalmettes do you have an easy reproducible on this error by any chance? (which is fixed by this PR?) |
Sure, please find below a code that breaks (short version for isolated example and long turn by turn version). Short version:
Turn by turn (long) version: It's basically an adaptation of your offline_chat_with_tool_exemple example, but querying the vllm openai frontend. The problem occurs when the assistant message gets a parsed tool_call.
without this PR, the code breaks on the server side with:
|
Nice catch! Just verified this and this fix is indeed needed to correctly parse tool calls. @ywang96 @DarkLight1337 do you think we could get something like this merged? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
…g by mistral-common when MistralTokenizer is used (vllm-project#9951) Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com> Signed-off-by: Sumit Dubey <sumit.dubey2@ibm.com>
…g by mistral-common when MistralTokenizer is used (vllm-project#9951) Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
…g by mistral-common when MistralTokenizer is used (vllm-project#9951) Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com> Signed-off-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
…g by mistral-common when MistralTokenizer is used (vllm-project#9951) Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com> Signed-off-by: rickyx <rickyx@anyscale.com>
…g by mistral-common when MistralTokenizer is used (vllm-project#9951) Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
…g by mistral-common when MistralTokenizer is used (vllm-project#9951) Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
There is currently a bug in the Pydantic library where attributes declared as iterables are eagerly evaluated and therefore replaced in the instances by pydantic-core
ValidatorIterator
instance, when waiting to be consumed.When the
MistralTokenizer
is used, no chat template is applied, and the request messages are directly sent to be processed bymistral-common
. As a result, thetool_calls
field, which is defined asIterable
in the Assitant message request object definition (both in the vllm CustomChatCompletionMessageParam or the official OpenAI ChatCompletionAssistantMessageParam) is not consumed and is sent as a pydanticValidatorIterator
tomistral-common
:The
tool_calls
field is then rightfully processed as an empty list bymistral-common
, meaning that if anytool_calls
is sent with an assistant message, they are ignored after processing bymistral-common
:This causes the
mistral-common
validation check (here) to fail as both side are evaluated to false.The bug is not seen when chat templates are used, since the
tools_calls
iterator is consumed in the template when looping over eachtool_call
.The bug is known on Pydantic side, and it indeed particularly affects the
tool_calls
field for LLM-based workloads using the OpenAI client (see this issue for exemple).This PR makes the
tool_calls
received in the request for the assistant messages being consumed (and therefore valided) when theMistralTokenizer
is used.FIX #9059