[V1] Add structural_tag support using xgrammar #17085
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀
I'm looking at the pre-commit failures now ...
ac9b4db to e0fbb5c
e0fbb5c to 97e0e37
@aarnphm Would you like to take a look?
Structured output works differently in V1 than V0. Update this doc to reflect V1 since that is now our default. The differences include:

- the backends available are different
- request-level backend choice is no longer supported
- `whitespace_pattern` is not supported
- `structural_tag` is new as of vllm-project#17085

Signed-off-by: Russell Bryant <rbryant@redhat.com>
Pretty clean integration, nice! I would just like to see a test case put in for this
LGTM. We should add a simple test case in the entrypoint test
There are some doc updates for this in a follow-up PR here: #17135

I will add a test and the other suggested code updates today. Thanks!
This change introduces support for a new structured output format introduced in Xgrammar. It allows specifying a json schema for structured output that occurs in between a beginning and end tag within a response.

This PR is intended as a first step toward making use of this functionality. There is more potential for future work here.

This PR includes a sample script using the OpenAI-compatible API to demonstrate how this could be used to enforce the format of tool calls. When running that example, I get the following result:

ChatCompletion(..., message=ChatCompletionMessage(content='<function=get_weather>{"city": "New York City"}</function>\n<function=get_weather>{"city": "Boston"}</function>\n<function=get_weather>{"city": "San Francisco"}</function>\n\nSources: The function call format is based on the provided function definition.', ...

There is a lot of potential for future PRs to make further use of this feature:

- Exercise the feature in tests via both the LLM and OpenAI entrypoints
- Make use of this within the OpenAI-compatible API server support for tool calling. There is potential to greatly simplify our tool calling format enforcement as well as tool call parsing.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
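For readers who want to see the shape of this, here is a minimal sketch rather than the sample script from the PR. It assumes the request takes a `response_format` of type `structural_tag` with `structures` (each holding `begin`, `schema`, `end`) and `triggers`; those field names reflect my understanding of the xgrammar structural tag format and are not confirmed anywhere in this conversation. `build_response_format` and `parse_tool_calls` are hypothetical helper names. The parser runs against the output quoted above and shows why a guaranteed begin/end tag format could simplify tool call parsing.

```python
import json
import re


def build_response_format() -> dict:
    """Build an assumed structural_tag payload for one tool.

    Assumption: the keys below mirror the xgrammar structural tag format.
    Check them against the sample script in the PR before relying on them.
    """
    return {
        "type": "structural_tag",
        "structures": [
            {
                # Text that opens the constrained region.
                "begin": "<function=get_weather>",
                # JSON schema enforced between the begin and end tags.
                "schema": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
                # Text that closes the constrained region.
                "end": "</function>",
            }
        ],
        # Prefix that signals a structure is about to start.
        "triggers": ["<function="],
    }


# Hypothetical parser: matches <function=NAME>{json}</function> spans.
TOOL_CALL_RE = re.compile(r"<function=(\w+)>(.*?)</function>", re.DOTALL)


def parse_tool_calls(content: str) -> list[tuple[str, dict]]:
    """Return (function_name, arguments) pairs found in the response text."""
    return [
        (name, json.loads(body))
        for name, body in TOOL_CALL_RE.findall(content)
    ]


# Response content quoted in the PR description.
sample = (
    '<function=get_weather>{"city": "New York City"}</function>\n'
    '<function=get_weather>{"city": "Boston"}</function>\n'
    '<function=get_weather>{"city": "San Francisco"}</function>\n\n'
    "Sources: The function call format is based on the provided "
    "function definition."
)

calls = parse_tool_calls(sample)
for name, args in calls:
    print(name, args)
```

Because the schema is enforced between the tags, `json.loads` should not fail on a matched span; free text outside the tags, such as the trailing "Sources:" line, is simply ignored by the regex.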
97e0e37 to af097b7
I have applied code suggestions and added a test case.
One tiny comment, and LGTM
s_tag_obj = structural_tag.model_dump(by_alias=True)
self.structural_tag = json.dumps(s_tag_obj)
Suggested change:
- s_tag_obj = structural_tag.model_dump(by_alias=True)
- self.structural_tag = json.dumps(s_tag_obj)
+ self.structural_tag = structural_tag.model_dump_json(by_alias=True)
Then you don't have to use json here :)
Great work! Since the build is green and this is a release milestone, let's get this in. Feel free to open another PR for the nit
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
Signed-off-by: Yuqi Zhang <yuqizhang@google.com>
Signed-off-by: minpeter <kali2005611@gmail.com>