[V1] Add structural_tag support using xgrammar #17085

Merged
merged 5 commits into from
Apr 26, 2025

Conversation

russellb
Member

This change introduces support for a new structured output format
introduced in xgrammar. It allows specifying a JSON schema for
structured output that must appear between a beginning and an end tag
within a response.

This PR is intended as a first step toward making use of this
functionality. There is more potential for future work here.

This PR includes a sample script using the OpenAI-compatible API to
demonstrate how this could be used to enforce the format of tool calls.
When running that example, I get the following result:

ChatCompletion(..., message=ChatCompletionMessage(content='<function=get_weather>{"city": "New York City"}</function>\n<function=get_weather>{"city": "Boston"}</function>\n<function=get_weather>{"city": "San Francisco"}</function>\n\nSources: The function call format is based on the provided function definition.', ...
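A request producing output like the above could carry a payload along the following lines. This is a hedged sketch only: the field names (`structures`, `begin`, `schema`, `end`, `triggers`) follow xgrammar's structural tag format and may differ from the final vLLM API surface; see the sample script in the PR for the authoritative usage.

```python
import json

# Hypothetical structural_tag payload sketch; field names follow xgrammar's
# structural tag format and are assumptions, not the final vLLM API.
structural_tag = {
    "type": "structural_tag",
    "structures": [
        {
            "begin": "<function=get_weather>",
            # JSON schema enforced between the begin and end tags
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
            "end": "</function>",
        }
    ],
    # Generation is unconstrained until a trigger string is emitted,
    # after which the matching structure's schema is enforced.
    "triggers": ["<function="],
}

payload = json.dumps(structural_tag)
```

The key idea is that constrained decoding only kicks in once a trigger string appears, so the rest of the response remains free-form text.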

There is a lot of potential for future PRs to make further use of this
feature:

  • Exercise the feature in tests via both the LLM and OpenAI entrypoints

  • Make use of this within the OpenAI-compatible API server support for
    tool calling. There is potential to greatly simplify our tool calling
    format enforcement as well as tool call parsing.
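The tool call parsing mentioned in the second bullet could look roughly like the following hypothetical sketch. The helper name and regex are illustrative only, not code from this PR; the tag format matches the example output above.

```python
import json
import re

# Hypothetical helper showing how structurally tagged output could be parsed
# into tool calls; illustrative only, not code from this PR.
TOOL_CALL_RE = re.compile(
    r"<function=(?P<name>[^>]+)>(?P<args>\{.*?\})</function>", re.DOTALL
)

def parse_tool_calls(text: str) -> list[tuple[str, dict]]:
    """Extract (function_name, arguments) pairs from tagged model output."""
    return [
        (m.group("name"), json.loads(m.group("args")))
        for m in TOOL_CALL_RE.finditer(text)
    ]

content = (
    '<function=get_weather>{"city": "New York City"}</function>\n'
    '<function=get_weather>{"city": "Boston"}</function>'
)
calls = parse_tool_calls(content)
```

Because the structural tag guarantees the begin/end tags and a schema-valid JSON body, parsing like this becomes reliable rather than best-effort.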

Signed-off-by: Russell Bryant <rbryant@redhat.com>

@russellb russellb requested a review from mgoin as a code owner April 24, 2025 01:47

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added documentation Improvements or additions to documentation frontend structured-output v1 labels Apr 24, 2025
@russellb russellb moved this to In review in Structured Output Apr 24, 2025
@russellb
Member Author

I'm looking at the pre-commit failures now ...

@russellb russellb added this to the v0.8.5 milestone Apr 24, 2025
@WoosukKwon
Collaborator

@aarnphm Would you like to take a look?

russellb added a commit to russellb/vllm that referenced this pull request Apr 24, 2025
Structured output works differently in V1 than in V0. Update this doc to
reflect V1 since that is now our default. The differences include:

- the backends available are different
- request-level backend choice is no longer supported
- `whitespace_pattern` is not supported
- `structural_tag` is new as of vllm-project#17085

Signed-off-by: Russell Bryant <rbryant@redhat.com>
Member

@mgoin mgoin left a comment


Pretty clean integration, nice! I would just like to see a test case added for this.

Collaborator

@aarnphm aarnphm left a comment


LGTM. We should add a simple test case in the entrypoint test

@vllm-project vllm-project deleted a comment from aarnphm Apr 25, 2025
@russellb
Member Author

There are some doc updates for this in a follow-up PR here: #17135

I will add a test and the other suggested code updates today. Thanks!

Member Author

@russellb russellb left a comment


I have applied code suggestions and added a test case.

@WoosukKwon WoosukKwon requested review from mgoin and aarnphm April 25, 2025 23:02
@WoosukKwon WoosukKwon requested a review from njhill April 25, 2025 23:03
Collaborator

@aarnphm aarnphm left a comment


One tiny comment, and LGTM

Comment on lines +507 to +508
s_tag_obj = structural_tag.model_dump(by_alias=True)
self.structural_tag = json.dumps(s_tag_obj)
Collaborator


Suggested change
s_tag_obj = structural_tag.model_dump(by_alias=True)
self.structural_tag = json.dumps(s_tag_obj)
self.structural_tag = structural_tag.model_dump_json(by_alias=True)

Then you don't have to use json here :)
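The suggestion above can be sketched as follows, assuming pydantic v2 (the `StructuralTag` model here is a hypothetical stand-in for the actual request type): a single `model_dump_json(by_alias=True)` call replaces the `model_dump` plus `json.dumps` pair, and both produce equivalent JSON.

```python
import json

from pydantic import BaseModel, Field

# Hypothetical stand-in for the actual request model; illustrates the
# equivalence of the two serialization paths under pydantic v2.
class StructuralTag(BaseModel):
    # "type" is a Python builtin, so the field uses an alias
    type_: str = Field(default="structural_tag", alias="type")

    model_config = {"populate_by_name": True}

tag = StructuralTag()

two_step = json.dumps(tag.model_dump(by_alias=True))   # original code
one_step = tag.model_dump_json(by_alias=True)          # suggested change

# Equivalent JSON; only insignificant whitespace may differ.
assert json.loads(two_step) == json.loads(one_step)
```

The one-step form also avoids a second dict materialization, since pydantic serializes straight to a JSON string.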

Member

@mgoin mgoin left a comment


Great work! Since the build is green and this is a release milestone, let's get this in. Feel free to open another PR for the nit

@mgoin mgoin enabled auto-merge (squash) April 26, 2025 12:21
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Apr 26, 2025
@mgoin mgoin merged commit f8acd01 into vllm-project:main Apr 26, 2025
62 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in Structured Output Apr 26, 2025
jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Apr 29, 2025
lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Apr 29, 2025
adobrzyn pushed a commit to HabanaAI/vllm-fork that referenced this pull request Apr 30, 2025
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
zzzyq pushed a commit to zzzyq/vllm that referenced this pull request May 24, 2025
Signed-off-by: Yuqi Zhang <yuqizhang@google.com>
minpeter pushed a commit to minpeter/vllm that referenced this pull request Jun 24, 2025
Signed-off-by: minpeter <kali2005611@gmail.com>
Labels
documentation Improvements or additions to documentation frontend ready ONLY add when PR is ready to merge/full CI is needed structured-output tool-calling v1
Projects
Status: Done
5 participants