[V1] Add `structural_tag` support using xgrammar #17085

russellb · 2025-04-24T01:47:24Z

This change adds support for a new structured output format
introduced in XGrammar. It allows specifying a JSON schema for
structured output that occurs between a begin tag and an end tag within
a response.
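
As a rough illustration of what such a request format could look like, here is a minimal sketch that builds a `response_format` payload for one tool. The helper name `structural_tag_format` is hypothetical, and the field names (`structures`, `triggers`, `begin`, `schema`, `end`) are assumptions about the structural tag format; check them against the sample script in this PR before relying on them.

```python
import json


def structural_tag_format(tool_name: str, schema: dict) -> dict:
    """Hypothetical helper: build a structural_tag response_format for one tool.

    Field names are assumed, not confirmed by this thread.
    """
    return {
        "type": "structural_tag",
        "structures": [
            {
                # Text that opens the constrained region.
                "begin": f"<function={tool_name}>",
                # JSON schema the text between the tags must satisfy.
                "schema": schema,
                # Text that closes the constrained region.
                "end": "</function>",
            }
        ],
        # Assumed: constrained decoding starts once a trigger prefix appears.
        "triggers": ["<function="],
    }


weather_schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}

response_format = structural_tag_format("get_weather", weather_schema)
print(json.dumps(response_format, indent=2))
```

Under these assumptions, the resulting dictionary would be passed as `response_format` in a chat completion request to the OpenAI-compatible server.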

This PR is intended as a first step toward making use of this
functionality. There is more potential for future work here.

This PR includes a sample script using the OpenAI-compatible API to
demonstrate how this could be used to enforce the format of tool calls.
When running that example, I get the following result:

ChatCompletion(..., message=ChatCompletionMessage(content='<function=get_weather>{"city": "New York City"}</function>\n<function=get_weather>{"city": "Boston"}</function>\n<function=get_weather>{"city": "San Francisco"}</function>\n\nSources: The function call format is based on the provided function definition.', ...

There is a lot of potential for future PRs to make further use of this
feature:

- Exercise the feature in tests via both the LLM and OpenAI entrypoints
- Make use of this within the OpenAI-compatible API server support for
  tool calling. There is potential to greatly simplify our tool calling
  format enforcement as well as tool call parsing.
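
To make the tool call parsing point concrete, here is a minimal sketch of how enforced output could be parsed. It assumes every call is wrapped as `<function=NAME>{json}</function>`; the names `TAG_RE` and `parse_tool_calls` are hypothetical and are not part of vLLM.

```python
import json
import re

# Matches one tagged call: the tool name, then the JSON body up to the end tag.
TAG_RE = re.compile(r"<function=(\w+)>(.*?)</function>", re.DOTALL)


def parse_tool_calls(content: str) -> list[tuple[str, dict]]:
    """Return (tool_name, arguments) pairs found in a model response."""
    return [(name, json.loads(body)) for name, body in TAG_RE.findall(content)]


content = (
    '<function=get_weather>{"city": "New York City"}</function>\n'
    '<function=get_weather>{"city": "Boston"}</function>\n\n'
    "Sources: The function call format is based on the provided function definition."
)

calls = parse_tool_calls(content)
for name, arguments in calls:
    print(name, arguments)
```

Because the body between the tags is constrained to the schema, `json.loads` should not fail on well-formed output. The non-greedy match would still break if an argument value itself contained the literal text `</function>`, so a production parser would need something more careful.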

Signed-off-by: Russell Bryant <rbryant@redhat.com>

github-actions · 2025-04-24T01:47:33Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

russellb · 2025-04-24T01:50:59Z

I'm looking at the pre-commit failures now ...

WoosukKwon · 2025-04-24T20:01:11Z

@aarnphm Would you like to take a look?

Structured output works differently in V1 than in V0. Update this doc to reflect V1, since that is now our default. The differences include:

- the backends available are different
- request-level backend choice is no longer supported
- `whitespace_pattern` is not supported
- `structural_tag` is new as of vllm-project#17085

Signed-off-by: Russell Bryant <rbryant@redhat.com>

mgoin

Pretty clean integration, nice! I would just like to see a test case put in for this

vllm/entrypoints/openai/protocol.py

vllm/model_executor/guided_decoding/guided_fields.py

aarnphm

LGTM. We should add a simple test case in the entrypoint test

russellb · 2025-04-25T14:01:23Z

There are some doc updates for this in a follow-up PR here: #17135

I will add a test and the other suggested code updates today. Thanks!

russellb

I have applied code suggestions and added a test case.

vllm/entrypoints/openai/protocol.py

vllm/model_executor/guided_decoding/guided_fields.py

aarnphm

One tiny comment, and LGTM

aarnphm · 2025-04-25T23:25:14Z

vllm/entrypoints/openai/protocol.py

+                s_tag_obj = structural_tag.model_dump(by_alias=True)
+                self.structural_tag = json.dumps(s_tag_obj)


Suggested change

-                s_tag_obj = structural_tag.model_dump(by_alias=True)
-                self.structural_tag = json.dumps(s_tag_obj)
+                self.structural_tag = structural_tag.model_dump_json(by_alias=True)

Then you don't have to use json here :)
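
A small sketch of why the one-line form is equivalent, assuming Pydantic v2. The `Structure` model below is a hypothetical stand-in with an aliased field, not the actual class from `protocol.py`.

```python
import json

from pydantic import BaseModel, Field


class Structure(BaseModel):
    """Hypothetical stand-in for a model with an aliased field."""

    begin: str
    # Stored as json_schema, serialized as "schema" when by_alias=True.
    json_schema: dict = Field(alias="schema")
    end: str


s = Structure(
    begin="<function=get_weather>",
    schema={"type": "object"},
    end="</function>",
)

# Two-step form from the PR: dump to a dict, then serialize with json.
two_step = json.dumps(s.model_dump(by_alias=True))

# One-step form from the suggestion: let Pydantic serialize directly.
one_step = s.model_dump_json(by_alias=True)

# Both strings decode to the same object, with "schema" as the key.
print(json.loads(two_step) == json.loads(one_step))
```

The two strings can differ in whitespace, so compare the decoded objects rather than the raw text.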

mgoin

Great work! Since the build is green and this is a release milestone, let's get this in. Feel free to open another PR for the nit

russellb requested a review from mgoin as a code owner April 24, 2025 01:47

mergify bot added documentation Improvements or additions to documentation frontend structured-output v1 labels Apr 24, 2025

github-project-automation bot added this to Structured Output Apr 24, 2025

russellb moved this to In review in Structured Output Apr 24, 2025

russellb assigned mgoin Apr 24, 2025

russellb force-pushed the structural_tag branch from ac9b4db to e0fbb5c Compare April 24, 2025 01:56

russellb added this to the v0.8.5 milestone Apr 24, 2025

russellb force-pushed the structural_tag branch from e0fbb5c to 97e0e37 Compare April 24, 2025 19:53

russellb mentioned this pull request Apr 24, 2025

[Docs] Update structured output doc for V1 #17135

Merged

mgoin reviewed Apr 24, 2025

aarnphm approved these changes Apr 25, 2025

vllm-project deleted a comment from aarnphm Apr 25, 2025

russellb added the tool-calling label Apr 25, 2025

russellb added 5 commits April 25, 2025 12:29

openai: define AnyResponseFormat to avoid duplication

86069e3

Signed-off-by: Russell Bryant <rbryant@redhat.com>

Expand assertion to include the type for StructuralTagResponseFormat

baa933c

Signed-off-by: Russell Bryant <rbryant@redhat.com>

Simplify code to count how many guided methods are set

5fe716d

Signed-off-by: Russell Bryant <rbryant@redhat.com>

Add test case for structural_tag with xgrammar

af097b7

Signed-off-by: Russell Bryant <rbryant@redhat.com>

russellb force-pushed the structural_tag branch from 97e0e37 to af097b7 Compare April 25, 2025 18:24

russellb commented Apr 25, 2025

WoosukKwon requested review from mgoin and aarnphm April 25, 2025 23:02

WoosukKwon requested a review from njhill April 25, 2025 23:03

aarnphm approved these changes Apr 25, 2025

mgoin approved these changes Apr 26, 2025

mgoin enabled auto-merge (squash) April 26, 2025 12:21

github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Apr 26, 2025

mgoin merged commit f8acd01 into vllm-project:main Apr 26, 2025
62 checks passed

github-project-automation bot moved this from In review to Done in Structured Output Apr 26, 2025

shen-shanshan mentioned this pull request Apr 27, 2025

[Feature]: Add Support for Guided Decoding (Structured Output) vllm-project/vllm-ascend#177

Closed

17 tasks

mmoskal mentioned this pull request Apr 28, 2025

implement Structural Tag with Guidance backend #17333

Merged

jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Apr 29, 2025

[V1] Add structural_tag support using xgrammar (vllm-project#17085)

b9f957e

lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Apr 29, 2025

[V1] Add structural_tag support using xgrammar (vllm-project#17085)

16c692d

adobrzyn pushed a commit to HabanaAI/vllm-fork that referenced this pull request Apr 30, 2025

[V1] Add structural_tag support using xgrammar (vllm-project#17085)

c17f8a1

Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>

RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025

[V1] Add structural_tag support using xgrammar (vllm-project#17085)

75b8bd0

Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>

ckhordiasma mentioned this pull request May 14, 2025

nm vllm ent 0.8.5 sync red-hat-data-services/vllm#139

Merged

zzzyq pushed a commit to zzzyq/vllm that referenced this pull request May 24, 2025

[V1] Add structural_tag support using xgrammar (vllm-project#17085)

ff20ecd

Signed-off-by: Yuqi Zhang <yuqizhang@google.com>

minpeter pushed a commit to minpeter/vllm that referenced this pull request Jun 24, 2025

[V1] Add structural_tag support using xgrammar (vllm-project#17085)

514d7c9

Signed-off-by: minpeter <kali2005611@gmail.com>
