[Bug]: xgrammar==0.17 does not work with guided decoding #15790

Open
@jacksonjack001

Your current environment

import json
from typing import Annotated

import numpy as np
import tiktoken
from openai import OpenAI
from pydantic import BaseModel, Field


def count_tokens(text, model="gpt-3.5-turbo"):
    """Count the number of tokens in the text."""
    encoder = tiktoken.encoding_for_model(model)
    tokens = encoder.encode(text)
    return len(tokens)


# Usage example
text = "你好,这是一个示例文本。"  # "Hello, this is a sample text."
token_count = count_tokens(text)
print(f"Token数量: {token_count}")  # "Token count"


class Info(BaseModel):
    name: Annotated[str, Field(max_length=10)]
    age: int


json_schema = Info.model_json_schema()
print(json_schema)
print(count_tokens(json.dumps(json_schema)))

client = OpenAI(
    api_key="sk-aa27cd8dfad346a6b576e51e68aa7283",
    base_url="http://127.0.0.1:5000/v1",
)

char_ls = ["东", "男", "西", "被"]

for i in range(10):
    # Pick a random character and build a long prompt that repeats it
    ii = np.random.randint(len(char_ls))
    char_zifu = char_ls[ii]
    print(char_zifu * 10)
    ques = char_zifu * 10000

    completion = client.chat.completions.create(
        model="Qwen2.5-VL-3B-Instruct",
        messages=[
            {
                "role": "system",
                "content": "You must respond with JSON containing name and age fields.",
            },
            {"role": "user", "content": ques[:2000]},
        ],
        extra_body={
            "guided_json": json_schema,
            "guided_decoding_backend": "xgrammar",  # try a different backend, e.g. lm-format-enforcer
            # "use_cache": False,
        },
        # temperature=0.01,
        # top_p=0.9,
    )
    content = completion.choices[0].message.content
    print(content)

# Manually parse the JSON response (the last one returned by the loop)
response_json = json.loads(content)
print(f"Name: {response_json['name']}")
print(f"Age: {response_json['age']}")

🐛 Describe the bug

It should work, but this is the output. Some of the responses are cut off before the JSON is complete:

root@ai-test:/data_ext/qwen# /usr/bin/python3 /data_ext/qwen/vllm_ocr/json_api_backend.py
Token数量: 11
{'properties': {'name': {'maxLength': 10, 'title': 'Name', 'type': 'string'}, 'age': {'title': 'Age', 'type': 'integer'}}, 'required': ['name', 'age'], 'title': 'Info', 'type': 'object'}
61
东东东东东东东东东东
{"action": "answer", "name": "东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东东
男男男男男男男男男男
{"name": "男", "age": 18}
西西西西西西西西西西
{"
西西西西西西西西西西
{"
男男男男男男男男男男
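For anyone triaging this, here is a minimal sketch of how each response could be classified instead of being checked by eye. It is not part of the original report: check_response is a hypothetical helper that uses only the standard library, and it hard-codes the constraints of the Info schema printed above (name is a string of at most 10 characters, age is an integer). Extra keys are accepted on purpose, because the generated schema does not set additionalProperties to false.

```python
import json


# Hypothetical helper, not from the original report: classify one response
# against the Info schema (name: string of at most 10 characters, age: integer).
def check_response(content, max_name_length=10):
    try:
        data = json.loads(content)
    except json.JSONDecodeError:
        return "invalid JSON (possibly truncated)"
    if not isinstance(data, dict):
        return "not a JSON object"
    missing = [key for key in ("name", "age") if key not in data]
    if missing:
        return f"missing required fields: {missing}"
    name = data["name"]
    if not isinstance(name, str) or len(name) > max_name_length:
        return f"name is not a string of at most {max_name_length} characters"
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(data["age"], bool) or not isinstance(data["age"], int):
        return "age is not an integer"
    return "ok"


# Sample strings copied from the log above
for sample in ['{"name": "男", "age": 18}', '{"']:
    print(repr(sample), "->", check_response(sample))
```

Calling check_response(content) inside the request loop, and counting the results that are not "ok", would give a failure rate per backend that is easy to compare between xgrammar and lm-format-enforcer.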

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
