Your current environment
Using the vllm/vllm-openai:v0.6.6.post1 Docker image.
Model Input Dumps
No response
🐛 Describe the bug
The vLLM server fails to enforce the JSON schema `minItems` and `maxItems` constraints for array types during guided decoding.
```python
from openai import OpenAI
from pydantic import BaseModel, Field

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="-",
)

class Person(BaseModel):
    names: list[str] = Field(..., min_length=2, max_length=2)

# {'properties': {'names': {'items': {'type': 'string'}, 'maxItems': 2,
#   'minItems': 2, 'title': 'Names', 'type': 'array'}},
#  'required': ['names'], 'title': 'Person', 'type': 'object'}
print(Person.model_json_schema())

response = client.chat.completions.create(
    model='aya-23-35b',
    messages=[
        {
            'role': 'user',
            'content': 'Generate 4 names. Respond in json format.'
        }
    ],
    extra_body={"guided_json": Person.model_json_schema()}
)
print(response.choices[0].message.content)
```
- Only 2 names should be generated due to the JSON schema constraint, but 4 names are still generated.
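As a possible workaround (an assumption on my part, not verified against this exact version): recent vLLM releases accept a per-request `guided_decoding_backend` override in `extra_body`, which should force the request onto the outlines backend, which does enforce these keywords:

```python
# Sketch of a possible workaround (assumption: vLLM honors a per-request
# "guided_decoding_backend" override in extra_body). The schema below is the
# one produced by Person.model_json_schema() in the repro above.
schema = {
    "properties": {
        "names": {
            "items": {"type": "string"},
            "minItems": 2,
            "maxItems": 2,
            "title": "Names",
            "type": "array",
        }
    },
    "required": ["names"],
    "title": "Person",
    "type": "object",
}

extra_body = {
    "guided_json": schema,
    # Force the outlines backend instead of the default xgrammar.
    "guided_decoding_backend": "outlines",
}

# Pass extra_body to client.chat.completions.create(...) as in the repro above.
print(extra_body["guided_decoding_backend"])
```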
Fallback to outlines
The default grammar backend is now xgrammar, which currently still does not support the following keywords for `array` types:
```cpp
std::string JSONSchemaConverter::VisitArray(
    const picojson::object& schema, const std::string& rule_name
) {
  XGRAMMAR_CHECK(
      (schema.count("type") && schema.at("type").get<std::string>() == "array") ||
      schema.count("items") || schema.count("prefixItems") || schema.count("unevaluatedItems")
  );
  WarnUnsupportedKeywords(
      schema,
      {
          "uniqueItems",
          "contains",
          "minContains",
          "maxContains",
          "minItems",
          "maxItems",
      }
  );
  // ...
```
However, the vLLM fallback code does not check for these keywords, so the fallback to outlines does not happen.
See vllm/vllm/model_executor/guided_decoding/utils.py, lines 4 to 35 at commit 2339d59.
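A minimal sketch of what such a check could look like (a hypothetical helper written for this report, not vLLM's actual code): recursively scan the schema for the array keywords xgrammar only warns about, so the caller can fall back to outlines when any is present.

```python
# Hypothetical helper (not vLLM's actual code): detect array keywords that
# xgrammar silently ignores, so guided decoding can fall back to outlines.
UNSUPPORTED_ARRAY_KEYWORDS = {
    "uniqueItems", "contains", "minContains",
    "maxContains", "minItems", "maxItems",
}

def has_unsupported_array_keywords(schema) -> bool:
    """Return True if any array node in the schema uses an unsupported keyword."""
    if isinstance(schema, dict):
        if schema.get("type") == "array" and UNSUPPORTED_ARRAY_KEYWORDS & schema.keys():
            return True
        # Recurse into nested schemas (properties, items, $defs, ...).
        return any(has_unsupported_array_keywords(v) for v in schema.values())
    if isinstance(schema, list):
        return any(has_unsupported_array_keywords(v) for v in schema)
    return False

# The schema from the repro above: "names" is an array with minItems/maxItems.
person_schema = {
    "properties": {
        "names": {
            "items": {"type": "string"},
            "minItems": 2,
            "maxItems": 2,
            "type": "array",
        }
    },
    "required": ["names"],
    "type": "object",
}
print(has_unsupported_array_keywords(person_schema))  # True -> should fall back
```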
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.