
[Bug]: minItems and maxItems json schema constraint fails on xgrammar and did not fallback to outlines #12201

Closed
@Jason-CKY


Your current environment

using vllm/vllm-openai:v0.6.6.post1 docker image

Model Input Dumps

No response

🐛 Describe the bug

The vLLM server does not enforce the JSON schema keywords minItems and maxItems on array types during guided decoding.

from openai import OpenAI
from pydantic import BaseModel, Field

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="-",
)

class Person(BaseModel):
  names: list[str] = Field(..., min_length=2, max_length=2)

print(Person.model_json_schema())  # {'properties': {'names': {'items': {'type': 'string'}, 'maxItems': 2, 'minItems': 2, 'title': 'Names', 'type': 'array'}}, 'required': ['names'], 'title': 'Person', 'type': 'object'}

response = client.chat.completions.create(
  model='aya-23-35b',
  messages=[
    {
      'role': 'user',
      'content': 'Generate 4 names. Respond in json format.'
    }
  ],
  extra_body={"guided_json": Person.model_json_schema()}
)
print(response.choices[0].message.content)
  • The schema constrains the array to exactly 2 names, but 4 names are still generated.
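
To make the mismatch easy to confirm on the client side, the response can be checked against the array bounds in the schema. This is only a minimal sketch: `array_length_violations` is a hypothetical helper written for this report, not part of vLLM or the OpenAI client, and it inspects top-level array properties only.

```python
import json


def array_length_violations(instance: dict, schema: dict) -> list[str]:
    """Report top-level array properties that break minItems/maxItems.

    Hypothetical helper for illustration; nested schemas are not inspected.
    """
    problems = []
    for name, prop in schema.get("properties", {}).items():
        value = instance.get(name)
        # Only array-typed properties that are actually lists are checked.
        if prop.get("type") != "array" or not isinstance(value, list):
            continue
        if "minItems" in prop and len(value) < prop["minItems"]:
            problems.append(f"{name}: {len(value)} items < minItems {prop['minItems']}")
        if "maxItems" in prop and len(value) > prop["maxItems"]:
            problems.append(f"{name}: {len(value)} items > maxItems {prop['maxItems']}")
    return problems


# Same schema shape as the one printed by Person.model_json_schema() above.
schema = {
    "properties": {
        "names": {"items": {"type": "string"}, "maxItems": 2, "minItems": 2,
                  "title": "Names", "type": "array"}
    },
    "required": ["names"],
    "title": "Person",
    "type": "object",
}

# Illustrative payload with four names, standing in for the model output.
content = '{"names": ["Ann", "Bob", "Cid", "Dee"]}'
violations = array_length_violations(json.loads(content), schema)
print(violations)
```

In practice, `content` would be `response.choices[0].message.content` from the request above.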

Fallback to outlines

The default guided decoding backend is now xgrammar, which does not yet support the following array keywords:

https://github.com/mlc-ai/xgrammar/blob/c1b64920cad24f44f235778c1c00bb52d57da01a/cpp/json_schema_converter.cc#L975-L992

std::string JSONSchemaConverter::VisitArray(
    const picojson::object& schema, const std::string& rule_name
) {
  XGRAMMAR_CHECK(
      (schema.count("type") && schema.at("type").get<std::string>() == "array") ||
      schema.count("items") || schema.count("prefixItems") || schema.count("unevaluatedItems")
  );
  WarnUnsupportedKeywords(
      schema,
      {
          "uniqueItems",
          "contains",
          "minContains",
          "maxContains",
          "minItems",
          "maxItems",
      }
  );

The vLLM fallback check, however, does not look for these keywords, so the fallback to outlines never happens:

def has_xgrammar_unsupported_json_features(schema: dict) -> bool:
    """Check if JSON schema contains features unsupported by xgrammar."""

    def check_object(obj: dict) -> bool:
        if not isinstance(obj, dict):
            return False

        # Check for pattern restrictions
        if "pattern" in obj:
            return True

        # Check for numeric ranges
        if obj.get("type") in ("integer", "number") and any(
                key in obj for key in [
                    "minimum", "maximum", "exclusiveMinimum",
                    "exclusiveMaximum", "multipleOf"
                ]):
            return True

        # Recursively check all nested objects and arrays
        for value in obj.values():
            if isinstance(value, dict):
                if check_object(value):
                    return True
            elif isinstance(value, list):
                for item in value:
                    if isinstance(item, dict) and check_object(item):
                        return True

        return False

    return check_object(schema)
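
One possible way to close the gap is to add an array check alongside the existing pattern and numeric-range checks. The following is a sketch under assumptions, not the actual vLLM fix: the function name `has_unsupported_array_keywords` and the constant `UNSUPPORTED_ARRAY_KEYWORDS` are hypothetical, the keyword list is copied from the xgrammar `WarnUnsupportedKeywords` call quoted above, and only schemas with an explicit `"type": "array"` are matched.

```python
# Keywords that the quoted xgrammar VisitArray code warns about as unsupported.
UNSUPPORTED_ARRAY_KEYWORDS = (
    "uniqueItems",
    "contains",
    "minContains",
    "maxContains",
    "minItems",
    "maxItems",
)


def has_unsupported_array_keywords(schema: dict) -> bool:
    """Return True if any array schema uses a keyword listed above.

    Hypothetical helper; it mirrors the recursive walk of the vLLM check.
    """

    def check_object(obj) -> bool:
        if not isinstance(obj, dict):
            return False

        # Assumption: only schemas explicitly typed as "array" are considered.
        if obj.get("type") == "array" and any(
                key in obj for key in UNSUPPORTED_ARRAY_KEYWORDS):
            return True

        # Recursively check all nested objects and arrays.
        for value in obj.values():
            if isinstance(value, dict):
                if check_object(value):
                    return True
            elif isinstance(value, list):
                for item in value:
                    if isinstance(item, dict) and check_object(item):
                        return True

        return False

    return check_object(schema)


# The schema from the reproduction above triggers the check.
person_schema = {
    "properties": {
        "names": {"items": {"type": "string"}, "maxItems": 2, "minItems": 2,
                  "title": "Names", "type": "array"}
    },
    "required": ["names"],
    "title": "Person",
    "type": "object",
}
print(has_unsupported_array_keywords(person_schema))
```

With a check like this folded into the fallback decision, the request in this report would be routed to outlines rather than silently ignoring the array bounds.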

