Confirm this is a feature request for the Python library and not the underlying OpenAI API.
- This is a feature request for the Python library
Describe the feature or improvement you're requesting
I noticed that you guys are doing some manipulation of Pydantic's generated schema to ensure compatibility with the API's schema validation. I found a few more instances that can be addressed:
Issues:
- optional fields with pydantic defaults generate a 'default' keyword in the schema, which is not supported
- datetime fields generate a format='date-time' keyword in the schema, which is not supported
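To see where these keywords end up, a small walker can list their locations in a schema dict. This is only a sketch: `find_keyword` is a hypothetical helper (not part of the openai library), and `sample_schema` is hand-written to approximate what Pydantic emits for an optional datetime field with a default of None.

```python
from typing import Any, Iterator

# Under these keys, the dict keys are field/definition names, not schema keywords.
NAME_MAPS = ("properties", "$defs")


def find_keyword(node: Any, keyword: str, path: str = "#") -> Iterator[str]:
    """Yield the path of every occurrence of `keyword` as a schema keyword."""
    if isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_keyword(item, keyword, f"{path}/{i}")
    elif isinstance(node, dict):
        if keyword in node:
            yield f"{path}/{keyword}"
        for key, value in node.items():
            if key == keyword:
                continue  # already reported; do not descend into its value
            if key in NAME_MAPS and isinstance(value, dict):
                # Skip the name level so a field called "default" is not reported.
                for name, sub in value.items():
                    yield from find_keyword(sub, keyword, f"{path}/{key}/{name}")
            else:
                yield from find_keyword(value, keyword, f"{path}/{key}")


# Hand-written approximation, not actual library output.
sample_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "published": {
            "anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}],
            "default": None,
        },
    },
}

default_paths = list(find_keyword(sample_schema, "default"))
format_paths = list(find_keyword(sample_schema, "format"))
print(default_paths, format_paths)
```

Running the same walker over the real generated schema should show whether any 'default' or 'format' keywords are still present before the request is sent.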
The test case below builds on your to_strict_json_schema function and addresses these problematic fields with the remove_property_from_schema function:
import json
import logging
from datetime import datetime
from typing import List, Optional

from openai import OpenAI
# Internal helper of the openai library; this private import path may change between releases.
from openai.lib._pydantic import to_strict_json_schema
from pydantic import BaseModel, Field

logger = logging.getLogger(__name__)


class Publisher(BaseModel):
    name: str = Field(description="The name of the publisher")
    url: Optional[str] = Field(None, description="The URL of the publisher's website")

    class Config:
        json_schema_extra = {
            "additionalProperties": False
        }


class Article(BaseModel):
    title: str = Field(description="The title of the news article")
    published: Optional[datetime] = Field(None, description="The date the article was published. Use ISO 8601 to format this value.")
    publisher: Optional[Publisher] = Field(None, description="The publisher of the article")

    class Config:
        json_schema_extra = {
            "additionalProperties": False
        }


class NewsArticles(BaseModel):
    query: str = Field(description="The query used to search for news articles")
    articles: List[Article] = Field(description="The list of news articles returned by the query")

    class Config:
        json_schema_extra = {
            "additionalProperties": False
        }
def test_schema_compatible():
    client = OpenAI()

    # build on the internals that the openai client uses to clean up the pydantic schema for the openai API
    schema = to_strict_json_schema(NewsArticles)

    # optional fields with pydantic defaults generate an unsupported 'default' field in the schema
    remove_property_from_schema(schema, "default")

    # datetime fields generate a format='date-time' field in the schema which is not supported
    remove_property_from_schema(schema, "format")

    logger.info("Generated Schema: %s", json.dumps(schema, indent=2))

    completion = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        temperature=0,
        messages=[
            {
                "role": "user",
                "content": "What were the top headlines in the US for January 6th, 2021?",
            }
        ],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "schema": schema,
                "name": "NewsArticles",
                "strict": True,
            }
        }
    )

    result = NewsArticles.model_validate_json(completion.choices[0].message.content)
    assert result is not None
def remove_property_from_schema(schema: dict, property_name: str):
    """Remove the keyword `property_name` from every field schema, in place."""
    if 'properties' in schema:
        for field in schema['properties'].values():
            # nested object: clean its own fields first
            if 'properties' in field:
                remove_property_from_schema(field, property_name)
            # Optional[...] fields are emitted as anyOf: [<type>, null]
            if 'anyOf' in field:
                for any_of in field['anyOf']:
                    any_of.pop(property_name, None)
            field.pop(property_name, None)
    # referenced models live under $defs
    if '$defs' in schema:
        for definition in schema['$defs'].values():
            remove_property_from_schema(definition, property_name)
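As far as I can tell, the helper above only looks at `properties`, the direct members of `anyOf`, and `$defs`, so a keyword sitting under an array's `items` (for example a list of datetimes) would survive. A more general walker could look like the sketch below. `strip_keyword` is a hypothetical name, and the sample schema is hand-written for illustration rather than real library output.

```python
from typing import Any

# Under these keys, the dict keys are field/definition names, not schema keywords.
NAME_MAPS = ("properties", "$defs")


def strip_keyword(node: Any, keyword: str) -> Any:
    """Recursively remove `keyword` from every subschema in `node`, in place."""
    if isinstance(node, list):
        for item in node:
            strip_keyword(item, keyword)
    elif isinstance(node, dict):
        node.pop(keyword, None)
        for key, value in node.items():
            if key in NAME_MAPS and isinstance(value, dict):
                # Recurse into the values only, so a field that happens
                # to be named "default" or "format" is kept.
                for sub in value.values():
                    strip_keyword(sub, keyword)
            else:
                strip_keyword(value, keyword)
    return node


# Hand-written sample schema, assumed for illustration.
schema = {
    "type": "object",
    "properties": {
        "default": {"type": "string"},  # a field named "default" must survive
        "dates": {
            "type": "array",
            "items": {"type": "string", "format": "date-time"},
        },
        "published": {
            "anyOf": [{"type": "string", "format": "date-time"}, {"type": "null"}],
            "default": None,
        },
    },
}

strip_keyword(schema, "default")
strip_keyword(schema, "format")
```

The one subtlety is the name maps: popping the keyword from the `properties` dict itself would delete a field that shares its name, which is why the walker skips that level.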
Additional context
No response