Generalize the JSON schema transformations #1481

dmontagu · 2025-04-15T00:35:50Z

This moves the logic for modifying JSON schemas into a shared WalkJsonSchema class, making it easier to reuse across different models and to implement vendor-specific JSON schema modifications.
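To illustrate the general idea of a shared walker, here is a minimal sketch. It is not the WalkJsonSchema class from this PR: the names `SchemaWalker` and `DropTitles` are hypothetical, and it assumes every subschema is a plain dict (no boolean schemas, no `$ref` resolution).

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from copy import deepcopy
from typing import Any

JsonSchema = dict[str, Any]


class SchemaWalker(ABC):
    """Hypothetical base class: visit every subschema and apply `transform` to it."""

    def __init__(self, schema: JsonSchema) -> None:
        # Work on a deep copy so the caller's schema is never mutated.
        self.schema = deepcopy(schema)

    @abstractmethod
    def transform(self, schema: JsonSchema) -> JsonSchema:
        """Vendor-specific change applied to a single (sub)schema."""

    def walk(self) -> JsonSchema:
        return self._handle(self.schema)

    def _handle(self, schema: JsonSchema) -> JsonSchema:
        # Recurse into nested subschemas first, then transform this node (bottom-up).
        for key in ('properties', '$defs'):
            if key in schema:
                schema[key] = {k: self._handle(v) for k, v in schema[key].items()}
        if isinstance(schema.get('items'), dict):
            schema['items'] = self._handle(schema['items'])
        for key in ('anyOf', 'oneOf', 'prefixItems'):
            if key in schema:
                schema[key] = [self._handle(s) for s in schema[key]]
        return self.transform(schema)


class DropTitles(SchemaWalker):
    """Example tweak (an assumption for illustration): strip every `title` keyword."""

    def transform(self, schema: JsonSchema) -> JsonSchema:
        schema.pop('title', None)
        return schema


schema = {
    'title': 'Output',
    'type': 'object',
    'properties': {
        'x': {'title': 'X', 'anyOf': [{'title': 'A', 'type': 'string'}, {'type': 'null'}]},
    },
}
cleaned = DropTitles(schema).walk()
```

With this shape, a vendor-specific model would only override `transform`, while the traversal logic stays in one place.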

I suspect we may want to make it easier for users to customize and/or override this. In particular, we probably want a way for a user to hard-code the specific schema to use for a given tool (possibly even on a per-model basis, in the event of fallback or manually switching providers), e.g. to control how much duplication there is in the generated JSON schema when a given type is reused in a few places. But let's not worry about that for now.

This PR already makes a number of fixes to the handling of JSON schemas with OpenAI and Gemini. I used the following very useful script, tweaked to test different schemas and models. (Note: turning on verbose console logging has the extremely nice benefit of printing the tool schema to the console, so you can see the result of the transformation.)

from typing import Literal, Annotated

import logfire
from pydantic import BaseModel, Discriminator

from pydantic_ai import Agent

logfire.instrument_pydantic_ai()
logfire.configure(send_to_logfire=False, console=logfire.ConsoleOptions(verbose=True))


class MyA(BaseModel):
    kind: Literal['a']


class MyB(BaseModel):
    kind: Literal['b']


class Output(BaseModel):
    x: Annotated[MyA | MyB, Discriminator('kind')]


agent = Agent(
    'google-gla:gemini-2.5-pro-exp-03-25',
    output_type=Output,
    system_prompt='You are a helpful assistant.',
)

result = agent.run_sync(
    user_prompt='Use the final_result tool to generate a response. I just want you to generate some example data compatible with the schema.'
)
print(result.output)

@Kludex it might be nice to have some system for testing that the transformations we're using are actually required. I guess we could use VCR or something, but what I really want to test is that I get errors for certain schemas if I don't transform them, which is kind of annoying. I guess we could monkeypatch the transformation function to be a no-op ... 😩. I think we can skip it for now, but I wouldn't mind if you did want to add it 😄.
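A rough sketch of the no-op monkeypatch idea, using only `unittest.mock`. Everything here is a hypothetical stand-in: `Transformer`, `fake_provider_accepts`, and `send_tool_schema` are not pydantic_ai names, and the fake provider takes the place of what would really be a recorded (e.g. VCR) API response.

```python
from __future__ import annotations

from typing import Any
from unittest import mock


class Transformer:
    """Hypothetical stand-in for a vendor-specific schema transformer."""

    @staticmethod
    def transform(schema: dict[str, Any]) -> dict[str, Any]:
        # Strip a keyword that the fake provider below refuses.
        return {k: v for k, v in schema.items() if k != 'minLength'}


def fake_provider_accepts(schema: dict[str, Any]) -> bool:
    """Stand-in for the real API: rejects schemas that still contain `minLength`."""
    return 'minLength' not in schema


def send_tool_schema(schema: dict[str, Any]) -> bool:
    # Look up `transform` at call time so that patching takes effect.
    return fake_provider_accepts(Transformer.transform(schema))


schema = {'type': 'string', 'minLength': 3}

# With the transformation in place, the schema is accepted.
accepted_with_transform = send_tool_schema(schema)

# With the transformation patched to a no-op, the same schema is rejected,
# which is the evidence that the transformation is required.
with mock.patch.object(Transformer, 'transform', staticmethod(lambda s: s)):
    accepted_without_transform = send_tool_schema(schema)
```

The same pattern would work with pytest's `monkeypatch` fixture; the point is only that the test asserts a failure when the transformation is disabled.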

github-actions · 2025-04-15T00:42:21Z

Docs Preview

commit:	`fc581c6`
Preview URL:	https://15449ba3-pydantic-ai-previews.pydantic.workers.dev

Kludex · 2025-04-15T09:30:47Z

Yeah, I agree we need the RecordModel.

Kludex · 2025-04-15T11:24:58Z

pydantic_ai_slim/pydantic_ai/models/openai.py

+        min_length = schema.pop('minLength', None)
+        max_length = schema.pop('maxLength', None)
+        if description is not None:
+            notes = list[str]()


Can you do this in 3.9?

❯ UV_PROJECT_ENVIRONMENT=.venv39 uv run --python 3.9 --all-extras --all-packages python
warning: `VIRTUAL_ENV=.venv` does not match the project environment path `.venv39` and will be ignored; use `--active` to target the active environment instead
Python 3.9.20 (main, Sep  6 2024, 19:03:56)
[Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> list[str]()
[]

Yes, but I guess I should probably use a type hint for better perf. Not that it will matter in this code path.
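For reference, a minimal sketch of what the type-hinted variant could look like. The helper name `fold_length_constraints` and the exact note format are assumptions for illustration, not the code in this PR; it also assumes the intent is to pop both `minLength` and `maxLength` and fold them into the description.

```python
from __future__ import annotations

from typing import Any


def fold_length_constraints(schema: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: move string length limits into the description."""
    schema = dict(schema)  # shallow copy; do not mutate the caller's dict
    min_length = schema.pop('minLength', None)
    max_length = schema.pop('maxLength', None)
    description = schema.get('description')
    if description is not None:
        # Annotated empty list instead of `list[str]()`, which first builds
        # a generic alias and then calls it.
        notes: list[str] = []
        if min_length is not None:
            notes.append(f'min_length={min_length}')
        if max_length is not None:
            notes.append(f'max_length={max_length}')
        if notes:
            schema['description'] = f'{description} ({", ".join(notes)})'
    return schema


result = fold_length_constraints(
    {'type': 'string', 'description': 'A short code', 'minLength': 2, 'maxLength': 5}
)
```

Both spellings produce an empty list on Python 3.9+, so the choice is about readability and a tiny amount of overhead rather than behavior.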

Generalize the JSON schema transformations

ab178c8

dmontagu added 2 commits April 14, 2025 22:58

More JSON schema improvements

e7dbad7

Try fixing coverage

f11cf82

Kludex approved these changes Apr 15, 2025


Kludex assigned dmontagu Apr 15, 2025

dmontagu added 3 commits April 15, 2025 13:23

Update handling of prefixItems for gemini

7f3bd9d

Merge branch 'main' into dmontagu/generalize-json-schema-transformations

86ffec1

Fix coverage

fc581c6

dmontagu merged commit cc18937 into main Apr 15, 2025
17 checks passed

dmontagu deleted the dmontagu/generalize-json-schema-transformations branch April 15, 2025 19:40

DouweM mentioned this pull request Apr 16, 2025

OpenAI strict mode inferred incorrectly #1488

Closed

2 tasks
