Generalize the JSON schema transformations #1481
Yeah, I agree we need the
min_length = schema.pop('minLength', None)
max_length = schema.pop('maxLength', None)
if description is not None:
    notes = list[str]()
Can you do this in 3.9?
❯ UV_PROJECT_ENVIRONMENT=.venv39 uv run --python 3.9 --all-extras --all-packages python
warning: `VIRTUAL_ENV=.venv` does not match the project environment path `.venv39` and will be ignored; use `--active` to target the active environment instead
Python 3.9.20 (main, Sep 6 2024, 19:03:56)
[Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> list[str]()
[]
Yes, but I should probably use a type hint on the variable instead, for better perf. Not that it will matter in this code path.
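For reference, a minimal sketch of the two spellings being discussed. The function names are made up for illustration; both forms produce the same empty list on Python 3.9+, and the annotated form simply skips building and calling the `list[str]` alias at runtime.

```python
def with_generic_alias_call() -> list:
    # Subscripting `list` builds a generic alias object, which is then called.
    notes = list[str]()
    return notes


def with_annotation() -> list:
    # The annotation is only a hint for type checkers; at runtime this is a plain `[]`.
    notes: list[str] = []
    return notes


# Both spellings yield an ordinary, empty list.
assert with_generic_alias_call() == with_annotation() == []
assert type(with_generic_alias_call()) is list
```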
This moves the logic for modifying JSON schema into a shared WalkJsonSchema class, to make it easier to reuse across different models and to implement JSON schema modifications in vendor-specific models.
I suspect we may want to make it easier for users to customize and/or override this. In particular, we probably want a way for a user to hard-code the specific schema they want to use for a given tool (possibly even on a per-model basis, in the event of fallback or manually switching providers?), e.g. if they want to control how much duplication there is in the generated JSON schema when a given type is reused in a few places. But let's not worry about that for now.
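To make the idea of a shared walker concrete, here is a rough sketch of the pattern. It is not the class from this PR: `SchemaWalker`, `LengthToDescription`, and the note format are hypothetical names chosen for illustration, and only a few schema keywords are traversed.

```python
from abc import ABC, abstractmethod
from copy import deepcopy
from typing import Any


class SchemaWalker(ABC):
    """Hypothetical base class: visits each subschema and lets a subclass rewrite it."""

    def __init__(self, schema: dict[str, Any]) -> None:
        # Work on a copy so the caller's schema is never mutated.
        self.schema = deepcopy(schema)

    def walk(self) -> dict[str, Any]:
        return self._handle(self.schema)

    def _handle(self, schema: Any) -> Any:
        if not isinstance(schema, dict):
            return schema  # e.g. boolean schemas are passed through untouched
        # Recurse into children first, then transform the current node.
        for key in ('properties', '$defs'):
            if isinstance(schema.get(key), dict):
                schema[key] = {k: self._handle(v) for k, v in schema[key].items()}
        if 'items' in schema:
            schema['items'] = self._handle(schema['items'])
        for key in ('anyOf', 'oneOf', 'allOf'):
            if isinstance(schema.get(key), list):
                schema[key] = [self._handle(s) for s in schema[key]]
        return self.transform(schema)

    @abstractmethod
    def transform(self, schema: dict[str, Any]) -> dict[str, Any]:
        """Rewrite a single (sub)schema; implemented per vendor."""


class LengthToDescription(SchemaWalker):
    """Hypothetical vendor subclass: moves length constraints into the description."""

    def transform(self, schema: dict[str, Any]) -> dict[str, Any]:
        min_length = schema.pop('minLength', None)
        max_length = schema.pop('maxLength', None)
        notes: list[str] = []
        if min_length is not None:
            notes.append(f'min_length={min_length}')
        if max_length is not None:
            notes.append(f'max_length={max_length}')
        if notes:
            description = schema.get('description')
            suffix = ', '.join(notes)
            schema['description'] = f'{description} ({suffix})' if description else suffix
        return schema
```

With this shape, a vendor-specific model only has to supply `transform`, and would call something like `LengthToDescription(tool_schema).walk()` before sending the schema to the provider.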
This PR already makes a number of fixes to the handling of JSON schemas with OpenAI and Gemini. I used the following very useful script, tweaked to test different schemas and models. (Note: turning on verbose console logging has the extremely nice benefit of printing the tool schema to the console, so you can see the result of the transformation.)
@Kludex it might be nice to have some system for testing that the transformations we're using are actually required. I guess we could use VCR or something, but what I really want to test is that certain schemas produce errors if I don't transform them, which is kind of annoying to do. I guess we could monkeypatch the transformation function to be a no-op ... 😩. I think we can skip it for now, but I wouldn't mind if you did want to add it 😄.
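A rough sketch of what the no-op monkeypatch check could look like. Everything here is a stand-in: `Pipeline`, `fake_provider`, `SchemaRejected`, and the set of unsupported keywords are hypothetical, and a real test would call the actual provider (or a recorded cassette) instead of the fake one.

```python
from typing import Any
from unittest.mock import patch

# Assumption for illustration: keywords the (fake) provider refuses to accept.
UNSUPPORTED = {'minLength', 'maxLength'}


class SchemaRejected(Exception):
    """Stand-in for the error a real provider would return."""


class Pipeline:
    """Hypothetical stand-in for a model class that transforms, then submits, a schema."""

    @staticmethod
    def transform(schema: dict[str, Any]) -> dict[str, Any]:
        return {k: v for k, v in schema.items() if k not in UNSUPPORTED}

    @staticmethod
    def fake_provider(schema: dict[str, Any]) -> str:
        bad = UNSUPPORTED.intersection(schema)
        if bad:
            raise SchemaRejected(f'unsupported keywords: {sorted(bad)}')
        return 'ok'

    @classmethod
    def submit(cls, schema: dict[str, Any]) -> str:
        return cls.fake_provider(cls.transform(schema))


def transformation_is_required(schema: dict[str, Any]) -> bool:
    """True if the schema is accepted with the transformation and rejected without it."""
    assert Pipeline.submit(schema) == 'ok'
    # Replace the transformation with a no-op for the duration of the block.
    with patch.object(Pipeline, 'transform', staticmethod(lambda s: s)):
        try:
            Pipeline.submit(schema)
        except SchemaRejected:
            return True
    return False
```

A test would then assert `transformation_is_required(...)` for each schema the transformation is supposed to rescue, so a transformation that is no longer needed shows up as a failing assertion.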