Initial Checks
- I confirm that I'm using the latest version of Pydantic AI
- I confirm that I searched for my issue in https://github.com/pydantic/pydantic-ai/issues before opening this issue
Description
In v0.0.53 of Pydantic AI, creating a Gemini agent with structured output and the instrument=True argument generates the wrong JSON schema: the schema sent to the Gemini API still contains $defs and $ref, which the API rejects with the following error:
pydantic_ai.exceptions.ModelHTTPError: status_code: 400, model_name: gemini-2.0-flash, body: {
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"$defs\" at 'tools.function_declarations[0].parameters': Cannot find field.\nInvalid JSON payload received. Unknown name \"$ref\" at 'tools.function_declarations[0].parameters.properties[0].value.items': Cannot find field.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "tools.function_declarations[0].parameters",
            "description": "Invalid JSON payload received. Unknown name \"$defs\" at 'tools.function_declarations[0].parameters': Cannot find field."
          },
          {
            "field": "tools.function_declarations[0].parameters.properties[0].value.items",
            "description": "Invalid JSON payload received. Unknown name \"$ref\" at 'tools.function_declarations[0].parameters.properties[0].value.items': Cannot find field."
          }
        ]
      }
    ]
  }
}
The error does not occur in v0.0.52 or earlier.
To see the difference, change the pydantic-ai-slim pin in the example code to v0.0.52 and run the script with uv run script.py
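For context, the error message suggests the Gemini API expects a tool schema with no $defs block and no $ref pointers. The sketch below shows roughly what such a schema looks like before and after the references are inlined. It is only an illustration: SCHEMA_WITH_REFS is a hand-written stand-in for a nested model's schema, and inline_refs and contains_key are hypothetical helpers, not part of Pydantic AI and not a description of how the library handles this internally.

```python
import copy
import json
from typing import Any

# Hand-written stand-in for the schema of a model that nests another model.
# Illustrative only; it is not the exact payload Pydantic AI sends.
SCHEMA_WITH_REFS: dict[str, Any] = {
    "type": "object",
    "properties": {
        "Entries": {
            "type": "array",
            "items": {"$ref": "#/$defs/PowerSaleEntry"},
        }
    },
    "required": ["Entries"],
    "$defs": {
        "PowerSaleEntry": {
            "type": "object",
            "properties": {
                "Source": {"type": "string"},
                "kWh purchased": {"type": "number"},
            },
            "required": ["Source", "kWh purchased"],
        }
    },
}


def inline_refs(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `schema` with local `#/$defs/...` refs inlined.

    Hypothetical helper. Keys that sit next to a `$ref` are dropped, every
    key named `$defs` is removed, and recursive definitions are not supported.
    """
    defs = schema.get("$defs", {})

    def resolve(node: Any) -> Any:
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith("#/$defs/"):
                name = ref.removeprefix("#/$defs/")
                # Copy the definition so the input schema is never mutated.
                return resolve(copy.deepcopy(defs[name]))
            return {k: resolve(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [resolve(item) for item in node]
        return node

    return resolve(schema)


def contains_key(node: Any, key: str) -> bool:
    """Report whether `key` appears anywhere in a nested dict/list structure."""
    if isinstance(node, dict):
        return key in node or any(contains_key(v, key) for v in node.values())
    if isinstance(node, list):
        return any(contains_key(item, key) for item in node)
    return False


flattened = inline_refs(SCHEMA_WITH_REFS)
print(json.dumps(flattened, indent=2))
```

Running contains_key over a schema for "$defs" or "$ref" is a quick way to check whether a payload would trigger the field violations shown above.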
Example Code
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "google-genai",
# "python-dotenv",
# "pydantic-ai-slim[vertexai]==v0.0.53",
# "pydantic",
# ]
# ///
from pydantic_ai import Agent, RunContext
from pydantic_ai.models.gemini import GeminiModel
from pydantic import BaseModel, Field
from typing import Optional
import dotenv
dotenv.load_dotenv()

model = GeminiModel(
    "gemini-2.0-flash",
    provider="google-vertex",
    # GoogleVertexProvider(service_account_file=os.getenv("GOOGLE_APPLICATION_CREDENTIALS")),
)

class PowerSaleEntry(BaseModel):
    source_document: str = Field(alias="Source document", description="URL of source document that contains this data")
    year_month: str = Field(alias="Year-Month", description="Month and year of the data in yyyy-mm format")
    source: str = Field(alias="Source", description="Source of the electricity bought")
    kwh_purchased: float = Field(alias="kWh purchased")
    basic_generation_cost: Optional[float] = Field(alias="Basic generation cost (PHP)")
    total_generation_cost: float = Field(alias="Total generation cost for the month (PHP)")
    price_per_kwh: Optional[float] = Field(
        alias="Price per kWh",
        description="Price of electricity bought, or alternatively, average generation cost (PHP/kWh)",
        default=None,
    )

class PowerSaleEntries(BaseModel):
    entries: list[PowerSaleEntry] = Field(alias="Entries", description="List of electricity sales entries")

class Deps(BaseModel):
    text: str = Field(description="Raw text containing electricity sales data")

extraction_agent = Agent(
    model,
    result_type=PowerSaleEntries,
    system_prompt="You are an expert in electricity data.",
    deps_type=Deps,
    instrument=True,
)

@extraction_agent.system_prompt
async def system_prompt(context: RunContext[Deps]) -> str:
    return f"""
    You are an expert in electricity data. Your task is to extract the electricity sales data from the text.
    The text contains multiple entries, each with the following fields:
    - Source document
    - Year-Month
    - Source
    - kWh purchased
    - Basic generation cost (PHP)
    - Total generation cost for the month (PHP)
    Please extract these fields and return them in a structured format.
    This is the text you need to process:
    {context.deps.text}
    """

async def extract_power_sales(text: str):
    # Returns the agent run result; the parsed PowerSaleEntries is on its .data attribute.
    context = "Please extract the electricity sales data from the text."
    result = await extraction_agent.run(context, deps=Deps(text=text))
    return result

if __name__ == "__main__":
    import asyncio

    sample_text = """
    Source document: https://example.com/doc1
    Year-Month: 2023-01
    Source: Coal Plant A
    kWh purchased: 1000000
    Basic generation cost (PHP): 500000
    Total generation cost for the month (PHP): 550000
    Source document: https://example.com/doc2
    Year-Month: 2023-01
    Source: Wind Farm B
    kWh purchased: 200000
    Basic generation cost (PHP): 100000
    Total generation cost for the month (PHP): 120000
    Source document: https://example.com/doc3
    Year-Month: 2023-01
    Source: Solar Plant C
    kWh purchased: 300000
    Basic generation cost (PHP): 150000
    Total generation cost for the month (PHP): 180000
    """
    result = asyncio.run(extract_power_sales(sample_text))
    print(result.data)
Python, Pydantic AI & LLM client version
python 3.11
pydantic_ai 0.0.53
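Because the behaviour differs between v0.0.52 and v0.0.53, it may help to confirm which versions actually got installed in the environment that runs the script. A minimal sketch using only the standard library follows; installed_version is a hypothetical helper name, and the distribution names are taken from the script's dependency list.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of `distribution`, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution names as listed in the script's dependencies.
for name in ("pydantic-ai-slim", "pydantic"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```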