Structured output from any LLM in one line.
Dict schema or Pydantic. Claude or OpenAI. Zero boilerplate.
from shapellm import extract
data = extract(
"Alice is 30 years old and lives in Paris",
{"name": str, "age": int, "city": str}
)
# → {"name": "Alice", "age": 30, "city": "Paris"}That's it. No chains. No magic strings. No json.loads(response.choices[0].message.content).
pip install shape-ai
export ANTHROPIC_API_KEY=sk-ant-...Every developer building with LLMs writes the same 20 lines:
- Prompt the model to "respond only in JSON"
- Strip markdown fences from the response
json.loads()it- Hope the model didn't hallucinate the schema
- Write a retry loop because it sometimes does
shape-ai replaces all of that with one function call that uses native tool/function-calling APIs to guarantee structured output — no regex, no retries, no parsing.
from shapellm import extract
# Basic types
result = extract(
"Order #1234 shipped on Dec 5th, total was $49.99",
{"order_id": str, "date": str, "amount": float}
)
# → {"order_id": "1234", "date": "Dec 5th", "amount": 49.99}from pydantic import BaseModel
from shapellm import extract
class JobPosting(BaseModel):
title: str
company: str
salary_min: int
salary_max: int
remote: bool
job = extract(
"Senior Engineer at Stripe, $180k–$220k, fully remote",
JobPosting
)
print(job.title) # Senior Engineer
print(job.company) # Stripe
print(job.salary_min) # 180000
print(job.remote) # Trueresult = extract(
"The meeting has Alice, Bob, and Charlie attending",
{"attendees": list[str]}
)
# → {"attendees": ["Alice", "Bob", "Charlie"]}result = extract(
"Ship to: John Doe, 42 Baker St, London SW1A 1AA",
{
"recipient": str,
"address": {"street": str, "city": str, "postcode": str}
}
)
# → {"recipient": "John Doe", "address": {"street": "42 Baker St", ...}}import openai
from shapellm import extract
client = openai.OpenAI()
result = extract(
"Tesla Q3 revenue: $25.2B, net income $1.85B",
{"revenue_billions": float, "net_income_billions": float},
client=client,
provider="openai",
model="gpt-4o"
)import anthropic
from shapellm import extract
client = anthropic.Anthropic(api_key="...")
result = extract(
"Fix the login bug, assigned to @alice, priority HIGH",
{"description": str, "assignee": str, "priority": str},
client=client,
model="claude-opus-4-5"
)result = extract(
raw_medical_note,
{"diagnosis": str, "medications": list[str], "follow_up_days": int},
system="You are a medical record parser. Extract clinical information precisely."
)# Basic extraction
shape extract "Alice is 30, lives in Paris" \
--into '{"name": "str", "age": "int", "city": "str"}'
# JSON output (pipe-friendly)
shape extract "Order #1234, $49.99, shipped" \
--into '{"order_id": "str", "amount": "float", "status": "str"}' \
--output json
# With OpenAI
shape extract "text here" --into '{"x": "str"}' --provider openaishape-ai uses native tool/function calling — not prompt engineering.
| What you pass | What shape does |
|---|---|
{"name": str, "age": int} |
Converts to JSON Schema |
class Person(BaseModel) |
Reads Pydantic's JSON Schema |
| JSON Schema | Registers as a tool parameter schema |
| LLM response | Reads the guaranteed-structured tool call result |
The LLM is forced to return data matching your schema. No parsing. No retries.
| Input | Example |
|---|---|
str, int, float, bool |
{"name": str, "count": int} |
list[X] |
{"tags": list[str]} |
| Nested dict | {"address": {"city": str, "zip": str}} |
| Pydantic model | class Invoice(BaseModel): ... |
| Raw JSON Schema | {"type": "string", "enum": ["A","B"]} |
| Provider | Default model | Install |
|---|---|---|
| Anthropic (Claude) | claude-opus-4-5 |
pip install shape-ai |
| OpenAI | gpt-4o |
pip install shape-ai[openai] |
More providers (Ollama, Gemini, Mistral) coming soon — PRs welcome.
Parse job postings from a scraper:
jobs = [extract(post, JobPosting) for post in raw_job_posts]Extract entities from support tickets:
ticket_data = extract(ticket_text, {
"category": str,
"severity": str,
"affected_user_id": str,
"steps_to_reproduce": list[str]
})Structure unstructured medical notes:
note = extract(clinical_text, MedicalRecord, system="Clinical parser. Be precise.")Clean messy CSV data:
clean_rows = [extract(row, AddressSchema) for row in messy_addresses]shape-ai/
├── shapellm/
│ ├── core.py # extract() — the single public function
│ ├── schema.py # dict / Pydantic → JSON Schema conversion
│ ├── cli.py # shape extract CLI
│ └── providers/
│ ├── anthropic.py # Tool-use forced structured output
│ └── openai.py # Function-calling forced structured output
└── tests/
├── test_schema.py # Schema conversion (no API key needed)
└── test_providers.py # Provider adapters with mocks
MIT © bhupendra05
Built because json.loads(response.strip("```json\n").strip("```")) is not a data pipeline.