Skip to content

bhupendra05/shape

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

shape-ai ✦

Structured output from any LLM in one line.
Dict schema or Pydantic. Claude or OpenAI. Zero boilerplate.

Python PyPI License


from shapellm import extract

data = extract(
    "Alice is 30 years old and lives in Paris",
    {"name": str, "age": int, "city": str}
)

# → {"name": "Alice", "age": 30, "city": "Paris"}

That's it. No chains. No magic strings. No json.loads(response.choices[0].message.content).


Install

pip install shape-ai
export ANTHROPIC_API_KEY=sk-ant-...

Why shape-ai?

Every developer building with LLMs writes the same 20 lines:

  1. Prompt the model to "respond only in JSON"
  2. Strip markdown fences from the response
  3. json.loads() it
  4. Hope the model didn't hallucinate the schema
  5. Write a retry loop because it sometimes does

shape-ai replaces all of that with one function call that uses native tool/function-calling APIs to guarantee structured output — no regex, no retries, no parsing.

Usage

Plain dict schema

from shapellm import extract

# Basic types
result = extract(
    "Order #1234 shipped on Dec 5th, total was $49.99",
    {"order_id": str, "date": str, "amount": float}
)
# → {"order_id": "1234", "date": "Dec 5th", "amount": 49.99}

Pydantic models

from pydantic import BaseModel
from shapellm import extract

class JobPosting(BaseModel):
    title: str
    company: str
    salary_min: int
    salary_max: int
    remote: bool

job = extract(
    "Senior Engineer at Stripe, $180k–$220k, fully remote",
    JobPosting
)

print(job.title)      # Senior Engineer
print(job.company)    # Stripe
print(job.salary_min) # 180000
print(job.remote)     # True

Lists

result = extract(
    "The meeting has Alice, Bob, and Charlie attending",
    {"attendees": list[str]}
)
# → {"attendees": ["Alice", "Bob", "Charlie"]}

Nested objects

result = extract(
    "Ship to: John Doe, 42 Baker St, London SW1A 1AA",
    {
        "recipient": str,
        "address": {"street": str, "city": str, "postcode": str}
    }
)
# → {"recipient": "John Doe", "address": {"street": "42 Baker St", ...}}

With OpenAI

import openai
from shapellm import extract

client = openai.OpenAI()

result = extract(
    "Tesla Q3 revenue: $25.2B, net income $1.85B",
    {"revenue_billions": float, "net_income_billions": float},
    client=client,
    provider="openai",
    model="gpt-4o"
)

Bring your own client

import anthropic
from shapellm import extract

client = anthropic.Anthropic(api_key="...")

result = extract(
    "Fix the login bug, assigned to @alice, priority HIGH",
    {"description": str, "assignee": str, "priority": str},
    client=client,
    model="claude-opus-4-5"
)

Custom system prompt

result = extract(
    raw_medical_note,
    {"diagnosis": str, "medications": list[str], "follow_up_days": int},
    system="You are a medical record parser. Extract clinical information precisely."
)

CLI

# Basic extraction
shape extract "Alice is 30, lives in Paris" \
  --into '{"name": "str", "age": "int", "city": "str"}'

# JSON output (pipe-friendly)
shape extract "Order #1234, $49.99, shipped" \
  --into '{"order_id": "str", "amount": "float", "status": "str"}' \
  --output json

# With OpenAI
shape extract "text here" --into '{"x": "str"}' --provider openai

How it works

shape-ai uses native tool/function calling — not prompt engineering.

What you pass What shape does
{"name": str, "age": int} Converts to JSON Schema
class Person(BaseModel) Reads Pydantic's JSON Schema
JSON Schema Registers as a tool parameter schema
LLM response Reads the guaranteed-structured tool call result

The LLM is forced to return data matching your schema. No parsing. No retries.


Supported schema types

Input Example
str, int, float, bool {"name": str, "count": int}
list[X] {"tags": list[str]}
Nested dict {"address": {"city": str, "zip": str}}
Pydantic model class Invoice(BaseModel): ...
Raw JSON Schema {"type": "string", "enum": ["A","B"]}

Supported providers

Provider Default model Install
Anthropic (Claude) claude-opus-4-5 pip install shape-ai
OpenAI gpt-4o pip install shape-ai[openai]

More providers (Ollama, Gemini, Mistral) coming soon — PRs welcome.


Real-world examples

Parse job postings from a scraper:

jobs = [extract(post, JobPosting) for post in raw_job_posts]

Extract entities from support tickets:

ticket_data = extract(ticket_text, {
    "category": str,
    "severity": str,
    "affected_user_id": str,
    "steps_to_reproduce": list[str]
})

Structure unstructured medical notes:

note = extract(clinical_text, MedicalRecord, system="Clinical parser. Be precise.")

Clean messy CSV data:

clean_rows = [extract(row, AddressSchema) for row in messy_addresses]

Architecture

shape-ai/
├── shapellm/
│   ├── core.py          # extract() — the single public function
│   ├── schema.py        # dict / Pydantic → JSON Schema conversion
│   ├── cli.py           # shape extract CLI
│   └── providers/
│       ├── anthropic.py # Tool-use forced structured output
│       └── openai.py    # Function-calling forced structured output
└── tests/
    ├── test_schema.py   # Schema conversion (no API key needed)
    └── test_providers.py # Provider adapters with mocks

License

MIT © bhupendra05


Built because json.loads(response.strip("```json\n").strip("```")) is not a data pipeline.

About

Structured output from any LLM in one line — dict schema or Pydantic, Claude or OpenAI, zero boilerplate.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages