# Patterns - Build data systems from reusable SQL and Python components
The Patterns Devkit is a CLI and lightweight SDK for building, versioning, and deploying data graphs made of reusable SQL and Python nodes. It helps you:

- Scaffold apps (graphs) and nodes quickly
- Define connections between nodes and storage tables in `graph.yml`
- Manage secrets and configuration
- Upload, list, and trigger runs in the Patterns platform

Documentation: https://www.patterns.app/docs/devkit
## Features

- Create graphs and nodes (Python, SQL, subgraphs) from the CLI
- Describe graph topology declaratively in `graph.yml`
- Write nodes using `patterns.Table`, `patterns.Parameter`, and `patterns.State`
- Manage secrets, auth, and uploads to the Patterns platform
- Trigger and inspect graphs remotely
## Install

```shell
pip install patterns-devkit
```
## Quickstart

- Create an app (graph):

```shell
patterns create app my-leads-app
cd my-leads-app
```

This creates:

```
my-leads-app/
  graph.yml
```
- Add two Python nodes:

```shell
patterns create node --title "Ingest Leads" ingest_leads.py
patterns create node --title "Score Leads" score_leads.py
```

This adds:

```
my-leads-app/
  graph.yml
  ingest_leads.py
  score_leads.py
```
- Wire the graph in `graph.yml`:

Open `graph.yml` and connect node inputs/outputs to tables:

```yaml
title: Leads Scoring
stores:
  - table: raw_leads
  - table: scored_leads
functions:
  - node_file: ingest_leads.py
    title: Ingest Leads
    trigger: manual
    outputs:
      leads: raw_leads
  - node_file: score_leads.py
    title: Score Leads
    inputs:
      leads: raw_leads
    outputs:
      scored: scored_leads
```

- Implement the nodes:
`ingest_leads.py` (writes raw leads):
```python
from patterns import Table, Parameter


def run():
    # Optionally parameterize where to ingest from
    source = Parameter(
        "leads_source",
        description="Lead source label",
        type=str,
        default="marketing_form",
    )
    raw_leads = Table("raw_leads", mode="w", description="Raw inbound leads")
    # Provide schema and helpful ordering for downstream streaming if desired
    raw_leads.init(
        schema={"id": "Text", "email": "Text", "source": "Text", "created_at": "Datetime"},
        unique_on="id",
        add_created="created_at",
    )
    # Replace this with real ingestion (API/CSV/etc.)
    sample = [
        {"id": "L-001", "email": "user1@example.com", "source": source},
        {"id": "L-002", "email": "user2@corp.com", "source": source},
        {"id": "L-003", "email": "ceo@enterprise.com", "source": source},
    ]
    raw_leads.upsert(sample)
```

`score_leads.py` (reads raw leads, writes scored leads):
```python
from patterns import Table


def lead_score(email: str) -> float:
    # Simple heuristic: enterprise domains score higher
    domain = email.split("@")[-1].lower()
    if domain.endswith("enterprise.com"):
        return 0.95
    if domain.endswith("corp.com"):
        return 0.8
    return 0.4


def run():
    raw = Table("raw_leads")  # read mode by default
    scored = Table("scored_leads", "w")  # write mode
    scored.init(
        schema={"id": "Text", "email": "Text", "score": "Float", "created_at": "Datetime"},
        unique_on="id",
        add_created="created_at",
    )
    rows = raw.read()  # list[dict], or a dataframe if configured
    for r in rows:
        r["score"] = lead_score(r["email"])
    scored.upsert(rows)
```
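Because `lead_score` is plain Python with no platform imports, the heuristic can be sanity-checked locally before uploading. A minimal sketch that duplicates the function for a standalone check:

```python
def lead_score(email: str) -> float:
    """Same domain heuristic as score_leads.py, duplicated for a local check."""
    domain = email.split("@")[-1].lower()
    if domain.endswith("enterprise.com"):
        return 0.95
    if domain.endswith("corp.com"):
        return 0.8
    return 0.4


# Spot-check the tiers, including case-insensitivity of the domain part
print(lead_score("ceo@enterprise.com"))   # 0.95
print(lead_score("vp@CORP.com"))          # 0.8
print(lead_score("user1@example.com"))    # 0.4
```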
- Visualize the example graph topology:

```mermaid
flowchart TD
    A["Ingest Leads (Python)"] -->|raw_leads| B["Score Leads (Python)"]
    B -->|scored_leads| C[(scored_leads)]
```
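The wiring can also be sanity-checked programmatically: every table a function reads should be written by some function's output. A minimal sketch using a plain-dict mirror of the example `graph.yml` (in practice you would load the file with a YAML parser):

```python
# Plain-dict mirror of the example graph.yml above.
graph = {
    "functions": [
        {"node_file": "ingest_leads.py", "outputs": {"leads": "raw_leads"}},
        {
            "node_file": "score_leads.py",
            "inputs": {"leads": "raw_leads"},
            "outputs": {"scored": "scored_leads"},
        },
    ],
}


def unresolved_inputs(graph: dict) -> list[str]:
    """Return input tables that no function in the graph writes."""
    written = {
        table
        for fn in graph["functions"]
        for table in fn.get("outputs", {}).values()
    }
    return [
        table
        for fn in graph["functions"]
        for table in fn.get("inputs", {}).values()
        if table not in written
    ]


print(unresolved_inputs(graph))  # [] -> every input is produced upstream
```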
- Authenticate and upload:

Sign up or sign in at https://studio.patterns.app, then authenticate the CLI:

```shell
patterns login
```

Upload your graph:

```shell
patterns upload
```

- Trigger runs:

```shell
# Trigger any node by title or id (see the list commands below to find ids)
patterns trigger node "Ingest Leads"
patterns trigger node "Score Leads"
```

## CLI reference

- `patterns create app <dir>`: scaffold a new app directory with a `graph.yml`
- `patterns create node <file.py|file.sql|graph.yml>`: add a function node (Python/SQL/subgraph)
- `patterns create node --type table <table_name>`: add a table store
- `patterns create secret <name> <value>`: create an organization secret
- `patterns upload`: upload the current app to the platform
- `patterns list apps|nodes|webhooks|versions`: list resources
- `patterns trigger node <title|id>`: manually trigger a node
- `patterns download <app>`: download app contents from the platform
- `patterns update`: update local metadata from the remote
- `patterns delete <resource>`: delete remote resources
- `patterns config --json`: print the CLI configuration
- `patterns login` / `patterns logout`: authenticate the CLI
See the full help:

```shell
patterns --help
```
## Node SDK

Nodes use a small SDK provided by the platform at runtime:

- `Table(name, mode="r"|"w")`: read/write table abstraction. Common methods:
  - `init(schema=..., unique_on=..., add_created=..., add_monotonic_id=...)`
  - `read(as_format="records"|"dataframe", chunksize=...)`
  - `read_sql(sql, ...)`
  - `append(records)`, `upsert(records)`, `replace(records)`, `truncate()`, `flush()`
- `Parameter(name, description=None, type=str|int|float|bool|datetime|date|list, default=MISSING)`: declare runtime parameters
- `State`: simple key-value state for long-running or iterative jobs

For more, visit the docs: https://docs.patterns.app/docs/node-development/python/
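`State` is handy for incremental jobs that keep a cursor between runs (e.g. "newest `created_at` already ingested"). The exact `State` method names are in the SDK docs; the sketch below simulates the key-value cursor pattern with a plain dict so it runs anywhere:

```python
# Simulated key-value state; in a real node this would be patterns.State.
state: dict[str, str] = {}


def ingest_incrementally(records: list[dict]) -> list[dict]:
    """Return only records newer than the stored cursor, then advance it."""
    cursor = state.get("last_created_at", "1970-01-01T00:00:00")
    fresh = [r for r in records if r["created_at"] > cursor]  # ISO strings sort correctly
    if fresh:
        state["last_created_at"] = max(r["created_at"] for r in fresh)
    return fresh


batch = [
    {"id": "L-001", "created_at": "2024-01-01T10:00:00"},
    {"id": "L-002", "created_at": "2024-01-02T10:00:00"},
]
print(len(ingest_incrementally(batch)))  # 2 on the first run
print(len(ingest_incrementally(batch)))  # 0 on the second run: the cursor advanced
```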
## Best practices

- Prefer explicit schemas on write tables via `Table.init` to control types and indexes
- Use `unique_on` and `upsert` to deduplicate reliably
- Add `add_created` or `add_monotonic_id` to enable robust downstream streaming
- Keep node code small, composable, and parameterized for reuse
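The `unique_on` + `upsert` combination means re-running a node replaces rows with matching keys rather than duplicating them. A sketch of that semantics with an in-memory table (the real behavior is implemented by the platform's storage layer):

```python
def upsert(table: dict[str, dict], records: list[dict], unique_on: str) -> None:
    """Insert records, replacing any existing row with the same key."""
    for record in records:
        table[record[unique_on]] = record


rows: dict[str, dict] = {}
upsert(rows, [{"id": "L-001", "score": 0.4}], unique_on="id")
# Re-running with an updated score replaces, rather than duplicates, the row.
upsert(rows, [{"id": "L-001", "score": 0.8}], unique_on="id")
print(len(rows), rows["L-001"]["score"])  # 1 0.8
```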
## License

BSD-3-Clause (see LICENSE)