Sumerian Agent Patterns

Empirical mining of the SumTablets cuneiform corpus (91,606 tablets, 6.97M glyphs) for software-design primitives applicable to modern multi-agent AI systems.

Sumerian scribes ran a multi-agent bureaucracy 4,000 years ago. Their clay tablets carry sealed envelopes, named time periods, periodic audits, RPC headers, and witness sets — the same primitives modern agent systems are reinventing. This repo statistically validates which patterns are real (with p-values and cited tablet IDs) and translates them into agent-framework code shapes.

Who this is for:

Multi-agent / LLM agent framework developers (LangGraph, AutoGen, CrewAI, custom runtimes)
Researchers in agentic AI, cognitive architectures, distributed systems
Anyone designing memory layers, identity/auth subsystems, or audit logs for AI agents
Cuneiform / Assyriology researchers curious about cross-disciplinary applications

Start Here — The Evidence-Backed Artifacts

Of the ~158 ideas in outputs/FULL_IDEAS.md, the following 9 first-class artifacts carry real evidence (statistical results, cited tablet IDs, or measured benchmarks). The rest of the catalog is brainstorm-grade — useful for ideation, but not load-bearing. Start here:

#	Artifact	Where	What it gives you
1	Empirical method — statistical mining of an ancient corpus to derive software-design primitives, with shuffled-baseline controls and Bonferroni correction	`scripts/phase{0,1,3}_*.py`	A reproducible pipeline you can re-run on any corpus to extract templates + structure
2	9 named agent primitives with cited tablet IDs and contracts	`outputs/primitives.json`	Single-responsibility agent designs grounded in real attestations (P/Q tablet IDs)
3	Zipf-as-DSL detector finding (Admin s=1.746, Royal s=1.737, Lexical s=1.114)	`outputs/compression_findings.md` §1	Empirical method for unsupervised "is-this-a-DSL?" classification of any corpus
4	RULING-parity finding (Royal p=0.002, Admin p=0.005)	`outputs/compression_findings.md` §4	Statistical evidence that physical ruling lines mark logical row boundaries; informs vector-chunk strategy
5	ELS-null result (0 / 495 tests Bonferroni-significant)	`outputs/compression_findings.md` §3	Defensive prior art against future numerology / "hidden code" claims on cuneiform
6	Reference architecture — composes the primitives into a multi-agent design with Python + Rust pseudocode	`outputs/reference_architecture.md`	Drop-in design doc you can adapt to any agent runtime
7	Quantitative benchmark — sealed envelope vs anonymous baseline, measured	`benchmarks/RESULTS.md`	Hard numbers (+59% bytes, +37 tokens, +7µs per write) and a 5/5-vs-0/5 capability comparison
8	Reference implementation of `kishib3` (sealed envelope) in 250 LoC stdlib Python	`benchmarks/kishib3.py`	Working code for one of the primitives — clone, adapt, ship
9	5 unmined research directions from data we already have on disk	`outputs/FULL_IDEAS.md` §N (items N1, N4, N5, N9, N12)	Named-person social network (N1), region-scoped authority (N4), votive ledger pattern (N5), multi-tablet narratives (N9), kišib₃ undercount (N12)

Honest framing: the ~158-idea catalog in FULL_IDEAS.md is preserved for browsability, but most of it is restatements of these 9 artifacts in different framings, branding gimmicks (Sumerian-named libraries that are the primitives renamed), or ephemera. If you only have 30 minutes, read outputs/summary.md + benchmarks/RESULTS.md. If you have an hour, add outputs/reference_architecture.md.

Why You Should Care (Findings → Actions)

Three concrete things you can build differently after reading this:

1. Wrap every agent write in a sealed envelope

Finding. 25.4% of administrative tablets carry kišib₃ (seal of so-and-so), 74.2% are dated by year, and witness clauses (igi PN-šè) are common. Attribution and dating are routine parts of the record, not optional extras. Example tablets: P101440, P132611, P117793, P145759.

Action. Make every state-changing call in your agent runtime carry (payload, by_seal, witnesses, period). Audit becomes a property of the envelope, not a separate concern bolted on per agent. Replay across any time window becomes trivial.

Measured impact (benchmarks/RESULTS.md). We built a minimal sealed-envelope library (benchmarks/kishib3.py, 250 LoC, stdlib only) and ran 100,000 writes through both it and an anonymous-log baseline:

Cost: +37 tokens / write (estimated), +7 µs latency / write, +59 % bytes at 250-byte payloads (drops to ~14 % at 1 KB payloads, ~1.4 % at 10 KB).
Capability: sealed log answers 5/5 audit queries (who wrote X, all writes by principal, all writes in period, integrity verification, replay-after-cascade-revoke). Anonymous log answers 0 of 5. The capability gap is total, not partial.
The killer query: replay-as-of after revoking a parent principal. Baseline returns all 50 writes (silent drift). Sealed log returns 0 — every revoked descendant is correctly excluded.
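A minimal sketch of the envelope shape described above, assuming HMAC-SHA256 as a stand-in for the seal and a simple parent map for cascade revocation. The names `Envelope`, `SealedLog`, `register`, and `replay` are hypothetical and are not taken from benchmarks/kishib3.py; they only illustrate the (payload, by_seal, witnesses, period) contract.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Envelope:
    payload: dict
    by_seal: str       # id of the sealing principal (hypothetical naming)
    witnesses: tuple
    period: str
    mac: str           # HMAC over the other four fields


@dataclass
class SealedLog:
    keys: dict = field(default_factory=dict)     # principal -> secret key (bytes)
    parent: dict = field(default_factory=dict)   # principal -> parent principal or None
    revoked: set = field(default_factory=set)
    entries: list = field(default_factory=list)

    def register(self, principal, key, parent=None):
        self.keys[principal] = key
        self.parent[principal] = parent

    @staticmethod
    def _digest(key, payload, by_seal, witnesses, period):
        body = json.dumps([payload, by_seal, list(witnesses), period],
                          sort_keys=True, ensure_ascii=False)
        return hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()

    def write(self, payload, by_seal, witnesses, period):
        mac = self._digest(self.keys[by_seal], payload, by_seal, witnesses, period)
        env = Envelope(payload, by_seal, tuple(witnesses), period, mac)
        self.entries.append(env)
        return env

    def verify(self, env):
        expected = self._digest(self.keys[env.by_seal], env.payload,
                                env.by_seal, env.witnesses, env.period)
        return hmac.compare_digest(expected, env.mac)

    def _is_revoked(self, principal):
        # Walk up the parent chain: revoking an ancestor revokes every descendant.
        while principal is not None:
            if principal in self.revoked:
                return True
            principal = self.parent.get(principal)
        return False

    def replay(self, period=None):
        # Return only intact writes from principals that are still trusted.
        return [e for e in self.entries
                if self.verify(e)
                and not self._is_revoked(e.by_seal)
                and (period is None or e.period == period)]
```

With a child principal registered under a parent, adding the parent to `revoked` makes `replay()` drop every write sealed by the child, which is the cascade behaviour the benchmark query exercises.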

2. Tier your agent memory as session → topic → row

Finding. Sumerian tablets have three layers of structure: physical surface (obverse/reverse), logical column, atomic row marked by <RULING>. We statistically confirmed <RULING> is a real row separator: adjacent ruling-bounded chunks share trigrams 30–500× more than shuffled baselines (Royal Inscription p=0.002, Administrative p=0.005).

Action. Replace flat agent memory with three tiers: SURFACE (session) → COLUMN (topic) → RULING (row). Reads return the smallest tier that satisfies the query — never drag back the whole session when one row answers.
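One way to picture the three tiers is a nested structure with a read function that stops at the narrowest tier able to answer. This is a sketch under assumptions: the class names and the substring-match query are illustrative only and do not come from outputs/reference_architecture.md.

```python
from dataclasses import dataclass, field


@dataclass
class Row:                 # RULING tier: one atomic record
    text: str


@dataclass
class Column:              # COLUMN tier: one topic
    topic: str
    rows: list = field(default_factory=list)


@dataclass
class Surface:             # SURFACE tier: one session
    session: str
    columns: list = field(default_factory=list)


def read(surface, topic=None, needle=None):
    """Return the smallest tier that satisfies the query, or None if nothing matches."""
    for col in surface.columns:
        if topic is not None and col.topic != topic:
            continue
        if needle is not None:
            for row in col.rows:
                if needle in row.text:
                    return row      # a single row answers the query
        elif topic is not None:
            return col              # topic-level query: return the whole column
    # Only an unconstrained query falls back to the full session.
    return surface if topic is None and needle is None else None
```

A production memory layer would replace the substring test with whatever retrieval it already uses (embeddings, keys, filters); the point of the sketch is only the return-the-narrowest-tier rule.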

3. Replace silent token/cost rollups with periodic signed audits

Finding. Administrative tablets close with šu-nigin₂ (sum-total) — a periodic reconciliation. Shortfalls and excesses are named explicitly: la₂-ia₃ (deficit owed by named person), diri (excess). Nothing drifts silently.

Action. For any ledger-shaped agent state (token usage, tool-call counts, cost tracking, evidence accumulation), close periods at fixed intervals with a signed audit. Deficits and excesses must be named and attributed to a counterparty.
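The period close can be sketched as a reconciliation that emits one named line per counterparty. The mapping of `expected` and `actual` onto token budgets or tool-call counts is an assumption, as are the names `AuditLine` and `close_period`; a real implementation would also attach a signature (for example a sealed envelope) rather than a bare `signed_by` string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditLine:
    counterparty: str
    expected: int
    actual: int

    @property
    def delta(self):
        return self.actual - self.expected

    @property
    def kind(self):
        if self.delta < 0:
            return "la₂-ia₃"    # deficit, attributed to the counterparty
        if self.delta > 0:
            return "diri"       # excess, also named explicitly
        return "balanced"


def close_period(period, expected, actual, signer):
    """Reconcile expected vs. actual amounts and name every discrepancy."""
    parties = sorted(set(expected) | set(actual))
    lines = [AuditLine(p, expected.get(p, 0), actual.get(p, 0)) for p in parties]
    return {
        "period": period,
        "signed_by": signer,
        "total": sum(line.actual for line in lines),   # šu-nigin₂ (sum-total)
        "lines": lines,
    }
```

Because every counterparty gets a line, a party that appears only in `actual` (unbudgeted usage) shows up as `diri` instead of disappearing into the total.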

Plus a defensive null result: A 99-skip × 5-genre × 1,000-shuffle ELS scan found zero hidden codes (0 of 495 Bonferroni-significant tests). Useful prior art if anyone tries to sell you "Sumerian secret-code AI."

Want more? outputs/summary.md ranks the top 10 ideas by leverage. outputs/FULL_IDEAS.md lists ~158 across 16 categories. outputs/reference_architecture.md has full code shapes in Python and Rust.

Architecture Overview

                     ┌────────────────────────────────────────┐
                     │           RoyalDecreeAgent             │
                     │  (policy/version registry, broadcasts) │
                     └────────────────┬───────────────────────┘
                                      │ subscribes
        ┌─────────────────────────────┼─────────────────────────────┐
        │                             │                             │
        ▼                             ▼                             ▼
┌────────────────┐         ┌────────────────────┐        ┌─────────────────────┐
│ TempleLedger   │ ──uses──▶ CommodityLedgerLine │        │ AddressedMessage    │
│ Agent          │         │ Agent (stateless)   │        │ Agent (RPC-on-clay) │
└──────┬─────────┘         └─────────┬──────────┘         └─────────┬───────────┘
       │ writes                      │ canonicalizes                │ delivers
       ▼                             ▼                              ▼
┌────────────────┐         ┌─────────────────────┐         ┌──────────────────┐
│ SealAuthority  │◀────────│ LexicalOntology     │         │ RitualSequence   │
│ Agent          │         │ Agent (taxonomy)    │         │ Agent (workflow) │
└──────┬─────────┘         └─────────────────────┘         └──────────────────┘
       │ identity                                                  ▲
       ▼                                                           │
┌────────────────┐                                          ┌──────┴───────────┐
│ ScribalSchool  │                                          │ YearNameRegistry │
│ Agent          │                                          │ Agent (time)     │
└────────────────┘                                          └──────────────────┘

Nine named agent primitives. Three transverse subsystems: memory tiers, identity/provenance, taxonomy. See outputs/reference_architecture.md for the full design with Python + Rust code shapes.

Concrete Example — One Tablet, Decomposed

Tablet P101440 (Ur III administrative, 39 cuneiform glyphs):

<SURFACE>
la₂-ia₃ 1(aš) gun₂ 5(u) 5(diš) ma-na siki du                  ← deficit line + commodity quantities
<unk> 5(gešʾu) 4(geš₂) 7(diš) a₂ geme₂ u₄ 1(diš)-še₃            ← labor accounting
kišib₃ {d}šul-gi-i₃-li₂                                         ← seal of Shulgi-ili
<SURFACE>
<BLANK_SPACE>
iti ezem-me-ki-gal₂                                             ← month: festival of Mekigal
mu us₂-sa ki-maš{ki} ba-...                                     ← year after the destruction of Kimaš

The same tablet, decomposed into modern primitives:

WriteEnvelope(
  payload = LedgerEntry(
    lines = [
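      Line(qty=Rational(1,1), unit="gun₂", commodity="siki", instrument="deficit"),    # la₂-ia₃ 1(aš) gun₂
      Line(qty=Rational(55,1), unit="ma-na", commodity="siki", instrument="deficit"),  # 5(u) 5(diš) ma-na
      Line(qty=Rational(3247,1), unit="labor-day", commodity="female-worker"),         # 5(gešʾu) 4(geš₂) 7(diš) = 3247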
    ],
  ),
  by_seal      = SealId("shulgi-ili-001"),         # kišib₃ {d}šul-gi-i₃-li₂
  witnesses    = [],                                # none recorded on this tablet
  period_id    = PeriodRegistry.resolve(
    name="iti ezem-me-ki-gal₂",
    year=YearName(derived_from="ki-maš destruction year", offset="us₂-sa"),  # mu us₂-sa
  ),
)

This is what every line in outputs/templates.json is doing — taking a real tablet's structural pattern and showing the modern primitive it implies.
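The quantities in the tablet are written as count(sign) tokens, so turning a line into a numeric `qty` needs a small numeral parser. The sketch below assumes the conventional values of the sign names that appear in P101440 (diš and aš = 1, u = 10, geš₂ = 60, gešʾu = 600); the table and the function name `parse_count` are illustrative and not part of the repository.

```python
import re

# Assumed sign values; only the signs seen in the example tablet are listed.
SIGN_VALUE = {"diš": 1, "aš": 1, "u": 10, "geš₂": 60, "gešʾu": 600}

# Matches tokens such as 5(gešʾu) or 1(aš): a decimal count followed by a sign name.
TOKEN = re.compile(r"(\d+)\(([^)]+)\)")


def parse_count(text):
    """Sum all count(sign) tokens in a transliterated quantity."""
    total = 0
    for count, sign in TOKEN.findall(text):
        if sign not in SIGN_VALUE:
            raise ValueError(f"unknown numeral sign: {sign}")
        total += int(count) * SIGN_VALUE[sign]
    return total


print(parse_count("5(gešʾu) 4(geš₂) 7(diš)"))   # 5*600 + 4*60 + 7 = 3247
print(parse_count("5(u) 5(diš)"))               # 5*10 + 5 = 55
```

The parser deliberately raises on unknown signs instead of guessing, since metrological systems differ by commodity and a silent default would corrupt ledger totals.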

Glossary — Sumerian Terms Used

Term	Literal	Modern equivalent
`kišib₃` (PN)	seal of [person name]	Cryptographic signature / write attribution
`mu` X	year of X	Named time period (event-named, not numeric)
`mu us₂-sa` X	year after the year of X	Relative time reference resolved at write-time
`iti` X	month of X	Calendar month sub-period
`šu-nigin₂`	sum-total	Periodic audit / signed reconciliation
`la₂-ia₃`	deficit	Named outstanding obligation (never silent)
`diri`	excess	Named surplus requiring disposition
`igi` PN-šè	before [person]	Witness clause — live attestation at write-time
`dumu` PN	son of [person]	Filiation edge in principal/identity graph
`u₃-na-a-du₁₁`	speak to him	Letter address formula — RPC envelope opener
`dub-ba-ni`	his tablet	Reference to a prior message (thread-id)
`lugal`	king	Top-tier role in authority graph
`ensi₂`	governor	Region-scoped authority role
`niga 4(diš)-kam`	grade-4 grain-fed	Numbered quality tier on a commodity (SLA tier)
`<SURFACE>`	physical face of tablet	L1 — Frame / session boundary
`<COLUMN>`	column on a surface	L2 — Section / topic boundary
`<RULING>`	drawn dividing line	L3 — Row / atomic record boundary
`<BLANK_SPACE>`	intentional gap	Semantic whitespace — preserve, don't trim
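The two year formulas in the glossary suggest a registry that resolves event-named periods to an ordinal at write time. The following is a sketch under assumptions: `YearNameRegistry`, `proclaim`, and `resolve` are hypothetical names, and `mu us₂-sa X` is treated simply as "the year after X", so it can alias a year that later receives its own name.

```python
class YearNameRegistry:
    """Maps event-named years to ordinals in proclamation order."""

    def __init__(self):
        self._order = []    # year-names in the order they were proclaimed

    def proclaim(self, name):
        self._order.append(name)
        return len(self._order) - 1

    def resolve(self, formula):
        """Resolve 'mu X' or 'mu us₂-sa X' to an ordinal index."""
        relative, absolute = "mu us₂-sa ", "mu "
        if formula.startswith(relative):        # check the longer prefix first
            base, offset = formula[len(relative):], 1
        elif formula.startswith(absolute):
            base, offset = formula[len(absolute):], 0
        else:
            raise ValueError(f"not a year formula: {formula!r}")
        if base not in self._order:
            raise KeyError(f"unknown year-name: {base!r}")
        return self._order.index(base) + offset
```

Storing the resolved ordinal next to the original formula keeps the human-readable name for display while giving queries a sortable key.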

Findings (Detail)

The numbers behind the implications above. All claims here are reproducible from scripts/phase3_compression.py with seed=42.

Finding	Statistic	Genre / Coverage
Admin tablets are a domain-specific language	Zipf exponent s = 1.746 (R²=0.93)	Administrative
Royal Inscription is the most-templated genre	Compression-Δ = +0.099 vs shuffled baseline	Royal Inscription
Lexical lists are closest to natural language	Zipf s = 1.114 (R²=0.92)	Lexical
Letters are short single-purpose RPCs	Lowest marker density (0.02 `<RULING>` markers per tablet)	Letter
`<RULING>` is a logical row separator (not visual)	p = 0.002 (Royal), p = 0.005 (Admin) for cross-ruling trigram parity	Royal, Admin
No hidden encodings in the corpus	0 of 495 ELS tests Bonferroni-significant	All genres
Seal-of-PN clauses are pervasive	25.4% of Admin tablets	Administrative
Year-formulas are universal envelopes	74.2% Admin, 65% Letter, 62.4% Royal	Across genres
Letters are addressed RPCs	`u₃-na-a-du₁₁` in 58.8% of Letters	Letter

Full statistics with shuffled-baseline controls in outputs/compression_findings.md. Per-tablet pattern citations in outputs/templates.json.
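For readers unfamiliar with the Zipf exponent reported above, it can be estimated as the negated slope of log(frequency) against log(rank). The sketch below uses ordinary least squares in pure Python; this README does not state which fitting procedure scripts/phase3_compression.py uses, so treat this as an illustration of the statistic and not as the repository's method.

```python
import math
from collections import Counter


def zipf_exponent(tokens):
    """Fit log(freq) = c - s * log(rank); return (s, r_squared).

    Needs at least two token types with differing frequencies.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return -slope, r_squared
```

A steeper exponent means a few tokens dominate the stream, which is the sense in which the Administrative and Royal genres (s ≈ 1.74) look more formulaic than the Lexical lists (s ≈ 1.11).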

Repository Layout

.
├── README.md                          this file
├── LICENSE                            CC BY 4.0 (docs and analysis artifacts)
├── LICENSE-CODE                       MIT (Python scripts)
├── requirements.txt                   Python deps
├── scripts/
│   ├── phase0_sample.py               loads SumTablets, builds stratified samples
│   ├── phase1_templates.py            extracts genre templates and probe hits
│   └── phase3_compression.py          Zipf, compression, ELS, RULING-parity analysis
├── benchmarks/
│   ├── kishib3.py                     reference sealed-envelope implementation (~250 LoC)
│   ├── baseline_log.py                anonymous-log comparison point (~50 LoC)
│   ├── benchmark.py                   harness — overhead + capability comparison
│   ├── results.json                   raw measurement output
│   └── RESULTS.md                     report with measured numbers and caveats
└── outputs/
    ├── templates.json                 229 templates × {genre, pattern, role, frequency, tablet IDs}
    ├── primitives.json                9 named agent primitives (6 rubric + 3 data-justified)
    ├── compression_findings.md        Phase 3 statistics with p-values
    ├── phase3_raw.json                machine-readable Phase 3 metric rows
    ├── reference_architecture.md      multi-agent reference architecture with code shapes
    ├── summary.md                     top-10 ideas ranked novelty × implementability
    └── FULL_IDEAS.md                  ~158 ideas across 16 categories

How to Reproduce

pip install -r requirements.txt
python scripts/phase0_sample.py     # downloads SumTablets, persists local parquet, builds samples
python scripts/phase1_templates.py  # writes outputs/templates.json
python scripts/phase3_compression.py  # writes outputs/compression_findings.md + phase3_raw.json

All scripts are seeded (random_state=42, np.random.default_rng(42)) and reproducible end-to-end. Phase 0 downloads ~50 MB of corpus data from HuggingFace on first run, caches it locally as parquet, and reuses the cache on subsequent runs.

reference_architecture.md, summary.md, FULL_IDEAS.md, and primitives.json are hand-authored design artifacts that cite outputs from the scripts.

Methodology

Sample. Stratified-sample 500 tablets per genre (Administrative, Literary, Lexical, Royal Inscription, Letter) plus up to 50 long-form tablets per genre. Total sample: 2,069 tablets + 197 long-form.
Templates. For each genre: structural-marker statistics (<SURFACE>, <COLUMN>, <RULING>, <BLANK_SPACE>); per-position opening/closing line templates; distinctive bigrams and trigrams via genre log-odds vs other genres; hand-coded regex probes for known bureaucratic primitives (seal-of, year-formula, total/audit, deficit, witness, etc.). Every template carries cited tablet IDs.
Compression and ELS. Per-genre Zipfian fit; compression-ratio Δ between raw and shuffled token streams; equidistant-letter-sequence (ELS) decimation at skips 2–100 with 1,000 shuffled-baseline controls (Bonferroni-corrected for 495 tests); cross-RULING trigram parity vs within-tablet shuffled baseline.
Mapping. For each empirical template, propose a named single-responsibility agent primitive with inputs, outputs, state, tools, and guardrails. Distinguish validated-by-data from speculative.
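The regex-probe step in the Templates stage can be sketched as a table of named patterns run over each tablet line, with every hit recorded against a tablet ID. The patterns below are hypothetical and modelled on the glossary terms; the probes in scripts/phase1_templates.py may differ in both coverage and strictness.

```python
import re

# Hypothetical probes: whitespace-delimited matches on transliterated terms.
PROBES = {
    "seal_of": re.compile(r"(?:^|\s)kišib₃\s+\S+"),
    "year_formula": re.compile(r"(?:^|\s)mu\s+\S+"),
    "month": re.compile(r"(?:^|\s)iti\s+\S+"),
    "deficit": re.compile(r"(?:^|\s)la₂-ia₃(?:\s|$)"),
    "total": re.compile(r"(?:^|\s)šu-nigin₂(?:\s|$)"),
    "letter_address": re.compile(r"u₃-na-a-du₁₁"),
}


def probe_hits(tablet_id, lines):
    """Return (tablet_id, line_number, probe_name) for every probe match."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for name, pattern in PROBES.items():
            if pattern.search(line):
                hits.append((tablet_id, line_no, name))
    return hits


# Four lines of P101440 as quoted in this README.
p101440 = [
    "la₂-ia₃ 1(aš) gun₂ 5(u) 5(diš) ma-na siki du",
    "kišib₃ {d}šul-gi-i₃-li₂",
    "iti ezem-me-ki-gal₂",
    "mu us₂-sa ki-maš{ki} ba-...",
]
for hit in probe_hits("P101440", p101440):
    print(hit)
```

Keeping the tablet ID and line number on every hit is what lets a template carry citations back to specific tablets.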

Honest Limits

We sampled 2.3% of the corpus. Findings are strong for Ur III administrative tablets and Old Babylonian literary tablets; weaker for everything else.
The corpus is 92%+ administrative — generalizing about "Sumerian thought" from this sample would be like generalizing about "civilization" from accounting receipts.
Lexical findings rely on only 69 tablets; the Lexical-list architectural slot is real but the actual taxonomic content needs to come from external sources (CDLI, ePSD2).
Sumerian seals were physical, witnessed, and socially backed — not cryptographic. The SealAuthorityAgent primitive borrows the shape (named principals, revocation, witness sets), not the threat model.
Year-names are political artifacts named after royal acts, not a neutral monotonic clock.
The general observation "Sumerian admin = proto-information-system" is well-established in popular essays. The contribution here is the empirical statistical mining with cited tablet IDs and shuffled-baseline controls — not the metaphor itself.

Citation

If you find this useful in your own work, please cite the underlying corpus:

Simmons, C., Diehl Martinez, R., & Jurafsky, D. (2024). SumTablets: A Transliteration Dataset of Sumerian Tablets. Workshop on Machine Learning for Ancient Languages (ML4AL), ACL 2024. https://aclanthology.org/2024.ml4al-1.20.pdf

Tablet IDs cited throughout (P-numbers and Q-numbers) are CDLI catalog entries and resolvable at:

License

Documentation and analysis artifacts (outputs/*, *.md): CC BY 4.0. Use freely with attribution.
Python scripts (scripts/*): MIT.

Contributing

Issues and PRs welcome — particularly:

Cross-validation against CDLI / Oracc / ePSD2
Extension to Akkadian or other periods
Counter-examples to any cited template
Bug fixes in the analysis scripts
Independent reproduction of Phase 3 statistics

Acknowledgements

Built on top of the SumTablets corpus (Simmons et al., 2024, CC BY 4.0) and the broader work of the Cuneiform Digital Library Initiative, Oracc, ETCSL, ePSD2, and the cuneiform NLP community.
