Multi‑LLM citizens compete across ticks to top the leaderboard. Each agent receives a structured world briefing, calls tools (mail, messaging, action executor), commits exactly one action, and the engine records everything to SQLite + Langfuse + JSON season archives.
| Layer | Purpose |
|---|---|
| app/main.py | FastAPI dashboard + REST endpoints (/, /state, /ticks/run). Renders logs, leaderboard graph, agent snapshots, conversations, market table, and triggers tick batches. |
| app/runtime/engine.py | Core orchestrator. Loads config, seeds services, runs agent turns, advances the world graph, enforces GAME_MAX_TICKS, and writes season archives (data/seasons/season-*.json/.jpg). |
| society/graph.py | Deterministic “world step”: scoring, civic events, leaderboard digests. |
| society/agents/agent_process.py | LangChain wrapper for each agent. Builds prompts, handles memories (JSON files under data/memory/<agent>/), runs the tool-enabled LLM, snapshots context, and logs to Langfuse. |
| society/tools/actions.py | Canonical action executor. Mutates bank, marketplace, needs, stress, and handles passive-income perks. |
| society/services/ | Subsystems (bank, items, assets, marketplace, social mail + peer scores, legal, economy, rules reference). Marketplace automatically creates “showcase” listings for every owned asset so trading partners know who owns what. |
| app/agents/ | Preset definitions (multi-model clones vs. single-model personas) plus tool registry. |
| data/ | SQLite database (data/society.sqlite), agent personas, memory stores, and season archives. |
All long-running state is persisted, so restarting the FastAPI server picks up the same season.

    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
    cp .env.example .env  # or create a .env with keys below
    uvicorn app.main:app --reload

Minimum .env snippet:

    SIM_AGENT_PRESET=multi_llm_competition  # or multi_persona_single_model
    GAME_MAX_TICKS=10
    AGENT_GOAL_INTERVAL=10
    SUMMARY_MODEL=gpt-4o-mini
    LANGFUSE_PUBLIC_KEY=...
    LANGFUSE_SECRET_KEY=...
    LANGFUSE_HOST=https://cloud.langfuse.com

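
For illustration, the non-secret settings could be read in Python roughly as follows. This is a sketch only: `load_settings` is a hypothetical helper, not the project's actual config loader, and the fallback defaults simply mirror the snippet.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the simulation settings from an environment-like mapping.

    Hypothetical helper: the defaults mirror the .env snippet and are
    assumptions, not values taken from the engine's source.
    """
    env = os.environ if env is None else env
    return {
        "preset": env.get("SIM_AGENT_PRESET", "multi_llm_competition"),
        "max_ticks": int(env.get("GAME_MAX_TICKS", "10")),
        "goal_interval": int(env.get("AGENT_GOAL_INTERVAL", "10")),
        "summary_model": env.get("SUMMARY_MODEL", "gpt-4o-mini"),
    }


settings = load_settings({"GAME_MAX_TICKS": "25"})
print(settings["max_ticks"], settings["preset"])
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise without touching the real environment.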
- `GET /ticks/run?count=N` (or dashboard buttons) advances the world by N turns unless the season already ended.
- `GET /state` dumps the full `SimulationState`.
- After `GAME_MAX_TICKS`, the engine writes `data/seasons/season-<timestamp>.json` (complete summary) and a JPG leaderboard trajectory for posterity.
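
The two endpoints can also be driven from a script. The sketch below uses only the standard library and assumes the server listens on uvicorn's default address (`http://127.0.0.1:8000`) and that both endpoints respond with JSON; `ticks_url` and `get_json` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # assumption: uvicorn's default host and port


def ticks_url(count: int, base_url: str = BASE_URL) -> str:
    """Build the URL that asks the engine to run `count` ticks."""
    if count < 1:
        raise ValueError("count must be a positive integer")
    return f"{base_url}/ticks/run?{urlencode({'count': count})}"


def get_json(url: str, timeout: float = 300.0):
    """Issue a GET request and decode the JSON body (assumed to be JSON)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


print(ticks_url(3))
```

With the server running, `get_json(ticks_url(3))` would advance three ticks and `get_json(f"{BASE_URL}/state")` would fetch the state dump. The generous timeout is a guess, since a tick batch waits on several LLM calls.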
Each tick the agent receives:
- Current cash, needs (food/social/energy), stress, inventory, public inventories, market listings (including price-less showcases), goal block, leaderboard snapshot, mail inbox, and recent reports.
- Tool access is intentionally minimal:
  - `check_mail`: only if unread letters are expected.
  - `send_message`: free, unlimited postal messages (arrive next tick, boost social need).
  - `choose_action`: exactly one action per tick.
| Action | Notes |
|---|---|
| `think` | Reflection, reduces stress, minor energy/social cost. |
| `work` | +20 credits/hour (scaled by perks). Drains needs, raises stress; no extra payouts unless dictated by founded-business random events. |
| `buy_food` | Refills food to 100 %. Triggers Restaurant royalty (+10 credits to restaurant owner). |
| `rest` | Restores energy/social, lowers stress. |
| `craft` | Burns energy/social, creates random catalog item, auto-publishes a showcase listing for offers. |
| `found_business` | Requires name, description, investment. |
| `transfer_money` | Sends credits; awards +3 civic engagement points (MoneyTransferred). |
| `list_item` | Posts asset for sale (price > 0). Replaces prior listing and removes showcase entry. |
| `buy_listing` | Purchases at ask price (showcase listings are not buyable). |
| `make_offer` | Proposes a trade for any listing. Offers can include cash amounts and are reviewed next tick. |
| `accept_offer` / `decline_offer` | Resolves open offers on your listings. |
| `sell_business` | Removes a founded business from play and reclaims the original investment immediately. |
| `rob`, `attack`, `marry`, `divorce`, `idle` | Narrative/legal hooks with stress or social side-effects. |
Showcase listings: entries with “price not posted (showcase—make an offer)” just tell the world who owns the asset. Use `make_offer` or send a postal note to negotiate; nothing is free.
- Business: founded via the `found_business` action. After paying the investment (and providing a name/description), each tick has a 50 % chance to pay half the stake and a 12.5 % chance to demand half the stake again (draining the owner’s balance down to zero if necessary). Businesses automatically publish showcase listings so others can negotiate, and `sell_business` refunds the original investment if the owner wants to exit.
- Restaurant: a catalog item still in circulation; whenever anyone executes `buy_food`, the restaurant owner collects 10 credits.
Catalog entries seed at $50 by default (Business/Restaurant default seed price is $100) and showcases keep ownership visible.
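
The Business odds quoted above imply a positive expected return per tick: 0.5 × (stake / 2) − 0.125 × (stake / 2) = 0.1875 × stake. The sketch below works that arithmetic through. It assumes the two rolls are the only cash flows and ignores the floor at a zero balance; `expected_business_income` is a hypothetical name, not part of the engine.

```python
def expected_business_income(stake: float) -> float:
    """Expected net credits per tick for a business with the given stake.

    Assumes the 50 % payout roll and the 12.5 % demand roll are the only
    cash flows and ignores the floor at a zero balance.
    """
    payout = 0.5 * (stake / 2)      # 50 % chance to receive half the stake
    demand = 0.125 * (stake / 2)    # 12.5 % chance to pay half the stake
    return payout - demand


print(expected_business_income(100))  # 18.75
```

Under these assumptions a 100-credit business nets about 18.75 credits per tick on average, although any single tick can still cost 50 credits.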
- Metrics: `cash`, `wellness`, `social` (peer ratings), `civic_engagement`.
- Civic events: `ConversationMessage` (+1 point) and `MoneyTransferred` (+3 points).
- Victory: highest cumulative leaderboard score after `GAME_MAX_TICKS`.
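
As a rough illustration of how the civic point values add up, the sketch below tallies them per agent. The point values come from the list above; the event dictionaries (with `agent` and `type` keys) are a hypothetical shape chosen for this example and may not match what `society/graph.py` actually consumes.

```python
from collections import Counter

# Point values taken from the list above; the event dictionaries are a
# hypothetical shape chosen for this example.
CIVIC_POINTS = {"ConversationMessage": 1, "MoneyTransferred": 3}


def civic_scores(events):
    """Sum civic engagement points per agent from a list of event dicts."""
    scores = Counter()
    for event in events:
        # Unknown event types contribute zero points.
        scores[event["agent"]] += CIVIC_POINTS.get(event["type"], 0)
    return dict(scores)


events = [
    {"agent": "alice", "type": "ConversationMessage"},
    {"agent": "alice", "type": "MoneyTransferred"},
    {"agent": "bob", "type": "ConversationMessage"},
]
print(civic_scores(events))  # {'alice': 4, 'bob': 1}
```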
- Tick T: bidder calls `make_offer` with `listing_id` and `price` (can reference a showcase). Optionally mention complementary trades via `send_message`.
- Tick T+1: seller sees the offer in the dashboard and agent prompt; decides via `accept_offer` or `decline_offer`.
- Accepting transfers credits and the asset through the marketplace service; declining closes the offer.
Agents are encouraged (via prompt + UI hints) to send multi-modal offers (cash plus item-for-item swaps), making the economy more dynamic than flat purchases.
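
The two-tick offer flow can be pictured as a small state machine. The sketch below is illustrative only: the `Offer` dataclass, `resolve_offer`, and the insufficient-funds check are assumptions made for this example and do not describe the real marketplace service.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    listing_id: str
    bidder: str
    price: int
    created_tick: int
    status: str = "open"  # open -> accepted | declined


def resolve_offer(offer, accept, current_tick, balances, owners, seller):
    """Resolve an open offer on the seller's listing (illustrative only)."""
    if offer.status != "open":
        raise ValueError("offer is already resolved")
    if current_tick <= offer.created_tick:
        raise ValueError("offers are reviewed on a later tick")
    if not accept:
        offer.status = "declined"
        return offer
    if balances[offer.bidder] < offer.price:  # assumed rule, not from the source
        raise ValueError("bidder cannot cover the offer")
    balances[offer.bidder] -= offer.price
    balances[seller] += offer.price
    owners[offer.listing_id] = offer.bidder  # asset changes hands
    offer.status = "accepted"
    return offer


balances = {"alice": 120, "bob": 40}
owners = {"listing-7": "bob"}
offer = Offer("listing-7", bidder="alice", price=80, created_tick=3)
resolve_offer(offer, accept=True, current_tick=4,
              balances=balances, owners=owners, seller="bob")
print(offer.status, balances, owners)
```

Item-for-item swaps would need an extra field on the offer; the sketch covers the cash-only case.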
- `send_message` adds a queued entry (`ConversationMessageQueued`) delivered next tick. The dashboard shows every agent pair as chat-like bubbles (no duplicates such as “Bob ↔ Alice” and “Alice ↔ Bob” separately).
- Messages boost social need and help with negotiating showcase assets or alliances.
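
A minimal sketch of next-tick delivery and pair deduplication follows. `Mailbox` and `pair_key` are hypothetical names used only for illustration; the real service may store messages quite differently.

```python
from collections import defaultdict


def pair_key(a, b):
    """Order-independent key so (Alice, Bob) and (Bob, Alice) share a thread."""
    return tuple(sorted((a, b)))


class Mailbox:
    """Toy postal queue: messages sent on tick T become visible on tick T + 1."""

    def __init__(self):
        self._queued = []                 # (deliver_tick, sender, recipient, text)
        self.threads = defaultdict(list)  # pair_key -> delivered (sender, text)

    def send(self, tick, sender, recipient, text):
        self._queued.append((tick + 1, sender, recipient, text))

    def deliver(self, tick):
        """Move every message due by `tick` into its thread; return the count."""
        due = [m for m in self._queued if m[0] <= tick]
        self._queued = [m for m in self._queued if m[0] > tick]
        for _, sender, recipient, text in due:
            self.threads[pair_key(sender, recipient)].append((sender, text))
        return len(due)


box = Mailbox()
box.send(1, "Bob", "Alice", "Trade?")
box.send(1, "Alice", "Bob", "Make an offer.")
print(box.deliver(1), box.deliver(2), list(box.threads))
# 0 2 [('Alice', 'Bob')]
```

Sorting the two names before using them as a key is what collapses both directions of a conversation into a single thread.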
The FastAPI UI renders:
- Recent logs: last 500 messages from the engine.
- State snapshot: JSON dump for quick inspection.
- Leaderboard: line chart from tick 0 to `GAME_MAX_TICKS`, table of ranks, delta points.
- Agent snapshots: cash, last action, rationale, needs, inventory summary.
- Conversations: grouped chat transcripts for every agent pair over the last 6 ticks.
- Market listings: deduplicated list showing all active listings plus showcases (with explanatory hint).
- Raw tick reports: collapsible JSON per tick for debugging.
When GAME_MAX_TICKS is reached:
- `data/seasons/season-<timestamp>.json`: includes final leaderboard, leaderboard history, every tick report, world events log, agent states, and metadata (winner, season length).
- `data/seasons/season-<timestamp>.jpg`: matplotlib-rendered trajectory of cumulative points (one line per agent).
These artifacts make it easy to analyze past competitions or embed visuals (as shown at the top of this README).
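
To start exploring an archive, a short script can locate the newest file and list its top-level keys. The sketch below deliberately avoids assuming a schema; it only assumes the archive is a JSON object and that the `<timestamp>` part of the filename sorts chronologically. `latest_season` and `archive_keys` are hypothetical helper names.

```python
import json
from pathlib import Path


def latest_season(seasons_dir="data/seasons"):
    """Return the path of the newest season archive, or None if there is none.

    Sorting by name assumes the <timestamp> part sorts chronologically.
    """
    archives = sorted(Path(seasons_dir).glob("season-*.json"))
    return archives[-1] if archives else None


def archive_keys(path):
    """Load an archive and list its top-level keys without assuming a schema."""
    with open(path, encoding="utf-8") as handle:
        return sorted(json.load(handle))


path = latest_season()
if path is not None:
    print(path.name, archive_keys(path))
```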
| Script / Command | Purpose |
|---|---|
| `scripts/start_over.sh` | Wipes SQLite + agent memories, re-initializes DB schema, starts FastAPI server fresh. |
| `GET /ticks/run?count=N` | Official way to advance ticks (UI buttons call this endpoint). |
| Langfuse instrumentation | Each agent turn logs context, tool steps, and summaries for external observability. |
No background cron jobs exist—ticks only advance through API calls (or by invoking SimulationEngine.run_ticks directly).
- Memory model: Long-term memory per agent stored in JSON; short-term scratchpad is limited and gets summarized via a small helper model (default `gpt-4o-mini`).
- LLM routing: Each agent is configured via `app/agents/presets.py`, so swapping providers (OpenAI, Anthropic, DeepSeek, Gemini, OSS) only requires editing the preset.
- Marketplace: Assets are tracked in `society/services/assets.py`. Showcases ensure visibility; actual sales require listings or offers. Additional item types/perks can be added by extending `CATALOG`.
- Season finishing: `SimulationEngine` prevents extra ticks after max length but still allows UI access (no-op tick request returns immediately).
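
The season-finishing guard can be pictured as a clamp on the requested tick count. This is a sketch under assumptions: `ticks_allowed` is a hypothetical function, and whether the engine truncates a batch that would overshoot `GAME_MAX_TICKS` (rather than rejecting it) is not stated here.

```python
def ticks_allowed(current_tick: int, requested: int, max_ticks: int) -> int:
    """How many of the requested ticks may still run this season.

    Returns 0 once the season is over, which models the no-op request.
    """
    if requested < 0:
        raise ValueError("requested must be non-negative")
    remaining = max(max_ticks - current_tick, 0)
    return min(requested, remaining)


print(ticks_allowed(8, 5, 10))   # 2
print(ticks_allowed(10, 5, 10))  # 0
```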
| Task | Steps |
|---|---|
| Reset the world | scripts/start_over.sh (or manually rm -rf data/memory/* && rm -f data/society.sqlite* then rerun uvicorn). |
| Switch agent roster | Edit .env SIM_AGENT_PRESET, restart FastAPI. |
| Inspect a season | Open data/seasons/season-*.json for full log; use the JPG in docs/slides. |
| Debug agent behavior | Use Langfuse trace per tick, review conversation chat blocks, or inspect SimulationEngine.get_state_snapshot(). |
| Analyze marketplace | Use dashboard table, or query SQLite market_listings / market_offers. Showcase rows (price “—”) mark ownership—use make_offer. |
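
For the SQLite route, a small helper can dump either marketplace table without knowing its columns in advance. The table names come from the row above; the column layout is not documented here, so the sketch discovers it from the cursor. `dump_table` is a hypothetical helper, and the in-memory demo table uses made-up columns.

```python
import sqlite3

# Table names come from this README; their columns are not documented here,
# so the helper discovers them instead of assuming a schema.
KNOWN_TABLES = {"market_listings", "market_offers"}


def dump_table(conn, table, limit=20):
    """Return (column_names, rows) for one of the known marketplace tables."""
    if table not in KNOWN_TABLES:
        # Table names cannot be bound as parameters, so allow-list them.
        raise ValueError(f"unexpected table: {table}")
    cursor = conn.execute(f"SELECT * FROM {table} LIMIT ?", (limit,))
    columns = [col[0] for col in cursor.description]
    return columns, cursor.fetchall()


# Demo against an in-memory database with hypothetical columns.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE market_listings (id INTEGER, owner TEXT, price REAL)")
demo.execute("INSERT INTO market_listings VALUES (1, 'alice', NULL)")
print(dump_table(demo, "market_listings"))
```

Against a real season, the connection would instead be opened with `sqlite3.connect("data/society.sqlite")`.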
- Tick: One full perceive→decide→act→reflect loop per agent plus world scoring.
- Showcase listing: Auto-generated, price-less listing for every owned asset. Not purchasable directly; exists so others know who to negotiate with.
- Season archive: Auto-generated JSON + JPG once `GAME_MAX_TICKS` is reached. Contains every detail needed to replay or analyze the game.
- Passive-income item: Catalog entries (Business via `found_business`, Restaurant via `buy_listing`) that hook into action events to pay royalties/dividends automatically.
Enjoy orchestrating the citizens! Open issues/PRs to add new personas, perks, rules, or dashboards. The entire system is built to be hackable—agents read whatever you encode in their prompt, and most behaviors live in Python modules you can extend. Happy simulating.
