FJMEspelt/THE-SIMULATION

THE SIMULATION

Multi‑LLM citizens compete across ticks to top the leaderboard. Each agent receives a structured world briefing, calls tools (mail, messaging, action executor), commits exactly one action, and the engine records everything to SQLite + Langfuse + JSON season archives.

[Image: Season Leaderboard Snapshot]


1. Architecture Overview

| Layer | Purpose |
| --- | --- |
| app/main.py | FastAPI dashboard + REST endpoints (/, /state, /ticks/run). Renders logs, leaderboard graph, agent snapshots, conversations, market table, and triggers tick batches. |
| app/runtime/engine.py | Core orchestrator. Loads config, seeds services, runs agent turns, advances the world graph, enforces GAME_MAX_TICKS, and writes season archives (data/seasons/season-*.json/.jpg). |
| society/graph.py | Deterministic “world step”: scoring, civic events, leaderboard digests. |
| society/agents/agent_process.py | LangChain wrapper for each agent. Builds prompts, handles memories (JSON files under data/memory/<agent>/), runs the tool-enabled LLM, snapshots context, and logs to Langfuse. |
| society/tools/actions.py | Canonical action executor. Mutates bank, marketplace, needs, stress, and handles passive-income perks. |
| society/services/ | Subsystems (bank, items, assets, marketplace, social mail + peer scores, legal, economy, rules reference). Marketplace automatically creates “showcase” listings for every owned asset so trading partners know who owns what. |
| app/agents/ | Preset definitions (multi-model clones vs. single-model personas) plus tool registry. |
| data/ | SQLite database (data/society.sqlite), agent personas, memory stores, and season archives. |

All long-running state is persisted, so restarting the FastAPI server picks up the same season.


2. Quickstart

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env   # or create a .env with keys below
uvicorn app.main:app --reload

Minimum .env snippet:

SIM_AGENT_PRESET=multi_llm_competition   # or multi_persona_single_model
GAME_MAX_TICKS=10
AGENT_GOAL_INTERVAL=10
SUMMARY_MODEL=gpt-4o-mini
LANGFUSE_PUBLIC_KEY=...
LANGFUSE_SECRET_KEY=...
LANGFUSE_HOST=https://cloud.langfuse.com
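
To sanity-check a .env before starting the server, the variables above can be parsed into a typed object. This is only an illustrative sketch: SimSettings and load_settings are hypothetical names, and the fallback values simply mirror the example snippet rather than the engine's real defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SimSettings:
    """Hypothetical container for the simulation-related variables."""
    agent_preset: str
    game_max_ticks: int
    agent_goal_interval: int
    summary_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> SimSettings:
    """Parse settings from a mapping; fallbacks copy the README example values."""
    return SimSettings(
        agent_preset=env.get("SIM_AGENT_PRESET", "multi_llm_competition"),
        game_max_ticks=int(env.get("GAME_MAX_TICKS", "10")),
        agent_goal_interval=int(env.get("AGENT_GOAL_INTERVAL", "10")),
        summary_model=env.get("SUMMARY_MODEL", "gpt-4o-mini"),
    )


if __name__ == "__main__":
    print(load_settings())
```

A non-numeric GAME_MAX_TICKS raises ValueError here, which surfaces a typo before any tick is run.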

Running ticks

  • GET /ticks/run?count=N (or dashboard buttons) advances the world by N turns unless the season already ended.
  • GET /state dumps the full SimulationState.
  • After GAME_MAX_TICKS, the engine writes data/seasons/season-<timestamp>.json (complete summary) and a JPG leaderboard trajectory for posterity.
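
Ticks can also be driven from a script instead of the dashboard. The sketch below assumes the server listens on uvicorn's default address (http://127.0.0.1:8000) and that the endpoint answers with a JSON body; neither is confirmed here, and tick_url / run_ticks are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: uvicorn's default bind address; change it if you passed --host/--port.
BASE_URL = "http://127.0.0.1:8000"


def tick_url(count: int, base_url: str = BASE_URL) -> str:
    """Build the GET /ticks/run?count=N URL."""
    if count < 1:
        raise ValueError("count must be a positive integer")
    return f"{base_url}/ticks/run?{urlencode({'count': count})}"


def run_ticks(count: int, base_url: str = BASE_URL):
    """Request N ticks and decode the response (assumed to be JSON)."""
    # Agent turns call LLMs, so a generous timeout is used.
    with urlopen(tick_url(count, base_url), timeout=600) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    print(tick_url(3))
```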

3. Agent Loop & Tools

Each tick the agent receives:

  • Current cash, needs (food/social/energy), stress, inventory, public inventories, market listings (including price-less showcases), goal block, leaderboard snapshot, mail inbox, and recent reports.
  • Tool access is intentionally minimal:
    1. check_mail — only if unread letters are expected.
    2. send_message — free, unlimited postal messages (arrive next tick, boost social need).
    3. choose_action — exactly one action per tick.

Supported actions (executor: society/tools/actions.py)

| Action | Notes |
| --- | --- |
| think | Reflection, reduces stress, minor energy/social cost. |
| work | +20 credits/hour (scaled by perks). Drains needs, raises stress; no extra payouts unless dictated by founded-business random events. |
| buy_food | Refills food to 100 %. Triggers Restaurant royalty (+10 credits to restaurant owner). |
| rest | Restores energy/social, lowers stress. |
| craft | Burns energy/social, creates random catalog item, auto-publishes a showcase listing for offers. |
| found_business | Requires name, description, and investment. Pays the investment and creates a Business (see Passive-income items). |
| transfer_money | Sends credits; awards +3 civic engagement points (MoneyTransferred). |
| list_item | Posts asset for sale (price > 0). Replaces prior listing and removes showcase entry. |
| buy_listing | Purchases at ask price (showcase listings are not buyable). |
| make_offer | Proposes a trade for any listing. Offers can include cash amounts and are reviewed next tick. |
| accept_offer / decline_offer | Resolves open offers on your listings. |
| sell_business | Removes a founded business from play and reclaims the original investment immediately. |
| rob, attack, marry, divorce, idle | Narrative/legal hooks with stress or social side-effects. |

Showcase listings: entries with “price not posted (showcase—make an offer)” just tell the world who owns the asset. Use make_offer or send a postal note to negotiate; nothing is free.
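
The constraints in the action table can be expressed as a small pre-flight check. This is a minimal sketch, not the logic in society/tools/actions.py: validate_action is a hypothetical helper, and the parameter key price is an assumed name (name, description, and investment come from the table).

```python
from typing import List, Optional

# Action names taken from the table above.
SUPPORTED_ACTIONS = {
    "think", "work", "buy_food", "rest", "craft", "found_business",
    "transfer_money", "list_item", "buy_listing", "make_offer",
    "accept_offer", "decline_offer", "sell_business",
    "rob", "attack", "marry", "divorce", "idle",
}


def validate_action(action: str, params: Optional[dict] = None) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    params = params or {}
    if action not in SUPPORTED_ACTIONS:
        return [f"unknown action: {action}"]
    problems = []
    if action == "found_business":
        # found_business needs all three fields.
        for field in ("name", "description", "investment"):
            if field not in params:
                problems.append(f"found_business requires '{field}'")
    if action == "list_item" and not params.get("price", 0) > 0:
        # Priced listings must have price > 0; price-less entries are showcases.
        problems.append("list_item requires price > 0")
    return problems


if __name__ == "__main__":
    print(validate_action("list_item", {"price": 0}))
```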

Passive-income items

  • Business — founded via the found_business action. After paying the investment (and providing a name/description), each tick has a 50 % chance to pay half the stake and a 12.5 % chance to demand half the stake again (draining the owner’s balance down to zero if necessary). Businesses automatically publish showcase listings so others can negotiate, and sell_business refunds the original investment if the owner wants to exit.
  • Restaurant — a catalog item still in circulation; whenever anyone executes buy_food, the restaurant owner collects 10 credits.
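
As a rough check of the Business odds above: by linearity of expectation, the average net per tick is 0.5 × stake/2 − 0.125 × stake/2 = 0.1875 × stake, ignoring the cap that stops a demand from pushing the balance below zero. The sketch below computes that figure and runs a toy simulation; rolling the payout and the demand independently is an assumption, since the text does not say how the engine combines the two events.

```python
import random

PAYOUT_CHANCE = 0.5    # chance per tick to pay half the stake
DEMAND_CHANCE = 0.125  # chance per tick to demand half the stake


def expected_net_per_tick(stake: float) -> float:
    """Average credits gained per tick, ignoring the zero-balance floor."""
    return PAYOUT_CHANCE * stake / 2 - DEMAND_CHANCE * stake / 2


def simulate(stake: float, balance: float, ticks: int, seed: int = 0) -> float:
    """Toy simulation; assumes the two events are rolled independently."""
    rng = random.Random(seed)
    for _ in range(ticks):
        if rng.random() < PAYOUT_CHANCE:
            balance += stake / 2
        if rng.random() < DEMAND_CHANCE:
            # A demand drains the balance but never below zero.
            balance = max(0.0, balance - stake / 2)
    return balance


if __name__ == "__main__":
    print(expected_net_per_tick(100))  # 18.75
    print(simulate(stake=100, balance=50, ticks=10))
```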

Catalog entries seed at $50 by default (Business/Restaurant default seed price is $100) and showcases keep ownership visible.

Civic engagement & leaderboards

  • Metrics: cash, wellness, social (peer ratings), civic_engagement.
  • Civic events: ConversationMessage (+1 point) and MoneyTransferred (+3 points).
  • Victory: highest cumulative leaderboard score after GAME_MAX_TICKS.
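
The civic point values listed above can be tallied per agent as follows. The (agent, event_type) tuple shape is an assumption made for illustration; the real event records in society/graph.py may look different.

```python
from collections import Counter

# Point values from the list above.
CIVIC_POINTS = {"ConversationMessage": 1, "MoneyTransferred": 3}


def civic_engagement(events):
    """Sum civic points per agent from (agent, event_type) pairs."""
    totals = Counter()
    for agent, event_type in events:
        # Unknown event types contribute zero points.
        totals[agent] += CIVIC_POINTS.get(event_type, 0)
    return dict(totals)


if __name__ == "__main__":
    log = [
        ("alice", "ConversationMessage"),
        ("alice", "MoneyTransferred"),
        ("bob", "ConversationMessage"),
    ]
    print(civic_engagement(log))  # {'alice': 4, 'bob': 1}
```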

4. Offer Life Cycle

  1. Tick T — bidder calls make_offer with listing_id and price (can reference a showcase). Optionally mention complementary trades via send_message.
  2. Tick T+1 — seller sees the offer in the dashboard & agent prompt; decides via accept_offer or decline_offer.
  3. Accepting transfers credits and the asset through the marketplace service; declining closes the offer.

Agents are encouraged (via prompt + UI hints) to send mixed offers that combine cash with item-for-item swaps, which makes the economy more dynamic than flat purchases.
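
The life cycle above amounts to a small state machine: an offer is open when created and can only be resolved on a later tick. The sketch below illustrates that rule; the Offer class and its field names are hypothetical and do not mirror the marketplace service's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Hypothetical offer record used only to illustrate the life cycle."""
    listing_id: int
    bidder: str
    price: float
    created_tick: int
    status: str = "open"

    def reviewable(self, current_tick: int) -> bool:
        # Offers made on tick T are reviewed from tick T+1 onward.
        return self.status == "open" and current_tick > self.created_tick

    def resolve(self, current_tick: int, accept: bool) -> None:
        if not self.reviewable(current_tick):
            raise ValueError("offer is not open for review on this tick")
        self.status = "accepted" if accept else "declined"


if __name__ == "__main__":
    offer = Offer(listing_id=7, bidder="bob", price=40.0, created_tick=3)
    offer.resolve(current_tick=4, accept=True)
    print(offer.status)  # accepted
```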


5. Conversations & Postal System

  • send_message adds a queued entry (ConversationMessageQueued) delivered next tick. The dashboard shows every agent pair as chat-like bubbles (no duplicates such as “Bob ↔ Alice” and “Alice ↔ Bob” separately).
  • Messages boost social need and help with negotiating showcase assets or alliances.
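
Next-tick delivery and the de-duplicated pair view can be sketched with a queue keyed by delivery tick and an order-independent pair key. PostOffice and pair_key are hypothetical names; the actual mail service under society/services/ may be organized differently.

```python
from collections import defaultdict


class PostOffice:
    """Toy postal queue: messages sent on tick T are delivered on tick T+1."""

    def __init__(self):
        self._queue = defaultdict(list)  # delivery tick -> queued messages

    def send(self, tick, sender, recipient, body):
        self._queue[tick + 1].append((sender, recipient, body))

    def deliver(self, tick):
        # Remove and return everything due on this tick (empty list if none).
        return self._queue.pop(tick, [])


def pair_key(a, b):
    """Order-independent key so 'Bob, Alice' and 'Alice, Bob' share one thread."""
    return tuple(sorted((a, b)))


if __name__ == "__main__":
    office = PostOffice()
    office.send(1, "Alice", "Bob", "Trade my Restaurant for 120 credits?")
    print(office.deliver(1))  # []
    print(office.deliver(2))
    print(pair_key("Bob", "Alice"))  # ('Alice', 'Bob')
```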

6. Dashboard Reference

The FastAPI UI renders:

  1. Recent logs — last 500 messages from the engine.
  2. State snapshot — JSON dump for quick inspection.
  3. Leaderboard — line chart from tick 0 to GAME_MAX_TICKS, table of ranks, delta points.
  4. Agent snapshots — cash, last action, rationale, needs, inventory summary.
  5. Conversations — grouped chat transcripts for every agent pair over the last 6 ticks.
  6. Market listings — deduplicated list showing all active listings plus showcases (with explanatory hint).
  7. Raw tick reports — collapsible JSON per tick for debugging.

7. Season Archives

When GAME_MAX_TICKS is reached:

  • data/seasons/season-<timestamp>.json — includes final leaderboard, leaderboard history, every tick report, world events log, agent states, and metadata (winner, season length).
  • data/seasons/season-<timestamp>.jpg — matplotlib-rendered trajectory of cumulative points (one line per agent).

These artifacts make it easy to analyze past competitions or embed visuals (as shown at the top of this README).
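
To start exploring an archive without relying on its exact schema, the sketch below locates the newest season file and lists its top-level keys. It assumes that file names sort chronologically by their timestamp, which is not confirmed here; latest_season_archive and load_archive are hypothetical helpers.

```python
import json
from pathlib import Path


def latest_season_archive(seasons_dir="data/seasons"):
    """Return the last season-*.json by name, or None if there is none."""
    # Assumption: the <timestamp> part sorts lexicographically in time order.
    archives = sorted(Path(seasons_dir).glob("season-*.json"))
    return archives[-1] if archives else None


def load_archive(path):
    """Load one season archive as a Python object."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    newest = latest_season_archive()
    if newest is None:
        print("no season archives yet")
    else:
        season = load_archive(newest)
        print(newest.name, sorted(season))  # inspect the top-level keys first
```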


8. Scripts & Utilities

| Script / Command | Purpose |
| --- | --- |
| scripts/start_over.sh | Wipes SQLite + agent memories, re-initializes DB schema, starts FastAPI server fresh. |
| GET /ticks/run?count=N | Official way to advance ticks (UI buttons call this endpoint). |
| Langfuse instrumentation | Each agent turn logs context, tool steps, and summaries for external observability. |

No background cron jobs exist—ticks only advance through API calls (or by invoking SimulationEngine.run_ticks directly).


9. Design Notes & Extensibility

  • Memory model — Long-term memory per agent stored in JSON; short-term scratchpad is limited and gets summarized via a small helper model (default gpt-4o-mini).
  • LLM routing — Each agent is configured via app/agents/presets.py, so swapping providers (OpenAI, Anthropic, DeepSeek, Gemini, OSS) only requires editing the preset.
  • Marketplace — Assets are tracked in society/services/assets.py. Showcases ensure visibility; actual sales require listings or offers. Additional item types/perks can be added by extending CATALOG.
  • Season finishing — SimulationEngine prevents extra ticks after max length but still allows UI access (no-op tick request returns immediately).

10. Common Workflows

| Task | Steps |
| --- | --- |
| Reset the world | Run scripts/start_over.sh (or manually rm -rf data/memory/* && rm -f data/society.sqlite* then rerun uvicorn). |
| Switch agent roster | Edit SIM_AGENT_PRESET in .env, then restart FastAPI. |
| Inspect a season | Open data/seasons/season-*.json for the full log; use the JPG in docs/slides. |
| Debug agent behavior | Use the Langfuse trace per tick, review conversation chat blocks, or inspect SimulationEngine.get_state_snapshot(). |
| Analyze marketplace | Use the dashboard table, or query the SQLite tables market_listings / market_offers. Showcase rows (price “—”) mark ownership; use make_offer to negotiate. |
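
For the marketplace row above, a read-only peek at the two tables can be done with the standard sqlite3 module. Only the table names come from this README; the column layout is unknown, so the sketch reads column names from the cursor instead of assuming them. peek_table is a hypothetical helper.

```python
import sqlite3

ALLOWED_TABLES = {"market_listings", "market_offers"}


def peek_table(db_path, table, limit=5):
    """Return up to `limit` rows of a marketplace table as dictionaries."""
    if table not in ALLOWED_TABLES:
        # Table names cannot be bound as SQL parameters, so whitelist them.
        raise ValueError(f"unexpected table: {table}")
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(f"SELECT * FROM {table} LIMIT ?", (limit,))
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        conn.close()


if __name__ == "__main__":
    for row in peek_table("data/society.sqlite", "market_listings"):
        print(row)
```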

11. Glossary

  • Tick — One full perceive→decide→act→reflect loop per agent plus world scoring.
  • Showcase listing — Auto-generated, price-less listing for every owned asset. Not purchasable directly; exists so others know who to negotiate with.
  • Season archive — Auto-generated JSON + JPG once GAME_MAX_TICKS is reached. Contains every detail needed to replay or analyze the game.
  • Passive-income item – A catalog entry (Business via found_business, Restaurant via buy_listing) that hooks into action events to pay royalties or dividends automatically.

Enjoy orchestrating the citizens! Open issues/PRs to add new personas, perks, rules, or dashboards. The entire system is built to be hackable—agents read whatever you encode in their prompt, and most behaviors live in Python modules you can extend. Happy simulating.
