EmersonBraun · Apr 15, 2026
@@ -0,0 +1,66 @@
+# Conventions — `@agentskit/adapters`
+
+The provider layer. Every file in this package maps one LLM or one embedding provider to AgentsKit's stable contracts.
+
+## Scope
+
+- **Chat adapters** — implement `AdapterFactory` per [ADR 0001](../../docs/architecture/adrs/0001-adapter-contract.md)
+- **Embedders** — implement `EmbedFn` per [ADR 0003](../../docs/architecture/adrs/0003-memory-contract.md)
+- **No UI.** No React, no Ink, no CLI here.
+- **No runtime logic.** No loops, no tool execution. Just transport.
+
+## Adding a new chat adapter
+
+1. Create `src/<provider>.ts`. Export a factory function that returns `AdapterFactory`.
+2. Accept configuration at construction time only: `apiKey`, `model`, `baseUrl` as needed.
+3. In `createSource`, build the request but **do not fetch yet**. Defer all I/O to `stream()` — invariant A1.
+4. In `stream()`, use the SSE utility from `src/utils.ts` if the provider speaks server-sent events. Otherwise write a parser that respects the chunk shape in `@agentskit/core`.
+5. Always end with `{ type: 'done' }`, an error chunk, or iterator return on abort — invariant A3.
+6. Yield `{ type: 'text', content }` for text deltas. Yield `{ type: 'tool_call', toolCall: { id, name, args } }` with **complete args** per invariant A5.
+7. Put provider-specific data in `chunk.metadata` (usage counts, raw response, reasoning). Consumers must not depend on its shape — A8.
+8. Re-export from `src/index.ts`.
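
The steps above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the `StreamChunk`, `StreamSource`, and `AdapterFactory` shapes are local stand-ins for the real contract types, and `exampleAdapter` and its `transport` option are hypothetical names that replace a real HTTP/SSE call so the sketch runs offline.

```typescript
// Local stand-ins for the contract types. The real ones live in
// @agentskit/core and may differ; these shapes are assumptions.
type StreamChunk =
  | { type: 'text'; content: string }
  | { type: 'error'; metadata: { error: unknown } }
  | { type: 'done' }

interface Message { role: string; content: string }

interface StreamSource {
  stream(): AsyncIterable<StreamChunk>
  abort(): void
}

interface AdapterFactory {
  createSource(request: { messages: readonly Message[] }): StreamSource
}

// Hypothetical options. `transport` stands in for the provider call.
interface ExampleAdapterOptions {
  model: string
  transport: (body: string) => AsyncIterable<string>
}

function exampleAdapter(opts: ExampleAdapterOptions): AdapterFactory {
  return {
    createSource({ messages }) {
      // Build the request only. Copy messages instead of mutating the input.
      const body = JSON.stringify({
        model: opts.model,
        messages: messages.map((m) => ({ ...m })),
      })
      let aborted = false
      return {
        abort() {
          aborted = true
        },
        async *stream(): AsyncGenerator<StreamChunk> {
          try {
            // The first I/O happens here, never in createSource (A1).
            for await (const delta of opts.transport(body)) {
              if (aborted) return // plain iterator return on abort (A3)
              yield { type: 'text', content: delta }
            }
            yield { type: 'done' }
          } catch (error) {
            // Provider failures surface as an error chunk, not a throw.
            yield { type: 'error', metadata: { error } }
          }
        },
      }
    },
  }
}
```

Injecting the transport keeps the sketch testable with a recorded fixture; a real adapter would more likely call `fetch` inside `stream()` and parse the response with the helpers in `src/utils.ts`.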
+
+## Adding a new embedder
+
+1. Create `src/embedders/<provider>.ts`. Export a factory returning `EmbedFn`.
+2. Accept `apiKey`, `model` at construction.
+3. Return a function of type `(text: string) => Promise<number[]>`.
+4. Must be stable: same input + same model = same vector. No randomness — invariant E1.
+5. Re-export from `src/embedders/index.ts` and from `src/index.ts`.
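
A minimal sketch of an embedder factory, under stated assumptions: `EmbedFn` is written out locally from step 3, and `exampleEmbedder` with its injected `request` function is hypothetical (it stands in for the provider's HTTP call so the example needs no network).

```typescript
// Assumed shape, taken from step 3 above.
type EmbedFn = (text: string) => Promise<number[]>

// Hypothetical options. `request` stands in for the provider's HTTP call.
interface ExampleEmbedderOptions {
  apiKey: string
  model: string
  request: (input: { apiKey: string; model: string; text: string }) => Promise<number[]>
}

function exampleEmbedder(opts: ExampleEmbedderOptions): EmbedFn {
  // Configuration is captured once; nothing is fetched at construction.
  const { apiKey, model, request } = opts
  return async (text) => {
    const vector = await request({ apiKey, model, text })
    // Return a copy so callers cannot mutate the array the provider returned.
    return [...vector]
  }
}
```

Stability (E1) then depends only on the provider: the factory adds no randomness, caching, or per-call configuration of its own.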
+
+## Naming
+
+- File name matches the provider: `openai.ts`, `anthropic.ts`, `gemini.ts`, etc.
+- Factory function is the provider name in lowercase: `openai(opts)`, `anthropic(opts)`.
+- Options interface: `OpenAIAdapterOptions`, `AnthropicAdapterOptions`.
+- Types internal to one adapter live in the same file; shared types go in `src/types.ts`.
+
+## Testing
+
+For every new adapter:
+
+- **Contract test** using the shared `AdapterContractSuite` once it lands; until then, walk through the ten invariants by hand against your implementation
+- **Stream parsing test** with a recorded fixture (JSON file of SSE chunks) so tests are fast and deterministic
+- **Error path test** — what happens on 401, 429, 500, malformed response
+- **Abort test** — `stream()` iteration terminates when `abort()` is called mid-flight
+
+Tests live in `tests/<provider>.test.ts`.
+
+## Common pitfalls
+
+| Pitfall | What to do instead |
+|---|---|
+| Calling `fetch` from `createSource` | Defer to `stream()` |
+| Mutating the input `messages` array | Copy if you need to transform for the wire format |
+| Throwing from `stream()` on a provider error | Emit `{ type: 'error', metadata: { error } }` |
+| Streaming partial tool-call args across multiple chunks | v1 requires complete args in one chunk. Buffer internally. |
+| Exposing provider SDK types in your public API | Keep the public surface limited to `AdapterFactory` |
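
The "buffer internally" advice for tool-call args could look something like this. It is a sketch only: `ToolCallFragment` is a hypothetical wire shape, `createToolCallBuffer` is an illustrative name, and treating `args` as a JSON string is an assumption about the chunk type.

```typescript
// Hypothetical wire fragment: many providers stream tool-call arguments as
// JSON text split across several events.
interface ToolCallFragment { id: string; name?: string; argsDelta: string }

interface ToolCallChunk {
  type: 'tool_call'
  toolCall: { id: string; name: string; args: string }
}

function createToolCallBuffer() {
  const pending = new Map<string, { name: string; args: string }>()
  return {
    // Accumulate a fragment. Nothing is emitted to the consumer yet.
    push(fragment: ToolCallFragment): void {
      const entry = pending.get(fragment.id) ?? { name: '', args: '' }
      if (fragment.name) entry.name = fragment.name
      entry.args += fragment.argsDelta
      pending.set(fragment.id, entry)
    },
    // Call when the provider signals the tool call is finished:
    // returns a single chunk carrying the complete args.
    flush(id: string): ToolCallChunk | undefined {
      const entry = pending.get(id)
      if (!entry) return undefined
      pending.delete(id)
      return { type: 'tool_call', toolCall: { id, name: entry.name, args: entry.args } }
    },
  }
}
```

Inside `stream()`, an adapter would call `push` for each partial event and yield the result of `flush` once, so consumers never see half-built arguments.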
+
+## Review checklist for this package
+
+- [ ] Implements all ten invariants A1–A10
+- [ ] Bundle size under 20KB gzipped (tightens over time)
+- [ ] Coverage threshold holds (60% lines; aiming for 80%)
+- [ ] Contract-tested against the ten invariants
+- [ ] SSE parsing uses `src/utils.ts` helpers where possible
+- [ ] README updated if the public export surface changed
@@ -0,0 +1,54 @@
+# Conventions — `@agentskit/cli`
+
+The `agentskit` command-line interface. The entry point for people who want to try AgentsKit without writing code first.
+
+## Scope
+
+- `agentskit chat` — interactive Ink chat with any provider
+- `agentskit init` — scaffold a new project
+- `agentskit run` — execute runtime agents from the terminal
+- Future: `agentskit doctor`, `agentskit dev`, `agentskit tunnel` (tracked in Phase 1)
+
+## Adding a new command
+
+1. Create `src/commands/<name>.ts`.
+2. Export a function that takes parsed arguments and runs the command — no classes.
+3. Wire the command in `src/bin.ts` using the existing argv parser.
+4. Print help output that fits on one screen (`--help` reads as documentation).
+5. Exit cleanly with `process.exit(code)` only at the top level. Never in a library function.
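
As a rough illustration of steps 2 and 5, a command can be a plain function that returns an exit code and leaves the actual exit to the entry point. The `hello` command, `HelloArgs`, and the `Output` sink below are hypothetical and exist only for this sketch; they are not part of the CLI.

```typescript
// Hypothetical parsed-argument shape and output sink, for illustration only.
interface HelloArgs { name?: string; json: boolean }
interface Output {
  stdout: (line: string) => void
  stderr: (line: string) => void
}

// A command is a plain function: parsed args in, exit code out.
function helloCommand(args: HelloArgs, out: Output): number {
  if (!args.name) {
    // Errors go to stderr; 2 signals a usage error.
    out.stderr('usage: agentskit hello --name <name>')
    return 2
  }
  const greeting = `hello ${args.name}`
  // Structured output goes to stdout when --json is set.
  out.stdout(args.json ? JSON.stringify({ greeting }) : greeting)
  return 0
}

// Only the entry point turns the code into a process exit, e.g. in src/bin.ts:
//   process.exit(helloCommand(parsedArgs, { stdout: console.log, stderr: console.error }))
```

Because the function never touches `process`, unit tests can call it directly and assert on the returned code and captured lines.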
+
+## Output conventions
+
+- Keep terminal output terse. One line per meaningful event.
+- Use `chalk` or Ink for color. Do not hardcode ANSI codes.
+- Respect `--quiet` and `--json` flags where applicable.
+- Errors go to stderr; structured output goes to stdout.
+
+## Flag conventions
+
+- Short form (`-p`) for frequent flags, long form (`--provider`) always present.
+- Defaults shown in `--help`.
+- Mutually-exclusive flags fail fast with a clear error.
+
+## Testing
+
+- Use `vitest` with child-process spawns for e2e coverage of the `bin.ts` entry.
+- Unit-test individual commands with mocked adapters.
+- Test fixtures live in `tests/fixtures/`.
+
+## Common pitfalls
+
+| Pitfall | What to do instead |
+|---|---|
+| Using `process.exit` in a library function | Return an exit code from the command function; only `bin.ts` calls `process.exit` |
+| Reading `process.argv` outside `bin.ts` | Pass parsed args down |
+| Hardcoding provider names | Accept `--provider <name>` and route to the right adapter |
+| Emitting unstructured text with `--json` set | Emit JSON; add `--format=json` if both are needed |
+
+## Review checklist for this package
+
+- [ ] Bundle size under 20KB gzipped
+- [ ] Coverage threshold holds (30%, climbing)
+- [ ] `--help` output is one screen and accurate
+- [ ] Spawn-based e2e test for the new command
+- [ ] Exit codes: 0 success, 1 expected failure, 2 usage error
@@ -0,0 +1,62 @@
+# Conventions — `@agentskit/core`
+
+The sacred package. Every rule here is stricter than the rest of the monorepo.
+
+## Non-negotiables
+
+- **Zero runtime dependencies.** `dependencies` in `package.json` is empty and stays empty. Never add one, not even a "small" one.
+- **Under 10KB gzipped.** CI enforces this with `size-limit`. If you're pushing the limit, the change is too big.
+- **Contracts first.** Public types and interfaces for every contract live here — Adapter, Tool, Memory, Retriever, Skill, Runtime. Implementations live in other packages.
+- **Named exports only.** No default exports, anywhere, ever.
+- **No `any`.** Use `unknown` and narrow with type guards.
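
For the last rule, a small sketch of narrowing `unknown` with a type guard. `ToolArgs`, `isToolArgs`, and `parseToolArgs` are hypothetical names for this illustration and are not the package's own helpers.

```typescript
// Hypothetical payload shape used only for this illustration.
interface ToolArgs { query: string; limit?: number }

// Type guard: proves the shape at runtime so callers get a typed value.
function isToolArgs(value: unknown): value is ToolArgs {
  if (typeof value !== 'object' || value === null) return false
  const record = value as Record<string, unknown>
  return (
    typeof record['query'] === 'string' &&
    (record['limit'] === undefined || typeof record['limit'] === 'number')
  )
}

function parseToolArgs(raw: string): ToolArgs | undefined {
  // Hold the parsed value as `unknown` rather than letting `any` leak out.
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    return undefined
  }
  return isToolArgs(parsed) ? parsed : undefined
}
```

The only cast is the widening to `Record<string, unknown>` inside the guard, which still forces a check on every field before it is used.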
+
+## What belongs here
+
+- **Types and interfaces** for the six core contracts (ADRs 0001–0006)
+- **Shared primitives** reused by multiple packages: `createEventEmitter`, `safeParseArgs`, `consumeStream`, message-building helpers
+- **The chat controller** (`controller.ts`) — headless state machine for a chat session
+- **The agent loop core** (`agent-loop.ts`) — the substrate the runtime builds on
+
+## What does NOT belong here
+
+- Any provider SDK or API client → `@agentskit/adapters`
+- Any React hook or component → `@agentskit/react`
+- Any Ink component → `@agentskit/ink`
+- Any file I/O → `@agentskit/memory` or a package that's not zero-dep
+- Any `node:*` import that's not available on every runtime we target (edge, Deno, browser)
+
+## Adding a new primitive
+
+1. Is the thing a **contract type**? Put it in `src/types/*.ts` and re-export from `src/types/index.ts`. Write an ADR if it's cross-package.
+2. Is it a **reusable helper** used by 2+ packages? Put it in `src/primitives.ts` or a dedicated file, export from `src/index.ts`.
+3. Write unit tests that exercise only the public export. Do **not** reach into internals.
+
+Every addition raises the bundle size. Run `pnpm size` in the repo root and verify the core budget still holds.
+
+## Testing
+
+- Pure unit tests with `vitest`. Environment is `node`.
+- Avoid mocks — test real functions with real inputs.
+- Mocked adapters for stream-related tests are acceptable since the Adapter contract is the seam.
+
+## Changes you can make without an ADR
+
+- Bug fixes that don't change exported types
+- New internal helpers (not exported)
+- JSDoc improvements
+- Test additions
+
+## Changes that require an ADR first
+
+- Any `src/types/*.ts` change that alters an exported type
+- Any new exported function or class
+- Anything that grows the bundle by more than ~500 bytes gzipped
+
+## Review checklist for this package
+
+- [ ] No new runtime dependency (check `package.json`)
+- [ ] Bundle size under 10KB gzipped (`pnpm size`)
+- [ ] Coverage threshold holds (75% lines)
+- [ ] No `any` introduced
+- [ ] Named exports only
+- [ ] ADR linked if a contract changed
@@ -0,0 +1,56 @@
+# Conventions — `@agentskit/eval`
+
+Agent evaluation and benchmarking. Treats agents like production systems: scored, regression-tested, and tracked over time.
+
+## Stability tier: `beta`
+
+The core `runEval` entry point is stable. Reporters, metrics, and the dataset shape may gain fields in minor version bumps.
+
+## Scope
+
+- `runEval({ runtime, dataset, concurrency })` — runs a dataset, returns a report
+- Scoring helpers (exact-match, regex, LLM-as-judge)
+- Reporters (console, JSON file; more coming)
+- Types: `EvalCase`, `EvalReport`, `ScoreFn`
+
+## Design principles
+
+- **Evaluation is testing for non-deterministic systems.** Consumers should use `vitest` or similar as the runner; this package provides the primitives.
+- **Scores are numbers in `[0, 1]`.** Boolean outcomes coerce (`true` → 1, `false` → 0).
+- **Every metric is optional.** Latency, cost, tokens: report if available, skip otherwise.
+- **Replay-first** (future): when deterministic replay lands, eval runs should be reproducible from a recorded trace.
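
The score convention above can be illustrated with a small coercion helper. `normalizeScore` is a hypothetical name, and throwing on out-of-range numbers is an assumption of this sketch; the package may clamp or handle that case differently.

```typescript
// Hypothetical helper: the name and the throw-on-out-of-range behaviour
// are assumptions, not the package's documented API.
function normalizeScore(outcome: boolean | number): number {
  // Boolean outcomes coerce: true -> 1, false -> 0.
  if (typeof outcome === 'boolean') return outcome ? 1 : 0
  if (Number.isNaN(outcome) || outcome < 0 || outcome > 1) {
    throw new RangeError(`score must be in [0, 1], got ${outcome}`)
  }
  return outcome
}

// Exact-match scoring expressed through the same coercion.
function exactMatch(actual: string, expected: string): number {
  return normalizeScore(actual === expected)
}
```

Routing every scorer through one coercion point keeps aggregation simple, since the report only ever averages numbers in `[0, 1]`.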
+
+## Adding a metric
+
+1. Add the field to `EvalReport` in `src/types.ts`.
+2. Compute it in `runEval`'s aggregation loop.
+3. Make it optional — some runtimes/adapters won't have it.
+4. Document in the package README.
+
+## Adding a reporter
+
+1. Create `src/reporters/<name>.ts`.
+2. Export a factory: `export function jsonReporter(opts): Reporter`.
+3. `Reporter` has two callbacks: `onCase(evalCase, result)` and `onComplete(report)`. (`case` is a reserved word in TypeScript, so the parameter needs another name.)
+4. Keep it synchronous where possible; non-blocking where not.
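
A reporter following these steps might look like the sketch below. The `EvalCase`, `CaseResult`, `EvalReport`, and `Reporter` shapes are simplified stand-ins (the real types likely carry more fields), and `lineReporter` with its `write` option is a hypothetical example.

```typescript
// Stand-in types; the real EvalCase / EvalReport may carry more fields.
interface EvalCase { input: string }
interface CaseResult { score: number; latencyMs?: number }
interface EvalReport { total: number; meanScore: number }

interface Reporter {
  onCase(evalCase: EvalCase, result: CaseResult): void
  onComplete(report: EvalReport): void
}

// Hypothetical reporter that writes one line per event through an injected sink.
function lineReporter(opts: { write: (line: string) => void }): Reporter {
  return {
    onCase(evalCase, result) {
      // Latency is optional, so mention it only when it was reported.
      const latency = result.latencyMs === undefined ? '' : ` (${result.latencyMs}ms)`
      opts.write(`${evalCase.input}: ${result.score.toFixed(2)}${latency}`)
    },
    onComplete(report) {
      opts.write(`cases=${report.total} mean=${report.meanScore.toFixed(2)}`)
    },
  }
}
```

Both callbacks are synchronous and the sink is injected, so a test can pass an array push and a console reporter can pass `console.log`.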
+
+## Testing
+
+- Unit tests for scorers and aggregation with deterministic fixtures.
+- Integration test that runs a tiny dataset against a mock runtime end-to-end.
+
+## Common pitfalls
+
+| Pitfall | What to do instead |
+|---|---|
+| Blocking tests on real model calls | Use deterministic mock adapters |
+| Assuming every result has `tokensUsed` | Make metrics optional |
+| Scoring via string equality on LLM outputs | Use LLM-as-judge for fuzzy outputs |
+| Mutating input dataset | Treat `EvalCase[]` as read-only |
+
+## Review checklist for this package
+
+- [ ] Bundle size under 10KB gzipped
+- [ ] Coverage threshold holds (95% lines — mostly pure logic)
+- [ ] New metric documented in README
+- [ ] No hard dependency on any one adapter or reporter
@@ -0,0 +1,64 @@
+# Conventions — `@agentskit/ink`
+
+Terminal UI components for AgentsKit. Mirrors `@agentskit/react`'s surface but for Ink.
+
+## Scope
+
+- **Ink components** — `ChatContainer`, `Message`, `InputBar`, `ThinkingIndicator`, `ToolCallView`
+- **Ink hooks** — thin wrappers around `@agentskit/core` primitives for Ink-friendly consumption
+- Input handling that respects terminal raw-mode semantics
+
+## What does NOT belong here
+
+- React DOM components → `@agentskit/react`
+- Autonomous runtime → `@agentskit/runtime`
+- Anything requiring a DOM
+
+## Adding a new component
+
+1. Create `src/components/<Name>.tsx`. PascalCase.
+2. Use only `ink` primitives — `Box`, `Text`, `useInput`, `useFocus`, etc.
+3. No ANSI escape codes in component logic; let `ink` handle rendering.
+4. Re-export from `src/components/index.ts` and from `src/index.ts`.
+
+## Input handling
+
+- Use `ink`'s `useInput` hook. Do not read stdin directly.
+- Gate input on `chat.status` — block input while `streaming`.
+- Respect the `disabled` prop everywhere a component accepts user input.
+
+## Testing
+
+- `ink-testing-library@4` does **not** route stdin through `ink@7`'s input pipeline. Keyboard-input tests must mock `useInput` directly:
+
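+  ```tsx
+  import { vi } from 'vitest'
+  import type { Key } from 'ink'
+
+  type InputHandler = (input: string, key: Key) => void
+  let captured: InputHandler | undefined
+  vi.mock('ink', async () => {
+    const actual = await vi.importActual<typeof import('ink')>('ink')
+    return {
+      ...actual,
+      // Capture the handler instead of wiring it to stdin.
+      useInput: (handler: InputHandler) => { captured = handler },
+    }
+  })
+
+  // In tests, call captured!(input, key) directly.
+  ```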
+
+- Rendering-only tests work fine with `ink-testing-library`.
+
+## Common pitfalls
+
+| Pitfall | What to do instead |
+|---|---|
+| Writing ANSI codes manually | Use `Text color={...}` |
+| Reading stdin directly | Use `useInput` |
+| Forgetting to gate input on `streaming` | Check `chat.status !== 'streaming'` before every action |
+| Assuming 80 columns | Use `useStdout` and read `stdout.columns` / `stdout.rows` |
+
+## Review checklist for this package
+
+- [ ] Bundle size under 15KB gzipped
+- [ ] Coverage threshold holds (60% lines)
+- [ ] Uses `ink` primitives only (no raw ANSI)
+- [ ] Keyboard tests mock `useInput` per the pattern above
+- [ ] Works in narrow terminals (test at 40 columns)
@@ -0,0 +1,63 @@
+# Conventions — `@agentskit/memory`
+
+Memory backends implementing the two contracts from [ADR 0003](../../docs/architecture/adrs/0003-memory-contract.md): `ChatMemory` and `VectorMemory`.
+
+## Scope
+
+- **ChatMemory implementations**: `fileChatMemory`, `sqliteChatMemory`, `redisChatMemory`
+- **VectorMemory implementations**: `fileVectorMemory`, `redisVectorMemory`
+- Shared client helpers where reuse is genuine (`redis-client.ts`, `vector-store.ts`)
+
+## Adding a new ChatMemory backend
+
+1. Create `src/<name>-chat.ts`.
+2. Export a factory: `export function sqliteChatMemory(opts): ChatMemory`.
+3. Implement the six invariants CM1–CM6:
+   - `load()` returns a snapshot
+   - `save()` is **replace-all**, not append
+   - Ordering preserved, atomic from consumer view
+   - Empty state returns `[]`
+   - `clear` optional
+4. Re-export from `src/index.ts`.
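
The behaviours listed in step 3 can be sketched with an in-memory fake. This is illustrative only: the `ChatMemory` and `Message` shapes below are local assumptions (including the async signatures), and `inMemoryChatMemory` is a hypothetical name.

```typescript
// Local stand-ins; the real contract types may differ.
interface Message { role: string; content: string }

interface ChatMemory {
  load(): Promise<Message[]>
  save(messages: readonly Message[]): Promise<void>
  clear?(): Promise<void>
}

function inMemoryChatMemory(): ChatMemory {
  // Empty state is an empty array, never null.
  let stored: Message[] = []
  return {
    async load() {
      // Snapshot: callers get copies, so later mutation cannot leak back.
      return stored.map((m) => ({ ...m }))
    },
    async save(messages) {
      // Replace-all: the new list fully supersedes the old one, order kept.
      stored = messages.map((m) => ({ ...m }))
    },
    async clear() {
      stored = []
    },
  }
}
```

A fake like this is also a convenient reference when checking a real backend: the same sequence of `save` and `load` calls should produce the same results.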
+
+## Adding a new VectorMemory backend
+
+1. Create `src/<name>-vector.ts`.
+2. Export a factory: `export function fileVectorMemory(opts): VectorMemory`.
+3. Implement the eight invariants VM1–VM8, including:
+   - `store` is **upsert by id**
+   - Dimensionality is a constructor concern — reject mismatches
+   - `search` returns results in descending score order
+   - `threshold` is exclusive from below
+   - `topK` is an upper bound, not a floor
+4. Re-export from `src/index.ts`.
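
A rough in-memory sketch of the listed behaviours follows. Several things here are assumptions: the `VectorDoc` and `SearchHit` shapes, the synchronous methods (the real contract is probably async), cosine similarity as the score, and reading "exclusive from below" as "only scores strictly greater than `threshold` are returned". `inMemoryVectorStore` is a hypothetical name.

```typescript
// Assumed document and result shapes for this sketch.
interface VectorDoc { id: string; embedding: number[]; content: string }
interface SearchHit { id: string; content: string; score: number }

function cosine(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB)
}

function inMemoryVectorStore(dimensions: number) {
  const docs = new Map<string, VectorDoc>()
  // Dimensionality is fixed at construction; mismatches are rejected.
  const check = (vector: number[]): void => {
    if (vector.length !== dimensions) {
      throw new Error(`expected ${dimensions} dimensions, got ${vector.length}`)
    }
  }
  return {
    store(doc: VectorDoc): void {
      check(doc.embedding)
      docs.set(doc.id, doc) // upsert by id
    },
    search(query: number[], opts: { topK: number; threshold?: number }): SearchHit[] {
      check(query)
      const threshold = opts.threshold ?? -Infinity
      return Array.from(docs.values())
        .map((d) => ({ id: d.id, content: d.content, score: cosine(query, d.embedding) }))
        .filter((hit) => hit.score > threshold) // strict bound (assumed reading)
        .sort((x, y) => y.score - x.score) // descending by score
        .slice(0, opts.topK) // upper bound: fewer results are fine
    },
  }
}
```

Filtering before slicing is what makes `topK` an upper bound: when few documents clear the threshold, the result is simply shorter.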
+
+## Configuration
+
+- Connection details (file path, URL, credentials) are taken at construction.
+- Do not open connections until first use — defer to `load()` / `save()` / `store()` / `search()`.
+- Provide a `close()` escape hatch for long-lived processes; the contracts don't require it but consumers appreciate it.
+
+## Testing
+
+- **In-memory fake** per contract for fast tests of consumers (`memory/fakes.ts` is not present yet; contributions welcome).
+- **Integration tests** for each backend using real storage (SQLite file, Redis testcontainer).
+- **Invariant tests**: a shared test suite that every backend must pass (`MemoryContractSuite`, tracked as future work).
+
+## Common pitfalls
+
+| Pitfall | What to do instead |
+|---|---|
+| Implementing `save` as append | Replace-all (CM2). Consumers send full state. |
+| Returning `null` from `load` on empty | Return `[]` |
+| Mixing embedding dimensions in one vector store | Reject mismatches at `store()` time |
+| Padding `search` results to reach `topK` | Return fewer documents; `topK` is an upper bound |
+| Opening connections at import time | Defer to first method call |
+
+## Review checklist for this package
+
+- [ ] Bundle size under 15KB gzipped
+- [ ] Coverage threshold holds (80% lines)
+- [ ] New backend tested against all relevant invariants
+- [ ] Config accepted at construction; no env reads in the factory
+- [ ] Documentation for the backend's quirks in package README