Commit 430d44a

Merge pull request #8 from gitmem-dev/feature/OD-734-restore-docs-source
docs: consolidate docs/ into apps/docs/ single source (OD-734)
2 parents d0464ac + c45a138 commit 430d44a

13 files changed: +1462 -1587 lines changed

Lines changed: 140 additions & 0 deletions
@@ -0,0 +1,140 @@
1+
---
2+
title: Local Storage
3+
description: How GitMem uses the local filesystem, what persists across sessions, and container deployment considerations.
4+
---
5+
6+
import { Callout } from 'fumadocs-ui/components/callout'
7+
8+
# Local Storage
9+
10+
GitMem writes to the local filesystem for session state, caching, and free-tier data storage. This page maps exactly what lives where so you can make informed decisions about persistence — especially in containers.
11+
12+
## Storage Locations
13+
14+
| Location | What | Owner |
15+
|----------|------|-------|
16+
| `<project>/.gitmem/` | Session state, threads, config, caches | GitMem MCP server |
17+
| `~/.cache/gitmem/` | Search result cache (15-min TTL) | GitMem MCP server |
18+
19+
## File Inventory
20+
21+
```
22+
.gitmem/
23+
+-- active-sessions.json # Process lifecycle
24+
+-- config.json # Project defaults
25+
+-- sessions.json # Recent session index (free tier SOT)
26+
+-- threads.json # Thread state cache / free tier SOT
27+
+-- suggested-threads.json # AI-suggested threads
28+
+-- closing-payload.json # (ephemeral -- deleted after use)
29+
+-- cache/
30+
| +-- hook-scars.json # Local scar copy for hooks plugin
31+
+-- hooks-state/
32+
| +-- start_time # Session start timestamp
33+
| +-- tool_call_count # Recall nag counter
34+
| +-- last_nag_time # Last recall reminder time
35+
| +-- stop_hook_active # Lock file (re-entrancy guard)
36+
| +-- audit.jsonl # Hook execution log
37+
+-- sessions/
38+
+-- <session-uuid>/
39+
+-- session.json # Per-session state (scars, confirmations)
40+
```
41+
42+
**Total typical footprint: ~530KB** (dominated by `cache/hook-scars.json`).
43+
44+
## File Lifecycle
45+
46+
| File | Created | Survives Session Close? |
47+
|------|---------|------------------------|
48+
| `active-sessions.json` | `session_start` | Yes — multi-session registry |
49+
| `config.json` | First `session_start` | Yes |
50+
| `sessions.json` | `session_close` (free tier) | Yes |
51+
| `threads.json` | `session_close` | Yes |
52+
| `suggested-threads.json` | `session_close` | Yes |
53+
| `closing-payload.json` | Agent writes before close | **No** — ephemeral |
54+
| `cache/hook-scars.json` | Hooks plugin startup | Yes |
55+
| `sessions/<id>/session.json` | `session_start` | **No** — cleaned up on close |
56+
57+
## Cross-Session Data Flow
58+
59+
### What `session_start` loads
60+
61+
| Data | Pro/Dev Source | Free Source |
62+
|------|---------------|-------------|
63+
| Last session (decisions, reflection) | Supabase `sessions` | `.gitmem/sessions.json` |
64+
| Open threads | Supabase `threads` | `.gitmem/threads.json` |
65+
| Recent decisions | Supabase `decisions` | `.gitmem/sessions.json` (embedded) |
66+
| Scars for recall | Supabase `learnings` | `.gitmem/learnings.json` |
67+
| Suggested threads | `.gitmem/suggested-threads.json` | `.gitmem/suggested-threads.json` |
68+
69+
### What `recall` searches
70+
71+
| Tier | Source | Search Method |
72+
|------|--------|---------------|
73+
| Pro/Dev | Supabase `learnings` | Semantic (embedding cosine similarity) |
74+
| Pro/Dev (cached) | `~/.cache/gitmem/results/` | Local vector search (15-min TTL) |
75+
| Free | `.gitmem/learnings.json` | Keyword tokenization match |
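
To make the free-tier row concrete, here is a minimal Python sketch of keyword token matching over locally stored learnings. It is not GitMem's actual implementation: the `title` and `description` field names, the overlap-count scoring, and both function names are assumptions for illustration. The source only states that the free tier uses a keyword tokenization match against `.gitmem/learnings.json`.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it on non-alphanumeric characters."""
    return {tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok}

def keyword_recall(query: str, learnings: list[dict], limit: int = 3) -> list[dict]:
    """Rank learnings by the number of tokens shared with the query (hypothetical scoring)."""
    query_tokens = tokenize(query)
    scored = []
    for entry in learnings:
        # Field names are assumed; the real learnings.json schema is not shown in this page.
        entry_tokens = tokenize(entry.get("title", "") + " " + entry.get("description", ""))
        overlap = len(query_tokens & entry_tokens)
        if overlap > 0:
            scored.append((overlap, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:limit]]

sample = [
    {"title": "Auth timeout", "description": "Token refresh fails after deploy"},
    {"title": "Build cache", "description": "Stale artifacts break CI"},
]
matches = keyword_recall("deploy auth fix", sample)
```

Unlike the semantic search on Pro/Dev, a match of this kind only finds entries that share literal words with the query, so phrasing matters more on the free tier.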
76+
77+
### What `session_close` persists
78+
79+
| Data | Pro/Dev Destination | Free Destination |
80+
|------|--------------------|--------------------|
81+
| Session record | Supabase `sessions` | `.gitmem/sessions.json` |
82+
| New learnings | Supabase `learnings` | `.gitmem/learnings.json` |
83+
| Decisions | Supabase `decisions` | `.gitmem/decisions.json` |
84+
| Thread state | Supabase `threads` + local | `.gitmem/threads.json` |
85+
| Scar usage | Supabase `scar_usage` | `.gitmem/scar_usage.json` |
86+
| Transcript | Supabase storage bucket | Not captured |
87+
88+
## Container Deployments
89+
90+
### Ephemeral container per session
91+
92+
```
93+
Container A (session 1) -> writes .gitmem/ -> container destroyed
94+
Container B (session 2) -> fresh .gitmem/ -> no history
95+
```
96+
97+
| Tier | Cross-Session Memory | What Breaks |
98+
|------|---------------------|-------------|
99+
| **Pro/Dev** | **Works** — Supabase is SOT | Hooks plugin cold-starts each time. Suggested threads lost. Minor UX friction, no data loss. |
100+
| **Free** | **Completely broken** — all memory is local files | No scars, no threads, no session history. Each session is amnesic. |
101+
102+
### Persistent volume mount
103+
104+
```bash
105+
docker run -v gitmem-data:/app/.gitmem ...
106+
```
107+
108+
Both tiers work. On the free tier, the local files are the source of truth (SOT). On the Pro/Dev tier, the local files are caches and Supabase is the SOT.
109+
110+
### Shared container (long-running)
111+
112+
The container stays alive across multiple `claude` invocations, so both tiers work: `.gitmem/` persists because the container persists.
113+
114+
## Recommendations
115+
116+
### Free tier
117+
118+
Mount a volume for `.gitmem/`:
119+
120+
```yaml
121+
volumes:
122+
- gitmem-state:/workspace/.gitmem
123+
```
124+
125+
Files that MUST persist: `learnings.json`, `threads.json`, `sessions.json`, `decisions.json`.
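
A container entrypoint could verify the mount before starting a session. The following Python sketch is one way to do that. The helper name is hypothetical and the check is not part of GitMem; only the four file names come from the list above. On a brand-new project these files may legitimately be absent until the first `session_close` has run.

```python
from pathlib import Path

# File names taken from the list above; the helper itself is illustrative.
REQUIRED_FREE_TIER_FILES = ("learnings.json", "threads.json", "sessions.json", "decisions.json")

def missing_state_files(gitmem_dir: Path) -> list[str]:
    """Return the required free-tier files that are not present in gitmem_dir."""
    return [name for name in REQUIRED_FREE_TIER_FILES if not (gitmem_dir / name).is_file()]
```

For example, `missing_state_files(Path("/workspace/.gitmem"))` returns an empty list when all four files survived from an earlier session.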
126+
127+
### Pro/Dev tier
128+
129+
**Nothing required.** Supabase is the source of truth. A fresh `.gitmem/` each session works — just slightly slower (cache cold start).
130+
131+
Optional for better UX:
132+
133+
```yaml
134+
volumes:
135+
- gitmem-cache:/workspace/.gitmem/cache # Avoids scar cache re-download
136+
```
137+
138+
<Callout type="info" title="Why local files exist at all on pro tier">
139+
`active-sessions.json` tracks process lifecycle (PIDs, hostnames) — inherently local. `sessions/<id>/session.json` survives context compaction when the LLM loses state. `cache/hook-scars.json` is needed by shell-based hooks that can't call Supabase directly. `closing-payload.json` avoids MCP tool call size limits.
140+
</Callout>
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
11
{
22
"title": "Concepts",
3-
"pages": ["index", "scars", "sessions", "threads", "learning-types", "tiers"]
3+
"pages": ["index", "scars", "sessions", "threads", "learning-types", "tiers", "local-storage"]
44
}
Lines changed: 178 additions & 29 deletions
@@ -1,56 +1,205 @@
11
---
22
title: Threads
3-
description: Track unresolved work across sessions with GitMem threads.
3+
description: Track unresolved work across sessions with lifecycle management, vitality scoring, and semantic deduplication.
44
---
55

6+
import { Callout } from 'fumadocs-ui/components/callout'
7+
68
# Threads
79

8-
**Threads** track unresolved work that carries across sessions. When you can't finish something in the current session, create a thread so the next session picks it up.
10+
**Threads** are persistent work items that carry across sessions. They track what's unresolved, what's blocked, and what needs follow-up — surviving session boundaries so nothing gets lost.
11+
12+
## Why Threads Exist
13+
14+
Sessions end, but work doesn't. Before threads, open items lived as plain strings inside session records. They had no IDs, no lifecycle, no way to mark something as done. You'd see the same stale item surfaced session after session with no way to clear it.
15+
16+
Threads give open items identity (`t-XXXXXXXX`), lifecycle status, vitality scoring, and a resolution trail.
917

1018
## Creating Threads
1119

1220
```
1321
create_thread({ text: "Auth middleware needs rate limiting before production deploy" })
1422
```
1523

16-
Threads include:
17-
- A unique thread ID (e.g., `t-a1b2c3d4`)
18-
- Description text
19-
- Creation timestamp
20-
- Optional Linear issue link
24+
Threads are created in three ways:
2125

22-
## Semantic Deduplication
23-
24-
GitMem uses cosine similarity (threshold > 0.85) to prevent duplicate threads. If you try to create a thread that's semantically identical to an existing one, GitMem returns the existing thread instead.
26+
1. **Explicitly** via `create_thread` — mid-session when you identify a new open item
27+
2. **Implicitly** via `session_close` — when the closing payload includes `open_threads`
28+
3. **Promoted** from a suggestion via `promote_suggestion` — when a recurring topic is confirmed
2529

2630
## Thread Lifecycle
2731

32+
Threads progress through a 5-stage state machine based on vitality scoring and age:
33+
2834
```
29-
create → surface at session_start → resolve
35+
create_thread / session_close payload
36+
|
37+
v
38+
[ EMERGING ] -- first 24 hours, high visibility
39+
|
40+
v (age >= 24h)
41+
[ ACTIVE ] -- vitality > 0.5, actively referenced
42+
|
43+
v (vitality decays)
44+
[ COOLING ] -- 0.2 <= vitality <= 0.5, fading from use
45+
|
46+
v (vitality < 0.2)
47+
[ DORMANT ] -- vitality < 0.2, no recent touches
48+
|
49+
v (dormant 30+ days)
50+
[ ARCHIVED ] -- auto-archived, hidden from session_start
51+
52+
Any state --(explicit resolve_thread)--> [ RESOLVED ]
3053
```
3154

32-
1. **Create** — `create_thread` during a session
33-
2. **Surface** — Open threads appear in the next `session_start` banner
34-
3. **Resolve** — `resolve_thread` with a resolution note when complete
55+
### Transitions
3556

36-
## Managing Threads
57+
| Transition | Condition |
58+
|-----------|-----------|
59+
| any -> emerging | Thread age < 24 hours |
60+
| emerging -> active | Thread age >= 24 hours, vitality > 0.5 |
61+
| active -> cooling | Vitality drops to [0.2, 0.5] |
62+
| cooling -> active | Touch refreshes vitality above 0.5 |
63+
| cooling -> dormant | Vitality drops below 0.2 |
64+
| dormant -> active | Touch refreshes vitality above 0.5 |
65+
| dormant -> archived | Dormant for 30+ consecutive days |
66+
| any -> resolved | Explicit `resolve_thread` call |
3767

38-
| Tool | Purpose |
39-
|------|---------|
40-
| `list_threads` | See all open threads |
41-
| `resolve_thread` | Mark a thread as done |
42-
| `cleanup_threads` | Triage by health (active/cooling/dormant) |
68+
**Terminal states:** Archived and resolved threads do not transition. To reopen an archived topic, create a new thread.
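
The transition table can be read as a function from a thread's current measurements to its status. The Python sketch below is one possible reading, not GitMem's code: the function name, its parameters, and the order of the checks are assumptions. It is also stateless, so it does not model the rule that archived threads stay archived. A real implementation would have to remember that.

```python
def thread_status(age_hours: float, vitality: float,
                  dormant_days: int = 0, resolved: bool = False) -> str:
    """Derive a lifecycle status from age, vitality, and time spent dormant."""
    if resolved:
        return "resolved"      # explicit resolve_thread wins over everything else
    if age_hours < 24:
        return "emerging"      # first 24 hours
    if vitality > 0.5:
        return "active"
    if vitality >= 0.2:
        return "cooling"       # 0.2 <= vitality <= 0.5
    if dormant_days >= 30:
        return "archived"      # dormant for 30+ consecutive days
    return "dormant"           # vitality < 0.2
```

Note that the boundary values 0.2 and 0.5 both fall into cooling, matching the closed interval `[0.2, 0.5]` in the table.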
4369

44-
### Thread Health
70+
## Vitality Scoring
4571

46-
`cleanup_threads` categorizes threads by vitality:
72+
Every thread has a vitality score (0.0 to 1.0) computed from two components:
4773

48-
- **Active** — Recently created or referenced
49-
- **Cooling** — Not referenced in a while
50-
- **Dormant** — Untouched for 30+ days (auto-archivable)
74+
```
75+
vitality = 0.55 * recency + 0.45 * frequency
76+
```
5177

52-
### Suggested Threads
78+
### Recency
5379

54-
`session_start` may suggest threads based on session context. You can:
55-
- **Promote** — `promote_suggestion` converts it to a real thread
56-
- **Dismiss** — `dismiss_suggestion` suppresses it (3 dismissals = permanent suppression)
80+
Exponential decay based on thread class half-life:
81+
82+
```
83+
recency = e^(-ln(2) * days_since_touch / half_life)
84+
```
85+
86+
| Thread Class | Half-Life | Use Case |
87+
|-------------|-----------|----------|
88+
| operational | 3 days | Deploys, fixes, incidents, blockers |
89+
| backlog | 21 days | Research, long-running improvements |
90+
91+
Thread class is auto-detected from keywords in the thread text: "deploy", "fix", "debug", "hotfix", "urgent", "broken", "incident", or "blocker" marks a thread as operational.
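
As a rough illustration of keyword-based class detection, the Python sketch below maps thread text to a class and its half-life. The keyword list and half-lives come from this page. Whole-word matching, the default to `backlog` when no keyword is found, and the function name are assumptions, since the source does not say how the matching is done.

```python
import re

OPERATIONAL_KEYWORDS = {"deploy", "fix", "debug", "hotfix", "urgent", "broken", "incident", "blocker"}
HALF_LIFE_DAYS = {"operational": 3, "backlog": 21}

def detect_thread_class(text: str) -> str:
    """Classify a thread as operational if any keyword appears as a whole word."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Assumption: anything without an operational keyword is treated as backlog.
    return "operational" if words & OPERATIONAL_KEYWORDS else "backlog"
```

With whole-word matching, "Fix auth timeout" is operational, while a word such as "prefix" does not trigger the `fix` keyword. Inflected forms such as "fixes" would also be missed under this assumption.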
92+
93+
### Frequency
94+
95+
Log-scaled touch count normalized against thread age:
96+
97+
```
98+
frequency = min(log(touch_count + 1) / log(days_alive + 1), 1.0)
99+
```
100+
101+
### Status Thresholds
102+
103+
| Vitality Score | Status |
104+
|---------------|--------|
105+
| > 0.5 | active |
106+
| 0.2 - 0.5 | cooling |
107+
| < 0.2 | dormant |
108+
109+
Threads touched during a session have their `touch_count` incremented and `last_touched_at` refreshed, which revives decayed vitality.
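
The effect of a touch can be seen by evaluating the formulas before and after one. The Python sketch below follows the formulas on this page but is not taken from the GitMem source: the function signature is hypothetical, and the handling of a thread with `days_alive` of 0 (where `log(days_alive + 1)` is 0) is an assumption.

```python
import math

def vitality(days_since_touch: float, touch_count: int,
             days_alive: float, half_life_days: float) -> float:
    """Combine recency and frequency with the documented 0.55 / 0.45 weights."""
    recency = math.exp(-math.log(2) * days_since_touch / half_life_days)
    if days_alive <= 0:
        # Assumption: avoid dividing by log(1) = 0 on the day a thread is created.
        frequency = 1.0 if touch_count > 0 else 0.0
    else:
        frequency = min(math.log(touch_count + 1) / math.log(days_alive + 1), 1.0)
    return 0.55 * recency + 0.45 * frequency

# An operational thread (3-day half-life), 8 days old, touched twice, last touch 6 days ago.
before = vitality(days_since_touch=6, touch_count=2, days_alive=8, half_life_days=3)
# A touch increments touch_count and resets days_since_touch to 0.
after = vitality(days_since_touch=0, touch_count=3, days_alive=8, half_life_days=3)
```

Under these assumptions `before` is about 0.36 (cooling) and `after` is about 0.83 (active). Because frequency is a ratio of two logarithms, the result does not depend on the logarithm base.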
110+
111+
## Carry-Forward
112+
113+
On `session_start`, open threads appear with vitality info:
114+
115+
```
116+
Open threads (3):
117+
t-abc12345: Fix auth timeout [ACTIVE 0.82] (operational, 2d ago)
118+
t-def67890: Improve test coverage [COOLING 0.35] (backlog, 12d ago)
119+
t-ghi11111: New thread just created [EMERGING 0.95] (backlog, today)
120+
```
121+
122+
## Resolution
123+
124+
Threads are resolved via `resolve_thread`:
125+
- **By ID** (preferred): `resolve_thread({ thread_id: "t-a1b2c3d4" })`
126+
- **By text match** (fallback): `resolve_thread({ text_match: "package name" })`
127+
128+
Resolution records a timestamp, the resolving session, and an optional note. Knowledge graph triples are written to track the resolution relationship.
129+
130+
## Semantic Deduplication
131+
132+
When `create_thread` is called, the new thread text is compared against all open threads using embedding cosine similarity before creation.
133+
134+
| Threshold | Value | Meaning |
135+
|-----------|-------|---------|
136+
| Dedup similarity | 0.85 | Above this = duplicate |
137+
138+
**Dedup methods** (in priority order):
139+
1. **Embedding-based** — cosine similarity of text embeddings (when Supabase available)
140+
2. **Text normalization fallback** — exact match after lowercasing, stripping punctuation, collapsing whitespace
141+
142+
When a duplicate is detected, the existing thread is returned (with `deduplicated: true`) and touched to keep it vital.
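
The two dedup methods can be sketched in Python as follows. This is an illustration under stated assumptions, not the GitMem implementation: the `text` and `embedding` field names, the `find_duplicate` helper, and returning the first match rather than the best one are all hypothetical. The 0.85 threshold and the normalization steps come from this page.

```python
import math
import re

DEDUP_THRESHOLD = 0.85  # similarity above this counts as a duplicate

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def find_duplicate(new_text: str, new_embedding, open_threads: list[dict]):
    """Return the first open thread that duplicates new_text, or None."""
    for thread in open_threads:
        existing = thread.get("embedding")  # assumed field name
        if new_embedding is not None and existing is not None:
            # Preferred method: embedding cosine similarity.
            if cosine_similarity(new_embedding, existing) > DEDUP_THRESHOLD:
                return thread
        elif normalize(thread["text"]) == normalize(new_text):
            # Fallback when no embeddings are available: exact normalized match.
            return thread
    return None
```

The fallback only catches near-verbatim repeats, so two differently worded threads about the same topic can coexist when embeddings are unavailable.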
143+
144+
## Suggested Threads
145+
146+
At `session_close`, session embeddings are compared to detect recurring topics that should become threads.
147+
148+
### Detection Algorithm
149+
150+
1. Compare current session embedding against the last 20 sessions (30-day window)
151+
2. Find sessions with cosine similarity >= 0.70
152+
3. If 3+ sessions cluster (current + 2 historical):
153+
- Check if an open thread already covers the topic (>= 0.80) -> skip
154+
- Check if a pending suggestion already matches (>= 0.80) -> add evidence
155+
- Otherwise, create a new suggestion
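
The steps above can be sketched as a single decision function. This Python version is a simplified reading, not GitMem's code. It assumes that every input is a plain embedding vector, that `recent_sessions` is already limited to the 30-day window and ordered newest first, and that thread and suggestion coverage is tested against the current session's embedding. The function name and return labels are hypothetical.

```python
import math

SESSION_SIMILARITY = 0.70   # two sessions count as the same topic
COVERAGE_SIMILARITY = 0.80  # a thread or suggestion already covers the topic
MIN_CLUSTER_SIZE = 3        # current session + 2 historical

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def detect_suggestion(current, recent_sessions, open_threads, pending_suggestions) -> str:
    """Return 'none', 'skip', 'add_evidence', or 'create'."""
    similar = [s for s in recent_sessions[:20] if cosine(current, s) >= SESSION_SIMILARITY]
    if len(similar) + 1 < MIN_CLUSTER_SIZE:
        return "none"          # topic has not recurred often enough
    if any(cosine(current, t) >= COVERAGE_SIMILARITY for t in open_threads):
        return "skip"          # an open thread already tracks it
    if any(cosine(current, p) >= COVERAGE_SIMILARITY for p in pending_suggestions):
        return "add_evidence"  # strengthen the existing suggestion
    return "create"
```

The coverage threshold (0.80) is stricter than the clustering threshold (0.70), so a loosely related open thread does not block a new suggestion.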
156+
157+
Suggestions appear at `session_start`:
158+
159+
```
160+
Suggested threads (2) -- recurring topics not yet tracked:
161+
ts-a1b2c3d4: Recurring auth timeout pattern (3 sessions)
162+
ts-e5f6g7h8: Build performance regression (4 sessions)
163+
Use promote_suggestion or dismiss_suggestion to manage.
164+
```
165+
166+
| Action | Tool | Effect |
167+
|--------|------|--------|
168+
| Promote | `promote_suggestion` | Converts to a real thread |
169+
| Dismiss | `dismiss_suggestion` | Suppresses (3x = permanent) |
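
The "3x = permanent" rule amounts to a counter on each suggestion. A minimal Python sketch, assuming hypothetical `dismiss_count` and `suppressed_permanently` fields (the actual layout of `suggested-threads.json` is not documented on this page):

```python
MAX_DISMISSALS = 3  # from the table above: three dismissals suppress permanently

def dismiss(suggestion: dict) -> dict:
    """Return a copy of the suggestion with one more dismissal recorded."""
    updated = dict(suggestion)
    updated["dismiss_count"] = updated.get("dismiss_count", 0) + 1
    updated["suppressed_permanently"] = updated["dismiss_count"] >= MAX_DISMISSALS
    return updated
```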
170+
171+
## Knowledge Graph Integration
172+
173+
Thread creation and resolution generate knowledge graph triples:
174+
175+
| Predicate | Subject | Object | When |
176+
|-----------|---------|--------|------|
177+
| `created_thread` | Session | Thread | Thread created |
178+
| `resolves_thread` | Session | Thread | Thread resolved |
179+
| `relates_to_thread` | Thread | Issue | Thread linked to Linear issue |
180+
181+
Use `graph_traverse` to query these relationships with 4 lenses: `connected_to`, `produced_by`, `provenance`, `stats`.
182+
183+
## Managing Threads
184+
185+
| Tool | Purpose |
186+
|------|---------|
187+
| [`create_thread`](/docs/tools/create-thread) | Create a new open thread |
188+
| [`resolve_thread`](/docs/tools/resolve-thread) | Mark a thread as done |
189+
| [`list_threads`](/docs/tools/list-threads) | See all open threads |
190+
| [`cleanup_threads`](/docs/tools/cleanup-threads) | Triage by health (active/cooling/dormant) |
191+
| [`promote_suggestion`](/docs/tools/promote-suggestion) | Convert suggestion to real thread |
192+
| [`dismiss_suggestion`](/docs/tools/dismiss-suggestion) | Suppress a suggestion |
193+
194+
## Storage
195+
196+
| Location | Purpose | Tier |
197+
|----------|---------|------|
198+
| `.gitmem/threads.json` | Runtime cache / free tier SOT | All |
199+
| `.gitmem/suggested-threads.json` | Pending suggestions | All |
200+
| Supabase `threads` table | Source of truth (full vitality, lifecycle, embeddings) | Pro/Dev |
201+
| Supabase `sessions.open_threads` | Legacy fallback | Pro/Dev |
202+
203+
<Callout type="info" title="Free vs Pro">
204+
On free tier, `.gitmem/threads.json` IS the source of truth. On pro/dev tier, it's a cache — Supabase is authoritative.
205+
</Callout>
