Skip to content

renchris/claude-session-search

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

51 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

claude-session-search

Find any past Claude Code session in milliseconds — full-text search across every project, ranked like a search engine.

CI License: MIT Search latency

Why it exists · Install · Search · Resume · Ranking · Architecture

$ claude-search "rum latency"

  ╭──────────────────────────────────────────────────────────────────────╮
  │ Session Search                                       3 results · 2ms │
  ╰──────────────────────────────────────────────────────────────────────╯
   1  RUM Analysis: Friday Event Latency Report
       5mo ·   10 msgs · audit, docs, events, monitoring, +1    c1d32436
  ──────────────────────────────────────────────────────────────────────
   2  CloudWatch dashboards + p95 alarm thresholds
       2mo ·    8 msgs · docs, events, monitoring               e3d88717
  ──────────────────────────────────────────────────────────────────────
   3  Fly.io RUM latency testing, cold start bias, CLI fixes,
      cross-provider region discovery
       5mo ·   20 msgs · bugfix, deployment, monitoring, +2     08ada21a
  ╭──────────────────────────────────────────────────────────────────────╮
  │ ●  claude-search --resume N  ·  --fzf for interactive                │
  ╰──────────────────────────────────────────────────────────────────────╯

Notice rum matched sessions about monitoring and CloudWatch — the query expanded through a synonym table before it ever hit the index. See how

Everything stays on your machine. Your sessions and the search index never leave it — the only network call is the optional Haiku tagging and query-expansion, and only when you set your own ANTHROPIC_API_KEY.


Why this exists

Claude Code records every session with rich metadata — AI-written summaries, message counts, git branches, timestamps. None of it is searchable. When you need "that migration session from last week," your only tool is scrolling.

The problem today The cost
/resume shows a flat, unsearchable list Finding one session means scrolling past hundreds
Sessions are siloed per project directory No cross-project discovery — 220 projects, 220 manual checks
Summaries, branches, and tags live in unindexed JSON No keyword search, no date filtering, no fuzzy matching

This tool indexes every session into a single SQLite FTS5 database and queries it through a ranked search pipeline — so retrieval takes ~2 ms instead of minutes of scrolling.

What you can do

You want to… Run
Find sessions by keyword claude-search "bottle menu"
Search by time, in plain English claude-search "monitoring last week"
Scope to one project claude-search "@reso sync"
Filter by topic tag claude-search "#auth login"
Browse + resume interactively claude-search --fzf "migration"
Resume a result by number claude-search --resume 2
Understand why a result ranked claude-search "rum latency" --explain
Search inside the conversation claude-search "cold start" --deep-search
Check index health claude-search --stats
Tune your own search quality claude-search --analytics

Install in one command

git clone https://github.com/renchris/claude-session-search.git
cd claude-session-search
./install.sh
What the installer does (idempotent — safe to re-run)
  1. Symlinks hooks and bin/ into ~/.claude/ (edits in the repo go live immediately)
  2. Registers SessionEnd + SessionStart hooks in settings.json (backs it up first)
  3. Backfills every existing session via a 3-phase scan
  4. Tags sessions with regex and builds the FTS5 index
  5. Adds ~/.claude/bin to your PATH
  6. Schedules two background jobs (macOS launchd): a weekly backfill and a 60-second sweep
  7. Pre-compiles the Python engine for faster cold starts

Requires sqlite3 (with FTS5 — the macOS default), python3 ≥ 3.8, and jq. Optional (each unlocks a feature, none are required):

Package Unlocks
fzf Interactive picker (--fzf)
rapidfuzz Typo-tolerant fuzzy correction
anthropic Haiku tagging + LLM query expansion
yake Multi-word key-phrase extraction

Update: git pull && ./install.sh · Uninstall: ./uninstall.sh (preserves your database)


Search by keyword, date, @project, or #tag

claude-search "bottle menu"                 # Keyword search
claude-search "monitoring last week"        # Temporal: "yesterday", "march 1", "last 3 days"
claude-search "@reso sync"                  # Inline project filter
claude-search "#auth #bugfix login"         # Inline tag filters (combinable)
claude-search --after 2026-03-01 "rum"      # Explicit date range
claude-search --project reso "sync"         # Flag-style project filter
claude-search --deep-search "cold start"    # Also search inside conversation turns
claude-search --json "floor plan"           # JSON output for scripting

Natural-language dates work out of the box. yesterday, today, this week, last week, last 3 days, last 2 months, march 1, and ISO dates (2026-03-01) are extracted from the query and turned into a date filter automatically.

Abbreviations expand before the search runs. rummonitoring, cloudwatch, metrics, observability; dbdatabase, turso, sqlite; wswebsocket, soketi, pusher. The dictionary ships with 66 terms across 289 expansions and is editable in synonyms/default.json.


Resume the right session with one keystroke

claude-search --fzf "migration"

The claude-search fzf picker: typing a query narrows the ranked results live while the right-hand preview pane updates per selection, ending with Enter to resume the chosen session.

Pipes ranked results into an fzf picker with a live preview pane that re-searches as you type:

Key Action
Enter Resume the session (claude --resume <id>)
Ctrl-Y Copy the session ID to the clipboard
Ctrl-O Open the project directory
type Live re-search as you type

Prefer the keyboard from a plain search? claude-search --resume 2 resumes the 2nd result of your last search — no need to retype the query.


See exactly why every result ranked where it did

Search is a black box in most tools. Here it isn't — --explain shows the matched columns, the strategy that fired, the BM25 score, and every synonym the query expanded into:

$ claude-search "rum latency" --explain

   1  RUM Analysis: Friday Event Latency Report
       5mo ·   10 msgs · audit, docs, events, monitoring, +1    c1d32436
      ┊ 'rum' → summary, first_prompt, keywords
      ┊ 'latency' → summary, first_prompt, keywords
      ┊ 'monitoring' → first_prompt, tags, keywords
      ┊ Strategy: NEAR proximity · BM25: 14.77
      ┊ Expanded: cloudwatch, metrics, monitoring, observability, real user monitoring
The ranking model — BM25 relevance, recency, and conversation depth

Every result gets an additive score so no single signal can dominate:

score = 0.60 · BM25      (text relevance, normalized across the result set)
      + 0.15 · recency   (Gaussian decay, 30-day half-life — a gentle tiebreaker)
      + 0.10 · depth     (message-count sweet spot: 10–100 msgs rank highest)
      + 0.15 · BM25 × recency   (interaction: reward results that are both)

BM25 itself is column-weighted across 11 indexed fields — a hit in the AI-written summary counts far more than a hit in a shell command:

Column Weight Why
summary 12.0 Claude-generated description — the strongest signal
search_aliases 8.0 LLM-generated alternate phrasings of the session
tags 5.0 Semantic categories (auth, monitoring, …)
files_changed 3.5 File paths touched during the session
first_prompt 3.0 Your opening message
keywords 3.0 Auto-extracted technical terms
context_text 2.5 Your first few messages from the transcript
assistant_text 1.5 Claude's responses
commands_run 1.0 Shell commands executed
project_name 0.5 Broad project match
The query pipeline — 5 fallback strategies, widening only when results are sparse
flowchart TD
    A["query: slide-out panel last week"] --> B["normalize · strip @project #tag"]
    B --> C["temporal extraction → date range"]
    C --> D["tokenize · stopwords · verb + synonym expansion"]
    D --> E["fuzzy typo correction"]
    E --> F{"NEAR · within 5 words"}
    F -->|sparse| G{"AND · all terms"}
    G -->|sparse| H{"OR · any term"}
    H -->|sparse| I{"core terms only"}
    I -->|sparse| J{"prefix match"}
    F -->|enough hits| K["BM25 · 11 weighted columns"]
    G -->|enough hits| K
    H -->|enough hits| K
    I -->|enough hits| K
    J --> K
    K --> L["additive score:<br/>relevance + recency + depth"]
    L --> M["ranked results"]
Loading

The pipeline tries the most precise strategy first and only widens when a strategy returns fewer than 3 hits — so exact phrase matches win, but a fuzzy or partial query still finds something. Two safety nets sit underneath:

  • LLM fallback — if fewer than 3 results come back, Haiku rewrites the query into 4 interpretations, each searched and merged via Reciprocal Rank Fusion (needs ANTHROPIC_API_KEY).
  • Deep search (--deep-search) — also queries a turn-level chunk index (5-turn sliding windows), so a phrase buried mid-conversation is findable even when the summary never mentions it.

The index maintains itself

You never run an indexer by hand. Hooks capture sessions as they happen; a sweep daemon catches anything the hooks miss; a backfill seeds history and self-heals weekly.

flowchart LR
    subgraph feed["Indexing · fully automatic"]
        direction TB
        H1["SessionEnd hook<br/>~200ms"]
        H2["SessionStart hook"]
        H3["Sweep daemon<br/>every 60s"]
        BF["Backfill<br/>3-phase scan"]
    end
    DB[("SQLite FTS5<br/>session-index.db")]
    Q["claude-search 'query'"]
    R["Ranked results"]
    H1 --> DB
    H2 --> DB
    H3 --> DB
    BF --> DB
    Q --> DB --> R
Loading
Where the data comes from — six sources, highest quality wins

When two sources describe the same session, the richer one overwrites the thinner one:

Source Provides Role
sessions-index.json AI summaries, message counts, branches, timestamps Richest — wins on conflict
SessionEnd hook The just-finished session + enriched transcript data Live capture (< 200 ms)
SessionStart hook The session you're opening now Live capture
60-second sweep New/changed transcripts the hooks missed Self-healing (< 500 ms idle)
history.jsonl User prompts, project paths Gap fill
Legacy entries Pre-sessionId history grouped by project + time gaps Synthetic IDs

Tag sessions for topic-based recall

Tags like database, monitoring, bottle-service, slide-out are indexed and weighted, so a search for a topic surfaces sessions whose summaries never used your exact words.

# Free, instant — 42 regex patterns (24 technology · 10 task-type · 8 domain)
./scripts/session-index-tag.sh --regex-only

# Higher accuracy via Claude Haiku — ~95% accurate, ~$0.09 for 650 sessions
ANTHROPIC_API_KEY=sk-... ./scripts/session-index-tag.sh

Measure and tune your own search

Every search is logged locally, so you can see what's slow and what's failing:

$ claude-search --analytics

  ╭────────────────────────────────────────────────────────╮
  │ Search Analytics                          531 searches │
  ├────────────────────────────────────────────────────────┤
  │ Avg latency                                      4.9ms │
  │ Zero-result                             18.5% (98/531) │
  ╰────────────────────────────────────────────────────────╯

  Top Queries
  ────────────────────────────────────────────────────────
    37×  bottle service menu                 avg 10 results
    14×  slide-out panel                     avg  6 results
     8×  floor plan                          avg  9 results

  Failed Queries  (tuning targets)
  ────────────────────────────────────────────────────────
     3×  xyznonexistent

claude-search --stats shows the complementary view — total sessions, tagged count, and a breakdown by source. --analytics --fails lists only zero-result queries, the exact set worth adding synonyms for.


Every layer degrades gracefully

No optional dependency is load-bearing. Remove any one and search still works — just with one fewer enhancement.

If this is unavailable Fallback What you lose
Haiku API key Regex-only tagging Slightly less precise tags
rapidfuzz Skip fuzzy correction Typo tolerance (synonyms still cover most)
anthropic SDK Raw HTTP, or skip LLM expansion Nothing, or the LLM query fallback
sessions-index.json Rebuild from transcripts + history.jsonl Some per-session metadata
fzf Plain ranked text output The interactive picker
yake Simpler keyword extraction Multi-word key phrases

Customize

Add domain synonyms for better recall — edit synonyms/default.json:

{ "term": "k8s", "expansions": ["kubernetes", "cluster", "pod", "deployment"], "category": "infra" }

Then reload the table with ./scripts/session-index-backfill.sh.

Command reference
Flag Effect
--fzf Interactive picker with live re-search
--resume N Resume result #N (with a query, or from the last search)
--explain Show match attribution, strategy, BM25, and expansions
--deep-search Also search within conversation turns
--after / --before YYYY-MM-DD Explicit date range
--project NAME Filter by project
--min-msgs N Minimum message count
--limit N Max results (default 10)
--json JSON output for scripting
--stats Index statistics
--analytics / --analytics --fails Search-log analytics
--preview ID Preview a single session
File layout — the repo symlinks into ~/.claude/

All installed files are symlinks, so a git pull updates your live install instantly.

claude-session-search/                      ~/.claude/
├── hooks/
│   ├── session-index-end.sh        ──────→ hooks/session-index-end.sh     (SessionEnd)
│   ├── session-index-start.sh      ──────→ hooks/session-index-start.sh   (SessionStart)
│   ├── session-index-sweep.sh      ──────→ hooks/session-index-sweep.sh   (60s daemon)
│   └── lib/session-index-helpers.sh ─────→ hooks/lib/…
├── bin/
│   ├── claude-search               ──────→ bin/claude-search             (the CLI)
│   └── session-search.py           ──────→ bin/session-search.py         (the engine)
├── scripts/
│   ├── session-index-backfill.sh           (3-phase history scan)
│   ├── session-index-tag.py                (regex + Haiku tagging)
│   └── session-index-chunk.py              (turn-based chunks for --deep-search)
├── synonyms/default.json                   (66 terms · 289 expansions)
├── config/*.plist                          (launchd: weekly backfill + 60s sweep)
├── install.sh  ·  uninstall.sh
                                            └── session-index.db          (SQLite FTS5)

License

MIT

About

Instant full-text search across all Claude Code sessions — <5ms ranked search with synonym expansion, temporal queries, and fzf interactive mode

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors