- The Story
- Features
- Who is this for?
- How it works
- Requirements
- Quick Start
- Enable Auto-Retrieval
- After Installation
- For Existing Claude Users
- Token Savings
- Directory Structure
- Commands Reference
- Best Practices
- FAQ
- Troubleshooting
- Security
- Roadmap
- Contributing
- Credits
After using Continuous-Claude (created by parcadei), we noticed something: our CLAUDE.md files kept growing. Every time we documented something new, added a guide, or saved a configuration, the file got bigger.
The problem? Claude loads your entire CLAUDE.md on every single prompt. That 30KB file? Loaded 20+ times per session. That is well over 100,000 tokens per session spent on content Claude didn't need.
Why does this matter? Whether you're on Claude Pro ($20/month) or Max (up to $200/month), your usage is limited. Wasting thousands of tokens per prompt on irrelevant context means fewer tokens for actual thinking, coding, and building.
The solution: What if Claude could pull in just the context it needs? You ask about your database, Claude grabs the database block. You ask about deployment, Claude grabs the deployment block. Everything else stays on the shelf.
That's BloxCue - intelligent context blocks that get loaded when you need them.
BloxCue includes a purpose-built search engine optimized for context retrieval:
| Feature | Description |
|---|---|
| Porter Stemmer | Matches word variations (running → run, deployment → deploy) |
| IDF Weighting | Rare terms rank higher than common ones for better precision |
| Phrase Matching | Recognizes multi-word queries like "error handling" as phrases |
| Query Intent Detection | Adjusts results based on query type (how-to, troubleshooting, concepts) |
| Fuzzy Matching | Finds relevant blocks even with typos or partial matches |
| Memoized Stemming | LRU cache on stemmer for 50-70% faster repeated searches |
| Index Caching | In-memory cache with mtime checking eliminates repeated disk reads |
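To make the IDF weighting row concrete, here is a minimal sketch of how rare terms can end up with higher weights than common ones. This is not BloxCue's indexer code: the `idf_scores` function, the sample blocks, and the smoothed formula (one common variant) are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def idf_scores(docs):
    """Return a smoothed inverse document frequency for every term.

    docs is a list of token lists. Terms that appear in fewer documents
    get a higher score, so they count for more when ranking.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }

# Hypothetical blocks, already tokenized and stemmed
blocks = [
    ["deploy", "product", "server"],
    ["deploy", "stage", "server"],
    ["databas", "server", "backup"],
]
scores = idf_scores(blocks)

# "product" appears in one block, "deploy" in two, "server" in all three
for term in ("product", "deploy", "server"):
    print(term, round(scores[term], 3))
```

With these sample blocks, a query containing "production" would pull the first block ahead of the second, even though both mention deployment.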
- Hooks into Claude Code's `UserPromptSubmit` event
- Analyzes your prompt in real-time
- Injects only the most relevant blocks as context
- Zero manual intervention required
- Reduces context loading by ~88%
- Saves ~7,500 tokens per prompt on average
- More tokens available for Claude's reasoning
- Pure Python standard library - no pip installs required
- No external services or API calls
- Works offline, works anywhere Python 3 runs
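The hook flow described above can be sketched in a few lines of standard-library Python. This is an illustration only, not the shipped `memory-retrieve.sh`: it assumes the hook payload arrives as JSON on stdin with a `prompt` field and that whatever the hook prints to stdout is added to Claude's context, so check the Claude Code hooks documentation for your version. The `extract_prompt` and `run_hook` names are hypothetical.

```python
import json
import sys

def extract_prompt(raw):
    """Pull the prompt text out of the hook's JSON payload.

    Returns an empty string for malformed or unexpected input so the
    hook never crashes the session.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return ""
    if not isinstance(payload, dict):
        return ""
    prompt = payload.get("prompt", "")
    return prompt if isinstance(prompt, str) else ""

def run_hook(stdin=sys.stdin, stdout=sys.stdout):
    """Read the payload and write a placeholder where retrieved blocks would go."""
    prompt = extract_prompt(stdin.read())
    if prompt:
        # A real hook would search the index here and write the matching
        # blocks instead of this placeholder line.
        stdout.write(f"[relevant blocks for: {prompt}]\n")
```

Calling `run_hook()` at the bottom of a script would turn this into a runnable hook; passing the streams as parameters keeps the function easy to test.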
| If you're... | BloxCue helps you... |
|---|---|
| A Claude Code user | Stop burning tokens on unused context |
| Managing multiple configs | Keep docs, guides, and configs organized and searchable |
| Working on several projects | Switch context without reloading everything |
| Hitting token limits | Save ~7,500 tokens per prompt |
| New to Claude Code | Start with good habits from day one |
Before BloxCue:
You: "How do I deploy to production?"
Claude loads: ENTIRE CLAUDE.md (34KB = ~8,500 tokens)
- Your coding standards (not needed)
- Your API documentation (not needed)
- Your 10 different project configs (not needed)
- Your deployment guide (NEEDED!)
- Everything else (not needed)
Result: ~8,500 tokens loaded, only ~800 were relevant
After BloxCue:
You: "How do I deploy to production?"
BloxCue: Detects "deploy" + "production" keywords
→ Finds deployment block via Porter stemmer
→ IDF weights "production" higher (specific term)
→ Injects only the deployment block
Claude loads: Just the deployment block (~800 tokens)
Result: ~800 tokens loaded, all relevant
Saved: ~7,700 tokens for thinking & coding
BloxCue works best alongside Continuous-Claude. They're complementary tools:
| Tool | Purpose |
|---|---|
| Continuous-Claude | Session memory (ledgers, handoffs, learnings) |
| BloxCue | Knowledge retrieval (on-demand context loading) |
Think of it this way:
- Continuous-Claude = Claude's memory (what to remember)
- BloxCue = Claude's filing cabinet (where to find it efficiently)
If you prefer manual setup, set up Continuous-Claude v3 first.
Credit: Continuous-Claude was created by parcadei. Check out Continuous-Claude v3.
Copy and paste this to Claude:
Set up BloxCue for intelligent context management.
1. Clone https://github.com/bokiko/bloxcue to a temp location
2. Run ./install.sh and guide me through the options:
- Scope: Global, Project, or Both
- Directory structure preference
3. Set up the auto-retrieval hook in ~/.claude/settings.json
4. Create a sample block to test it works
5. Clean up the cloned repo after install
If I don't have Continuous-Claude yet, set that up first from:
https://github.com/parcadei/Continuous-Claude-v3
Claude will handle the technical details while asking for your preferences.
Set up Continuous-Claude v3 first (https://github.com/parcadei/Continuous-Claude-v3), then install BloxCue:
git clone https://github.com/bokiko/bloxcue.git
cd bloxcue
./install.sh

The installer will ask you:

Where to install?
- Global (`~/.claude-memory`) - knowledge used across all projects
- Project (`./claude-memory`) - project-specific docs only
- Both - recommended for most users
How to organize?
- By subject - guides, references, projects (general use)
- By project - project-a, project-b (freelancers/agencies)
- Developer - apis, databases, deployment, frontend, backend
- DevOps - servers, networking, monitoring, security
- Minimal - just docs and notes
- Custom - you specify
nano ~/.claude-memory/guides/deployment.md

---
title: Production Deployment
category: guides
tags: [deployment, production, devops]
---
# Production Deployment
## Prerequisites
- SSH access to production server
- Environment variables configured
## Deploy Steps
1. Run tests locally
2. Push to main branch
3. SSH into server
4. Pull latest changes
5. Run migrations
6. Restart services
## Rollback
1. Revert to previous commit
2. Run down migrations
3. Restart services

Index your blocks:
python3 ~/.claude-memory/scripts/indexer.py

Test the search:
python3 ~/.claude-memory/scripts/indexer.py --search "deployment"

Enabling auto-retrieval is required for BloxCue to work automatically.
nano ~/.claude/settings.json

Add to your hooks section:
{
  "hooks": {
    "UserPromptSubmit": [{
      "hooks": [{
        "type": "command",
        "command": "~/.claude/hooks/memory-retrieve.sh"
      }]
    }]
  }
}

Close and reopen Claude Code for changes to take effect.
You: "How do I deploy to production?"
Claude will automatically receive your deployment block as context.
Important: BloxCue is installed, but you're still wasting tokens until you slim your CLAUDE.md!
Ask Claude to migrate your content:
My CLAUDE.md has grown too big. Help me migrate content to BloxCue blocks:
1. Read my current CLAUDE.md
2. Identify distinct topics (deployment, APIs, configs, etc.)
3. Create separate block files in ~/.claude-memory/
4. Slim my CLAUDE.md to essentials only
5. Re-index with: python3 ~/.claude-memory/scripts/indexer.py
Your CLAUDE.md should end up like this:
# My Workspace
Knowledge base at `~/.claude-memory/`.
Claude retrieves relevant context automatically via hooks.
## Essentials
- Project: MyApp
- Stack: Node.js, PostgreSQL, Redis

Already have a big CLAUDE.md file?
I have an existing CLAUDE.md file that's gotten too big.
Help me migrate it to BloxCue by:
1. Reading my current CLAUDE.md
2. Identifying distinct topics
3. Creating separate block files for each topic
4. Updating my CLAUDE.md to be minimal
- Let Claude install Continuous-Claude + BloxCue
- Start with a minimal CLAUDE.md
- Add blocks as you go
Your CLAUDE.md stays small forever because everything goes into blocks.
Real numbers from actual usage:
| Metric | Before BloxCue | After BloxCue | Saved |
|---|---|---|---|
| Tokens per prompt | ~8,500 | ~1,000 | ~7,500 |
| Tokens per session (20 prompts) | ~170,000 | ~20,000 | ~150,000 |
| Reduction | - | - | ~88% |
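The figures in the table follow from simple arithmetic, sketched here so you can plug in your own numbers. The `savings` helper is hypothetical; the inputs are the per-prompt token counts and the 20-prompt session from the table.

```python
def savings(before, after, prompts=20):
    """Compute token savings per prompt, per session, and as a percentage."""
    saved = before - after
    return {
        "saved_per_prompt": saved,
        "saved_per_session": saved * prompts,
        "reduction_pct": round(100 * saved / before),
    }

result = savings(before=8500, after=1000)
print(result)
# {'saved_per_prompt': 7500, 'saved_per_session': 150000, 'reduction_pct': 88}
```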
Saved tokens go toward:
- Deeper reasoning - Claude can think more thoroughly
- Longer sessions - Stay within context limits longer
- Faster responses - Less to process means quicker replies
~/.claude-memory/
├── guides/ # How-to guides
├── references/ # Quick reference docs
├── projects/ # Project-specific info
├── configs/ # Configuration templates
├── notes/ # General notes
└── scripts/
└── indexer.py # Search engine
~/.claude-memory/
├── client-alpha/
│ ├── requirements.md
│ ├── api.md
│ └── contacts.md
├── client-beta/
│ └── ...
└── scripts/
# Index all blocks
python3 ~/.claude-memory/scripts/indexer.py
# Search for something
python3 ~/.claude-memory/scripts/indexer.py --search "keyword"
# Search with verbose output (shows scores)
python3 ~/.claude-memory/scripts/indexer.py --search "keyword" -v
# List all indexed blocks
python3 ~/.claude-memory/scripts/indexer.py --list
# Rebuild index from scratch
python3 ~/.claude-memory/scripts/indexer.py --rebuild
# Output as JSON
python3 ~/.claude-memory/scripts/indexer.py --search "keyword" --json

- Keep CLAUDE.md minimal - Just essentials, let blocks handle details
- One topic per file - Better search precision
- Use frontmatter - Title, category, and tags improve indexing
- Use descriptive tags - `[deployment, production, aws]` not just `[deploy]`
- Re-index after changes - Run the indexer after adding/editing files
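To show why frontmatter helps indexing, here is a minimal sketch of reading the title, category, and tags from a block using only the standard library. It is an assumption about how such parsing could work, not the indexer's actual code, and the `parse_frontmatter` name is hypothetical. It handles only flat `key: value` pairs and inline `[a, b]` lists, which is all the sample block uses.

```python
def parse_frontmatter(text):
    """Split a block into (metadata dict, body text).

    Returns ({}, text) when the block has no frontmatter or the closing
    '---' delimiter is missing.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not key: value pairs
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [deployment, production, devops]
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        meta[key.strip()] = value
    return {}, text  # no closing delimiter: treat as plain markdown

sample = """---
title: Production Deployment
category: guides
tags: [deployment, production, devops]
---
# Production Deployment
"""
meta, body = parse_frontmatter(sample)
print(meta["tags"])
```

Each tag becomes a separate searchable term, which is why specific tags give the search more to match against than a single generic one.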
Do I need Continuous-Claude?
Technically no, but recommended. Continuous-Claude handles session memory, BloxCue handles knowledge retrieval. They complement each other.
Will this work with Cursor/VS Code?
Designed for Claude Code CLI. May work with other Claude integrations that support hooks, but untested.
How is this different from a smaller CLAUDE.md?
Two key differences:
- Scalability - Your knowledge grows without growing token usage
- Relevance - Only blocks matching your query get loaded
A smaller CLAUDE.md means less information. BloxCue means the right information at the right time.
What if Claude needs multiple blocks?
The retrieval hook returns multiple relevant blocks based on keyword matching. A query about "database deployment" may return both the database block and deployment block.
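As a rough illustration of how one query can return several blocks, the sketch below ranks blocks by how many query terms match their tags and keeps the best few. The `top_blocks` function, the scoring rule, and the sample file names are assumptions for illustration; the real hook may score and limit results differently.

```python
def top_blocks(query_terms, blocks, limit=3):
    """Rank blocks by the number of query terms found in their tags.

    blocks maps a block name to its set of tags. Blocks with no matching
    term are dropped; ties are broken alphabetically for stable output.
    """
    query = set(query_terms)
    scored = [(len(query & tags), name) for name, tags in blocks.items()]
    matches = [(score, name) for score, name in scored if score > 0]
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in matches[:limit]]

# Hypothetical index: block path -> tags
blocks = {
    "guides/deployment.md": {"deployment", "production", "devops"},
    "references/database.md": {"database", "postgresql", "backup"},
    "notes/style.md": {"formatting", "lint"},
}
result = top_blocks(["database", "deployment"], blocks)
print(result)
# ['guides/deployment.md', 'references/database.md']
```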
Can I use project-specific docs?
Yes! You can have both:
- Global: `~/.claude-memory/` for cross-project content
- Project: `./claude-memory/` for project-specific docs
The installer supports setting up both.
How do I back up my blocks?
They're just markdown files. Back them up however you prefer:
- Git repo (recommended)
- Cloud sync (Dropbox, iCloud, etc.)
- Any backup solution you use
# macOS
brew install python3
# Ubuntu/Debian
sudo apt install python3

- Run the indexer: `python3 ~/.claude-memory/scripts/indexer.py`
- Check files have `.md` extension
- Verify files are in the correct directory

- Check `~/.claude/settings.json` syntax (valid JSON?)
- Verify the hook path is correct
- Restart Claude Code after changing settings
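The settings checks above can be scripted. The following sketch parses a settings file and lists the commands registered for `UserPromptSubmit`, assuming the hook layout shown in the Enable Auto-Retrieval example. The `find_hook_commands` helper is hypothetical and not part of BloxCue.

```python
import json

def find_hook_commands(settings_text, event="UserPromptSubmit"):
    """Return the command strings registered for the given hook event.

    Raises json.JSONDecodeError (a ValueError) if the text is not valid
    JSON, which is the most common reason a hook silently does nothing.
    """
    settings = json.loads(settings_text)
    commands = []
    for group in settings.get("hooks", {}).get(event, []):
        for hook in group.get("hooks", []):
            if hook.get("type") == "command":
                commands.append(hook.get("command", ""))
    return commands

# Same structure as the settings example in this README
SAMPLE = """
{
  "hooks": {
    "UserPromptSubmit": [{
      "hooks": [{
        "type": "command",
        "command": "~/.claude/hooks/memory-retrieve.sh"
      }]
    }]
  }
}
"""
print(find_hook_commands(SAMPLE))
```

To check your own configuration, pass in the contents of `~/.claude/settings.json`. An empty list means no command hook is registered for that event, and an exception means the file is not valid JSON.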
BloxCue is designed with security in mind:
| Protection | Description |
|---|---|
| Local-only | No network activity, no telemetry, no data collection |
| Path validation | Prevents directory traversal attacks |
| Input sanitization | User prompts are sanitized before processing |
| Type safety | Handles malformed data gracefully without crashes |
| Settings backup | Creates backup before modifying Claude config |
| File locking | Exclusive locks prevent index corruption from concurrent sessions |
See SECURITY.md for the full security audit report.
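For readers curious what path validation can look like, here is a minimal sketch of a directory traversal check on POSIX systems. It is not taken from BloxCue's source; the `is_inside` helper and the example paths are hypothetical.

```python
import os

def is_inside(base_dir, candidate):
    """Return True if candidate resolves to a location inside base_dir.

    Both paths are resolved first, so '..' segments and symlinks cannot
    be used to escape the base directory.
    """
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, candidate))
    return os.path.commonpath([base, target]) == base

print(is_inside("/tmp/memory", "guides/deployment.md"))  # stays inside
print(is_inside("/tmp/memory", "../etc/passwd"))         # escapes via '..'
print(is_inside("/tmp/memory", "/etc/passwd"))           # absolute path outside
```

Comparing resolved paths with `os.path.commonpath` avoids the pitfall of plain string prefix checks, where `/tmp/memory-evil` would wrongly pass as being inside `/tmp/memory`.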
- Porter Stemmer for word normalization
- IDF weighting for term importance
- Bigram/phrase matching
- Query intent detection
- Path traversal protection
- Type safety hardening
- Stemmer memoization (LRU cache)
- Index caching with mtime invalidation
- File locking for concurrent safety
- Semantic search with embeddings
- VS Code extension for block management
- Web UI for managing memory
- Cross-machine sync
Ideas and contributions welcome! See the roadmap above for planned features.
- parcadei - Creator of Continuous-Claude v3
MIT - Use it however you want.
