feat: add memory system and repomix analysis for Claude Code #360
This adds a comprehensive memory system to improve Claude Code's context efficiency:

- Memory system with loadable context files (.claude/context/)
- 7 repomix views optimized for different tasks (backend, frontend, security, architecture)
- Repomix usage guide and configuration (.claude/repomix-guide.md, .repomixignore)
- Implementation plans for memory system and CLAUDE.md optimization
- Heatmap visualization showing codebase density

Benefits:

- Reduced token usage by loading only relevant context
- Faster responses with targeted context files
- Better alignment with task-specific needs
- Visual codebase complexity insights

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
6402cd2 to 2e201e4
Claude Code Review
Summary
This PR introduces a comprehensive memory system and repomix analysis framework to dramatically improve Claude Code's context efficiency. The implementation is architecturally sound and provides substantial value through:
Overall Assessment: ✅ Approved with minor recommendations
The memory system is a significant quality-of-life improvement that reduces token usage, improves response accuracy, and provides on-demand context loading. However, the 20MB repository impact from XML files and lack of testing warrant attention.
Issues by Severity
🔵 Minor Issues
1. Repository Size Impact (20MB added)
Location:
Issue: Adding 20MB of XML files to the repository significantly increases clone time and storage requirements:
Recommendation:
Trade-off: Keeping files in repo provides immediate availability vs. storage overhead.
2. Exposed GitHub Secrets References (False Positive)
Location: XML files contain GitHub Actions workflow definitions
Finding:
Status: ✅ Not a security issue - These are template references, not leaked credentials.
Recommendation: None required (this is expected and safe).
3. Missing Test Coverage Documentation
Location: Context files and patterns
Issue: While the context files mention testing in checklists, there's no dedicated guidance on:
Recommendation: Consider adding
4. CLAUDE.md Not Fully Optimized
Location:
Issue: The optimization plan (
Recommendation:
5. Repomix View Documentation Discrepancy
Location:
Issue: The repomix guide lists file sizes that don't match the actual analysis:
Recommendation: Update guide with actual file sizes or use size ranges.
6. Missing Validation/Testing
Issue: No automated tests validate that:
Recommendation: Add CI check in follow-up:
- name: Validate Memory System
  run: |
    # Check file references
    grep -o '\.claude/[^)]*' CLAUDE.md | xargs -I {} test -f {}
    # Validate XML syntax
    find repomix-analysis -name '*.xml' -exec xmllint --noout {} \;
Positive Highlights
🌟 Excellent Architecture
🌟 High-Quality Documentation
🌟 Practical Usage Examples
🌟 Security-Conscious
🌟 Immediate Value
Recommendations
Priority 1: Consider Storage Strategy
Action: Decide on repository size trade-off
Why: 20MB is significant but not catastrophic. Document the decision.
Priority 2: Execute CLAUDE.md Optimization (Optional)
Action: Follow steps in
Why: Plan exists but wasn't fully executed. Either complete it or document deferral.
Priority 3: Add CI Validation (Follow-up)
Action: Add GitHub Actions workflow to validate:
Why: Prevents documentation drift as codebase evolves.
Priority 4: Testing Patterns Context (Follow-up)
Action: Create
Final Verdict
✅ APPROVED - This PR delivers significant value with excellent architecture and documentation quality. The minor issues are non-blocking and can be addressed in follow-up PRs.
Merge Confidence: High
Suggested Merge Strategy: Squash and merge with commit message:
Reviewed By: Claude Code (Sonnet 4.5)
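As an illustration of the CI validation recommended in the review above, here is a minimal Python sketch of the same two checks: that `.claude/` paths mentioned in a document exist, and that the XML files parse. The function names and the path regex are hypothetical, not part of this PR. It also inherits an assumption from the review's `xmllint` suggestion, namely that repomix output parses as strict XML, which may not hold for every output style.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical pattern: any ".claude/..." path mentioned in the document text.
REF_PATTERN = re.compile(r"\.claude/[\w./-]+")


def missing_references(doc_text: str, repo_root: Path) -> list[str]:
    """Return referenced .claude/ paths that do not exist under repo_root."""
    # Strip a trailing sentence period that the regex may have captured.
    refs = sorted({match.rstrip(".") for match in REF_PATTERN.findall(doc_text)})
    return [ref for ref in refs if not (repo_root / ref).exists()]


def malformed_xml(xml_dir: Path) -> list[str]:
    """Return paths of .xml files under xml_dir that fail to parse."""
    bad = []
    for path in sorted(xml_dir.rglob("*.xml")):
        try:
            ET.parse(path)
        except ET.ParseError:
            bad.append(str(path))
    return bad
```

In a CI job, a script could call `missing_references(Path("CLAUDE.md").read_text(), Path("."))` and `malformed_xml(Path("repomix-analysis"))`, then exit non-zero if either list is non-empty.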
| Memory Type | Location | Use When | Example Prompt |
|-------------|----------|----------|----------------|
| **Context Files** | `.claude/context/` | Working in specific area of codebase | "Claude, load backend-development context and help me add an endpoint" |
What does this buy over having components/backend/AGENTS.md and/or components/backend/README.md?
There's no problem with both for now. I want to see about making context a "proper noun" within the ambient system, meaning bubble it up to a top-level concern for users. Where we put/read the actual context should, I think, be your team's choice.
Here are factors I had in mind while thinking it up.
- need to support files not tracked in git
- I like the concept of this being equivalent to notebookLM (throw all kinds of stuff in a pile).
- A well-known location means the LLM has to do less work/searching. This is critical: as often as possible, give it only precisely what it needs.
If / when best practices/standards emerge, even loose ones, we should try to adapt. The symlink agents.claude is a good example. Anthropic could fix that in a second if they wanted.
Yes...do we really need pregenerated repomix files here? Is that a thing people do in general? I don't understand why one would do it versus having an agentic system use tools to load things on demand.
idk why I can't respond to the last comment, but no, we don't. I'll clean that up.
Based on comprehensive analysis (repomix-analysis-report.md), keeping only 03-architecture-only.xml (grade 8.8/10, 187K tokens) as the reference view.

Deleted 6 redundant/suboptimal views:

- 01-full-context.xml (too large, 550K tokens)
- 02-production-optimized.xml (excessive, 1.1M tokens)
- 04-backend-focused.xml (too narrow, grade 6.6)
- 05-frontend-focused.xml (too narrow, grade 6.4)
- 06-ultra-compressed.xml (catastrophically large, 2.6M tokens)
- 07-metadata-rich.xml (redundant with #3)

Benefits:

- Reduced from 19M to ~1M (94% reduction)
- Single high-quality reference for Claude Code
- Simpler usage pattern
- Regenerate-on-demand documented

Updated documentation:

- .claude/repomix-guide.md - Simplified to single view
- repomix-analysis-report.md - Added executive decision
- CLAUDE.md - Updated Memory System section
- docs/implementation-plans/*.md - Updated references

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Claude Code Review
Summary
This PR successfully simplifies the memory system from 7 repomix views to a single, high-quality architecture view (grade 8.8/10, 187K tokens). The comprehensive analysis justifies the decision with quantitative metrics, and the resulting 94% repo size reduction is a significant improvement. The integration with CLAUDE.md is clean and maintains consistency with the existing memory system from PR #359.
Overall Assessment: ✅ Approved with Minor Recommendations
The single-view approach is well-reasoned, thoroughly documented, and provides clear value. The implementation is solid with only minor documentation and process improvements suggested.
Issues by Severity
🔵 Minor Issues
1. Implementation Plan References Outdated 7-View Approach
Files:
Issue: Both implementation plans still extensively reference the 7-view approach that was analyzed and rejected. While there are notes at the top indicating simplification to single view, the bulk of the content still describes the multi-view strategy.
Recommendation:
Impact: Low - Documentation is marked as historical, but could confuse future contributors
2. .repomixignore Pattern Redundancy
File:
Issue: The patterns for Python virtual environments are highly redundant (10 lines for venv variations)
Recommendation: Consolidate using wildcard suffixes to reduce duplication while maintaining identical functionality.
Impact: Minimal - Current patterns work correctly, this is just a maintenance improvement
3. Missing Regeneration Schedule Automation
File:
Issue: The guide recommends regenerating the architecture view monthly but provides no automation or GitHub Actions workflow to enforce this.
Recommendation: Add a GitHub Actions workflow that runs monthly via cron schedule, regenerates the XML, and creates a PR if changes detected.
Impact: Low - Manual regeneration works, but automation improves maintenance
4. Large Binary File in Repository (PNG Heatmap)
File:
Issue: Binary image file (PNG) committed directly to the repository. While the visual comparison is valuable, binary files in git increase repo size over time and don't diff well.
Recommendations: Convert to SVG using matplotlib/seaborn for text-based, diff-friendly format, or move to Git LFS, or link to external hosting.
Impact: Minimal - 100-200KB image doesn't significantly impact repo, but sets precedent for binary file management
5. No Token Count Validation in CI
Issue: The PR claims 187K tokens for the architecture view, but there's no CI validation that regeneration stays within acceptable token limits (100K-200K as stated in the analysis).
Recommendation: Add a CI check that validates token count and fails if exceeds threshold (e.g., 250K tokens).
Rationale: Prevents accidental token bloat as codebase grows, maintaining the "optimal for context windows" benefit.
Impact: Low - Manual verification works, but automation prevents regressions
Positive Highlights
✅ Excellent Quantitative Analysis - Comprehensive evaluation of 7 configurations with detailed scoring across 6 criteria
✅ Clean Integration with Existing Memory System - Seamless continuation of PR #359's work
✅ Significant Repo Size Reduction - 94% reduction (19M → 1M) improves clone times and developer experience
✅ Well-Structured Documentation - Clear, concise, and actionable with practical usage examples
✅ Proper File Organization - Exclusion rationale documented inline, implementation plans preserved for historical context
✅ XML File Quality - 132 files correctly included (verified), 21,767 lines with proper structure, CLAUDE.md correctly embedded
✅ Coldstartable Documentation - New contributors can understand the approach without prior context
✅ Single View Simplicity - Eliminates decision fatigue, reduces cognitive load, 95% use case coverage with one artifact
Recommendations
Priority 1: Documentation Consistency
Action: Update or archive the implementation plans to reflect the implemented single-view approach rather than the analyzed-but-rejected 7-view strategy.
Why: Reduces confusion for future contributors and maintains documentation accuracy.
Priority 2: Automation for Maintenance
Action: Add GitHub Actions workflows for monthly repomix regeneration and token count validation in CI.
Why: Ensures the architecture view stays current and maintains token efficiency as the codebase evolves.
Priority 3: Binary File Management
Action: Convert heatmap PNG to SVG or move to Git LFS for better version control hygiene.
Why: Sets good precedent for handling binary artifacts and improves diff visibility.
Conclusion
This is a high-quality PR that demonstrates thoughtful analysis and pragmatic decision-making. The single-view approach is well-justified, and the implementation is clean. The minor issues identified are primarily about long-term maintenance and process improvements rather than functional problems.
Recommendation: Merge after addressing Priority 1 (documentation consistency) or create follow-up issues for the maintenance automation suggestions.
Review completed by Claude Code (Sonnet 4.5) on 2025-12-02
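The token count validation suggested in the review could look roughly like the Python sketch below. The 250K threshold is the example figure from the review. The four-characters-per-token ratio is an assumed rough heuristic, not a real tokenizer, so the result will not match the 187K figure reported by repomix exactly. The function names are hypothetical.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4   # assumed rough heuristic, not a real tokenizer
MAX_TOKENS = 250_000  # example threshold taken from the review


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Estimate the token count as ceil(len(text) / chars_per_token)."""
    return -(-len(text) // chars_per_token)


def check_token_budget(path: Path, max_tokens: int = MAX_TOKENS) -> tuple[bool, int]:
    """Return (within_budget, estimated_tokens) for the file at path."""
    text = path.read_text(encoding="utf-8", errors="replace")
    tokens = estimate_tokens(text)
    return tokens <= max_tokens, tokens
```

A CI step could run `check_token_budget(Path("repomix-analysis/03-architecture-only.xml"))` and fail the job when the first element is `False`. Swapping in a real tokenizer would give a tighter estimate at the cost of an extra dependency.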
I'm still skeptical of this repomix thing. Yes, it's better to not have the entire codebase duplicated, but even just having the architecture duplicated seems highly likely to lead to potentially dangerous skew, where the repomix file says one thing and the thing most edits will operate on says another... Also, tangentially: the fact that the default merge operation on this repository is "squash" (instead of "create merge commit" or "rebase") seems to be the only reason we didn't end up with one git commit adding 20M of binary data and then a second deleting it.
## Summary

Implements Phase 2 of the Amber workflow transparency initiative: automatic PR reviews using the memory system from PRs #359 and #360.

## Changes

### New Workflow: `.github/workflows/amber-auto-review.yml`

**Automatic PR reviews with repository intelligence:**

- **Trigger**: Automatic on PR open/synchronize
- **Memory System**: Loads 7 files for comprehensive repository context
  - CLAUDE.md (master project instructions)
  - backend-development.md (Go backend, K8s patterns)
  - frontend-development.md (NextJS, Shadcn UI, React Query)
  - security-standards.md (Auth, RBAC, token handling)
  - k8s-client-usage.md (User token vs service account)
  - error-handling.md (Consistent error patterns)
  - react-query-usage.md (Data fetching patterns)
- **Transparency**: Every review includes:
  - Link to workflow run (90-day log retention)
  - Collapsible section showing which memory files were loaded
  - Clear explanation of how Amber applies repository standards
- **Comment Management**: Minimizes old review comments before posting new one

### Modified Workflow: `.github/workflows/claude-code-review.yml`

**Changed from automatic to manual-only:**

- **Before**: Triggered automatically on all PRs
- **After**: Only triggers on `@claude` mentions in comments
- **Reason**: Separate automatic reviews (Amber) from manual reviews (Claude)

## Three-Tier AI Integration

This completes the separation of AI review types:

1. **`@claude`** (Lightweight) - Manual, quick interactions
   - No repository wrapping, fastest response
   - Use for: questions, exploration, explanations
2. **Automatic PR Reviews** (Amber-Wrapped) - NEW with this PR
   - Applies CLAUDE.md + memory system standards
   - Shows decision process via workflow logs
   - Trigger: Every PR open/sync (automatic)
3. **`@amber` mentions** (Repository-Aware) - Future Phase 3
   - Planned: Manual `@amber` mentions for specific tasks
   - Will apply same memory system as automatic reviews

## Testing

Tested on fork with PR: jeremyeder#33

After merge, every new PR will trigger `amber-auto-review.yml`. To verify immediately:

1. Merge this PR
2. The workflow should trigger on this PR itself (meta!)
3. Verify review comment appears with transparency section
4. Check workflow logs via the link in the review

## Design Decisions

### Why Direct File Loading (No Shell Script)

**Rejected approach**: Shell script prompt builder using sed/grep/perl to extract CLAUDE.md sections

**Chosen approach**: Direct file loading via Claude's Read tool

- Simpler implementation
- More maintainable (change memory files, workflow stays the same)
- Leverages built-in Claude Code capabilities
- No fragile shell script parsing

### Why Memory System Over CLAUDE.md Extraction

The memory system (PRs #359, #360) provides:

- Curated, focused context for specific domains
- Easier to maintain and update
- Better separation of concerns
- Loadable on-demand for targeted work

## Impact

**Users will see**: Every PR automatically gets a repository-aware code review showing:

- Issues categorized by severity (Blocker/Critical/Major/Minor)
- Positive highlights
- Prioritized recommendations
- Full transparency into AI decision-making

**Developers will benefit from**:

- Consistent application of project standards
- Early detection of anti-patterns
- Confidence in AI reviews (visible decision process)
- Reduced review burden on human reviewers

## Related

- **Builds on**: PRs #359, #360 (Memory System)
- **Part of**: Amber Workflow Transparency Initiative
- **Documentation**: Will be updated in follow-up PR

---

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Summary
This PR introduces a simplified memory system with a single, high-quality repomix architecture view for Claude Code context loading.
Executive Decision: Single View Approach
After comprehensive analysis of 7 different repomix configurations (see repomix-analysis/repomix-analysis-report.md), we adopted a single-view approach using only 03-architecture-only.xml.
Why one view?
See the analysis heatmap for visual comparison.
What's Included
Memory System Files:
- .repomixignore - Configuration for repomix generation
- docs/implementation-plans/ - Design documentation (2 files)
- repomix-analysis/03-architecture-only.xml - The single reference view
- repomix-analysis/repomix-analysis-report.md - Comprehensive analysis
- repomix-analysis/repomix-heatmap.png - Visual quality comparison
Documentation:
- .claude/repomix-guide.md - Usage guide (updated for single view)
- CLAUDE.md - Memory System section (updated)
Files Deleted
Based on analysis, removed 6 redundant/suboptimal views:
- 01-full-context.xml (2.1M, 550K tokens) - Too large, poor token efficiency
- 02-production-optimized.xml (4.2M, 1.1M tokens) - Excessive, unusable
- 04-backend-focused.xml (403K, grade 6.6) - Too narrow, missing cross-component context
- 05-frontend-focused.xml (767K, grade 6.4) - Too narrow, missing cross-component context
- 06-ultra-compressed.xml (10M, 2.6M tokens) - Catastrophically large
- 07-metadata-rich.xml (849K, grade 8.3) - Redundant with #3
Benefits
✅ Reduced repo size - 19M → 1M (94% smaller)
✅ Single high-quality reference - Grade 8.8/10, best of all configurations
✅ Optimal token efficiency - 187K tokens fits comfortably in context windows
✅ Simpler usage pattern - No need to choose between 7 views
✅ Comprehensive coverage - All components, READMEs, types, manifests
✅ Visual analysis included - Heatmap shows why #3 is best
Usage Example
Relationship to PR #359
Together they form the complete memory system for efficient Claude Code context loading.
Test Plan
🤖 Generated with Claude Code