Fuzzygrep is a powerful, production-ready command-line tool for interactive fuzzy searching, exploring, and inspecting JSON and CSV files. It is built with performance and user experience in mind.
- Blazing Fast: Sub-second search on 10K+ records
- Lazy Loading: Stream large files without loading everything into memory
- Smart Indexing: Trigram-based indexing for 5-10x faster searches
- Parallel Processing: Multi-core support for faster data processing
- Intelligent Caching: TTL-based caching with automatic invalidation
- Interactive Interface: Beautiful, intuitive CLI with rich formatting
- Fuzzy Search: Find what you need with typo-tolerant search
- Regex Search: Pattern matching with regular expressions (v1.1)
- Color Themes: Nord, Dracula, Solarized, and default themes (v1.1)
- Syntax Highlighting: JSON visualization with color-coded output
- Auto-completion: Smart suggestions as you type
- Export Options: Save results as JSON, CSV, Markdown, or HTML
- Deep Search: Search through nested JSON structures
- Dual Mode: Search keys, values, or both simultaneously
- Regex Mode: Toggle between fuzzy and regex search (v1.1)
- Query Bookmarks: Save and load frequent searches (v1.1)
- Key Filtering: Focus on specific data patterns
- Visualizations: Tree charts and frequency histograms
- Multi-format: JSON, CSV, YAML, and XML support (v1.1)
git clone https://github.com/anggiAnand/fuzzygrep.git
cd fuzzygrep
pip install -e .
For enhanced features (streaming large files, CSV chunking):
pip install -e ".[enhanced]"
For development (testing, linting, formatting):
pip install -e ".[dev]"
- Python 3.9 or higher
- 5 core dependencies (automatically installed)
- Optional: ijson, pandas for large file handling
# Interactive search
fuzzygrep data.json
# Show file structure
fuzzygrep data.json --chart
# View frequency analysis
fuzzygrep data.json --histogram
# Verbose output
fuzzygrep data.json --verbose
Once in interactive mode, you have access to powerful commands:
Search Commands:
<query> Search for keys and values
File Operations:
/load <file> Load a different file
/reload Reload current file
Results Management:
/export <format> Export results (json, csv, md, html)
/save Quick save to results.json
Filtering & Configuration:
/filter <patterns> Filter keys by patterns (comma-separated)
/clear Clear active filters
/stats Show performance statistics
Navigation:
/history Show search history
/help Show help message
/exit, /quit Exit the program
| Shortcut | Action |
|---|---|
| Ctrl+T | Toggle autocompletion on/off |
| Ctrl+V | Switch between key/value completion |
| Ctrl+R | Reload data from file |
| Ctrl+S | Save last search results |
| Ctrl+H | Show help |
| Ctrl+C | Exit program |
$ fuzzygrep people.json
[people.json] Search> john
Matches in Keys:
┌─────────┬────────────────┬───────┐
│ Key     │ Value          │ Score │
├─────────┼────────────────┼───────┤
│ name    │ John Doe       │ 95.0  │
│ email   │ john@email.com │ 82.0  │
└─────────┴────────────────┴───────┘
Matches in Values:
┌────────────────┬───────┬───────┐
│ Value          │ Keys  │ Score │
├────────────────┼───────┼───────┤
│ John Doe       │ name  │ 100   │
│ john@email.com │ email │ 88.0  │
└────────────────┴───────┴───────┘
[data.json] Search> alice
# Export as JSON
[data.json] Search> /export json results.json
# Export as CSV
[data.json] Search> /export csv results.csv
# Export as HTML with nice formatting
[data.json] Search> /export html report.html
[data.json] Search> /filter email,phone,address
Filter applied: email, phone, address
# Now searches are limited to these keys
[data.json] Search> john
# Disable caching for always-fresh data
fuzzygrep data.json --no-cache
# Disable indexing for small files
fuzzygrep small.json --no-index
# Control worker threads
fuzzygrep large.json --workers 8
# Combine options
fuzzygrep data.json --no-cache --workers 4 --verbose
# Tree view with depth limit
fuzzygrep data.json --chart --chart-limit 50
# Frequency analysis
fuzzygrep data.json --histogram
# Use regex search mode
fuzzygrep data.json --regex
[data.json] Search> user.*@.*\.com # Regex pattern
# Use different color themes
fuzzygrep data.yaml --theme nord
fuzzygrep data.xml --theme dracula
# Interactive commands (v1.1)
[data.json] Search> /regex on # Enable regex mode
[data.json] Search> /bookmark my_query # Save current search
[data.json] Search> /bookmarks # List all bookmarks
[data.json] Search> /load-bookmark my_query # Load a bookmark
[data.json] Search> /theme solarized # Change theme
Fuzzygrep is built with a clean, modular architecture:
fuzzygrep/
├── core/                # Core functionality
│   ├── loaders.py       # Data loading with streaming support
│   ├── searcher.py      # Fuzzy search with parallel processing
│   ├── indexer.py       # Trigram-based indexing
│   └── cache.py         # Multi-layer caching system
├── ui/                  # User interface
│   ├── display.py       # Results visualization & export
│   └── interactive.py   # Interactive session management
├── utils/               # Utilities
│   ├── errors.py        # Custom exception hierarchy
│   └── logging.py       # Rich logging system
└── cli.py               # CLI entry point
Loaders (core/loaders.py)
- Automatic format detection (JSON/CSV)
- Streaming for large files (>10MB)
- Memory-optimized data structures
- Graceful error handling
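As an illustration of the loader ideas listed above, here is a minimal sketch of extension-based format detection and chunked CSV streaming using only the standard library. The names `detect_format`, `should_stream`, and `iter_csv_chunks` are hypothetical and are not taken from `core/loaders.py`. The 10MB threshold mirrors the figure quoted above, and the chunk size is an arbitrary assumption.

```python
import csv
import io
from itertools import islice
from pathlib import Path
from typing import Dict, Iterator, List, TextIO

# Assumption: mirrors the ">10MB" streaming threshold mentioned in this README.
STREAM_THRESHOLD_BYTES = 10 * 1024 * 1024


def detect_format(path: str) -> str:
    """Guess the format from the file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix in {".json", ".csv"}:
        return suffix[1:]
    raise ValueError(f"unsupported file type: {suffix or path}")


def should_stream(size_bytes: int) -> bool:
    """Decide whether a file is large enough to be read lazily."""
    return size_bytes > STREAM_THRESHOLD_BYTES


def iter_csv_chunks(handle: TextIO, chunk_size: int = 1000) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of at most chunk_size rows without reading the whole file."""
    reader = csv.DictReader(handle)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# In-memory stand-in for a CSV file on disk.
sample = io.StringIO(
    "name,email\n"
    "John Doe,john@email.com\n"
    "Alice,alice@email.com\n"
    "Bob,bob@email.com\n"
)
chunks = list(iter_csv_chunks(sample, chunk_size=2))
print([len(chunk) for chunk in chunks])
```

Because `iter_csv_chunks` is a generator, only one chunk of rows is held in memory at a time, which is the property that matters for large files.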
Searcher (core/searcher.py)
- Fuzzy matching with RapidFuzz
- Trigram-based pre-filtering
- Parallel processing support
- Smart scorer selection
- Multi-layer caching
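To make the idea of fuzzy value search over nested data concrete, here is a minimal sketch. Fuzzygrep itself uses RapidFuzz; this example swaps in `difflib.SequenceMatcher` from the standard library so it runs without extra dependencies, which means the scores will differ from what Fuzzygrep reports. The functions `flatten`, `score`, and `search_values`, and the threshold of 60, are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Any, Iterator, List, Tuple


def flatten(data: Any, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (key_path, value_as_text) for every leaf in nested JSON-like data."""
    if isinstance(data, dict):
        for key, value in data.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(data, list):
        for i, value in enumerate(data):
            yield from flatten(value, f"{prefix}[{i}]")
    else:
        yield prefix, str(data)


def score(query: str, text: str) -> float:
    """Case-insensitive similarity on a 0-100 scale (stand-in for a RapidFuzz scorer)."""
    return 100.0 * SequenceMatcher(None, query.lower(), text.lower()).ratio()


def search_values(data: Any, query: str, threshold: float = 60.0) -> List[Tuple[str, str, float]]:
    """Return (key_path, value, score) for leaves scoring at or above threshold, best first."""
    hits = [(path, value, score(query, value)) for path, value in flatten(data)]
    hits = [hit for hit in hits if hit[2] >= threshold]
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


people = {
    "users": [
        {"name": "John Doe", "email": "john@email.com"},
        {"name": "Alice", "email": "alice@email.com"},
    ]
}
# The query contains a typo ("jon" instead of "john").
for path, value, value_score in search_values(people, "jon doe"):
    print(f"{path:<16} {value:<16} {value_score:5.1f}")
```

Flattening first keeps the scoring step independent of nesting depth, so the same scorer can be applied to keys, values, or both.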
Indexer (core/indexer.py)
- Trigram-based search index
- Fast candidate filtering
- Reduces search space by 50-90%
- Persistent index caching
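The candidate-filtering idea behind a trigram index can be sketched as follows. This is not the code in `core/indexer.py`: the `TrigramIndex` class, the padding scheme, and the `min_shared` threshold are assumptions chosen to keep the example short. Each string is split into overlapping three-character windows, and a query only needs to be scored against strings that share enough of those windows.

```python
from collections import defaultdict
from typing import Dict, List, Set


def trigrams(text: str) -> Set[str]:
    """Return the 3-character windows of the lowercased, space-padded text."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


class TrigramIndex:
    """Inverted index from trigram to the ids of the strings that contain it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[int]] = defaultdict(set)
        self._items: List[str] = []

    def add(self, text: str) -> int:
        item_id = len(self._items)
        self._items.append(text)
        for gram in trigrams(text):
            self._postings[gram].add(item_id)
        return item_id

    def candidates(self, query: str, min_shared: int = 2) -> List[str]:
        """Return indexed strings sharing at least min_shared trigrams with query."""
        counts: Dict[int, int] = defaultdict(int)
        for gram in trigrams(query):
            for item_id in self._postings.get(gram, ()):
                counts[item_id] += 1
        return [self._items[i] for i, n in sorted(counts.items()) if n >= min_shared]


index = TrigramIndex()
for value in ["John Doe", "john@email.com", "Alice Smith", "Bob Stone"]:
    index.add(value)

# Only the surviving candidates would be passed on to the fuzzy scorer.
print(index.candidates("jon"))
```

A lower `min_shared` keeps more candidates and tolerates more typos, while a higher value prunes harder. The right trade-off depends on the data and is not specified here.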
Display (ui/display.py)
- Rich table formatting
- Syntax-highlighted JSON
- Tree visualizations
- Multiple export formats
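As a rough picture of how one result set can be written in several export formats, here is a sketch using only the standard library. The `export` dispatcher and the row shape (`key`, `value`, `score`) are hypothetical and do not describe `ui/display.py`. HTML export is omitted to keep the example short.

```python
import csv
import io
import json
from typing import Any, Dict, List

Row = Dict[str, Any]


def to_markdown(rows: List[Row]) -> str:
    """Render rows as a pipe table, escaping literal pipes in cells."""
    if not rows:
        return ""
    headers = list(rows[0].keys())
    lines = ["| " + " | ".join(headers) + " |", "|" + "---|" * len(headers)]
    for row in rows:
        cells = [str(row.get(h, "")).replace("|", "\\|") for h in headers]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


def to_csv(rows: List[Row]) -> str:
    """Render rows as CSV text; assumes every row has the same keys."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def export(rows: List[Row], fmt: str) -> str:
    """Dispatch to an exporter by format name (hypothetical helper)."""
    exporters = {
        "json": lambda r: json.dumps(r, indent=2),
        "csv": to_csv,
        "md": to_markdown,
    }
    if fmt not in exporters:
        raise ValueError(f"unknown export format: {fmt}")
    return exporters[fmt](rows)


results = [
    {"key": "name", "value": "John Doe", "score": 95.0},
    {"key": "email", "value": "john@email.com", "score": 82.0},
]
print(export(results, "md"))
```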
Tested on a dataset of 10,000 records:
| Operation | Time | Memory |
|---|---|---|
| Load JSON | 1.2s | 45MB |
| Build Index | 0.8s | 15MB |
| Search (indexed) | 45ms | - |
| Search (no index) | 320ms | - |
| Export JSON | 0.5s | - |
- Enable indexing (default): Best for repeated searches
- Use streaming: Automatic for files >10MB
- Enable caching (default): Instant results for repeated queries
- Parallel processing (default): Faster on multi-core systems
- Filter keys: Reduce search space for faster results
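The caching tip above relies on TTL-based caching. A minimal sketch of that idea follows. The `TTLCache` class is hypothetical and not Fuzzygrep's cache implementation. Including the file's modification time in the cache key is one assumed way to get automatic invalidation when the file changes. The 300-second TTL matches the `FUZZYGREP_CACHE_TTL` example value in this README.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    """Small in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry on access
            return None
        return value

    def set(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def clear(self) -> None:
        self._entries.clear()


# A fake clock makes the expiry behaviour visible without waiting.
now = [0.0]
cache = TTLCache(ttl_seconds=300, clock=lambda: now[0])

# Assumed key layout: (path, file modification time, query).
key = ("data.json", 1700000000.0, "john")
cache.set(key, ["John Doe"])
print(cache.get(key))  # cached result is returned while the entry is fresh

now[0] = 301.0
print(cache.get(key))  # None: the entry has expired
```

With this key layout, editing the file changes its modification time, so later lookups miss the old entry and the search runs again.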
Run the test suite:
# Run all tests
pytest
# With coverage report
pytest --cov=fuzzygrep --cov-report=html
# Run specific test file
pytest tests/test_searcher.py
# Verbose output
pytest -v
Current test coverage: 85%+
# Clone repository
git clone https://github.com/anggiAnand/fuzzygrep.git
cd fuzzygrep
# Install in development mode with all dependencies
pip install -e ".[dev,enhanced]"
# Run tests
pytest
# Format code
black fuzzygrep tests
isort fuzzygrep tests
# Lint
flake8 fuzzygrep
mypy fuzzygrep
fuzzygrep/
├── fuzzygrep/          # Main package
├── tests/              # Test suite
├── setup.py            # Package configuration
├── requirements.txt    # Dependencies
├── README.md           # Documentation
└── CHANGELOG.md        # Version history
Import Error: Missing dependencies
pip install -r requirements.txt
Slow performance on large files
# Install optional dependencies
pip install ijson pandas
Cache issues
# Clear cache
fuzzygrep cache-clear
# Check cache stats
fuzzygrep cache-stats
Out of memory errors
# Disable caching and indexing
fuzzygrep large.json --no-cache --no-index
Fuzzygrep can be configured via:
- Command-line options (highest priority)
- Environment variables
- Config file: ~/.config/fuzzygrep/config.toml
export FUZZYGREP_CACHE_DIR="$HOME/.cache/fuzzygrep"
export FUZZYGREP_CACHE_TTL=300
export FUZZYGREP_MAX_WORKERS=4
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Follow PEP 8
- Use Black for formatting
- Add type hints
- Write docstrings
- Include tests
- YAML and XML support
- Regular expression search mode
- Query bookmarks
- Color themes (Nord, Dracula, Solarized)
- Multi-file search
- Advanced filtering (by type, score threshold)
- Excel (.xlsx) support
- Configuration file support
- GUI mode (optional)
- Real-time file watching
- Plugin system
- REST API
This project is licensed under the MIT License - see the LICENSE file for details.
Anggi Ananda
- GitHub: @anggiAnand
- RapidFuzz - Fast fuzzy string matching
- Rich - Beautiful terminal formatting
- Typer - CLI framework
- Prompt Toolkit - Interactive prompts
Made with ❤️ by Anggi Ananda