Kratio is a keyword density analyzer for content creators, SEO specialists, and marketers. It analyzes text files to identify the most frequently used words and noun phrases, which helps guide content optimization and SEO strategy.
- Comprehensive Analysis: Analyzes text files to compute word frequencies and keyword density
- Multiple Analysis Types:
  - Word-based analysis: identifies individual keywords and their frequency
  - Noun chunk analysis: identifies phrases and compound terms
- Visualization: Generates bar chart visualizations of top keywords/noun chunks
- Multiple Output Formats: Supports table, CSV, and JSON output formats
- Batch Processing: Analyze multiple files in a directory at once
- File Format Support: Works with various text-based file formats (.txt, .md, .py, .html, .js)
- Watch Mode: Monitor files or directories and automatically re-analyze on changes
- Offline-First: No internet connection required for core functionality (except for initial spaCy model download)
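As a rough illustration of what word-based keyword density means, the sketch below counts words with the standard library and reports each word's share of the total word count. It is not Kratio's implementation (Kratio builds on spaCy and pandas); the `keyword_density` function and the regex tokenizer are assumptions made for this example.

```python
import re
from collections import Counter


def keyword_density(text: str, top_n: int = 10) -> list[tuple[str, int, float]]:
    """Return (word, count, density) for the top_n most frequent words.

    Density is the word's count divided by the total number of words.
    Hypothetical helper for illustration; not part of Kratio's API.
    """
    # Naive tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    return [(word, count, count / total) for word, count in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = "SEO tools help SEO teams. Good tools save time."
    for word, count, density in keyword_density(sample, top_n=3):
        print(f"{word:<10} {count:>3} {density:.1%}")
```

A real analyzer would usually also drop stop words and normalize word forms, which this sketch deliberately skips.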
To run Kratio, you'll need Python 3.12+ installed on your computer.
# Clone the repository
git clone https://github.com/jspenaq/Kratio.git
cd Kratio
# Install dependencies using uv (recommended)
pip install uv
uv sync
uv pip install -e .
# Download the required spaCy model
uv run python -m spacy download en_core_web_sm
Alternatively, you can install using pip directly:
# Clone the repository
git clone https://github.com/jspenaq/Kratio.git
cd Kratio
# Install the package and dependencies
pip install -e .
# Download the required spaCy model
python -m spacy download en_core_web_sm
Kratio can be used as a command-line tool:
# Basic usage
kratio <file_path> [options]
# Analyze a directory
kratio <directory_path> [options]
positional arguments:
  path                  The path to the text file or directory to analyze.

options:
  -h, --help            show this help message and exit
  --analysis_type {words,noun_chunks}
                        The type of analysis to perform (words or noun_chunks, default: words).
  --top_n TOP_N         The number of top keywords/noun chunks to display (default: 10).
  --output OUTPUT       Output file path to dump the DataFrame (CSV or JSON format).
  --save-plot SAVE_PLOT
                        Path to save the visualization plot (e.g., path.png).
  --no-visualization    Disable visualization output.
  --format {json,csv,table}
                        Output format for the analysis results (json, csv, or table, default: table).
  --silent              Suppress all non-essential output, including logging messages.
  --watch               Monitor the file or directory and re-run analysis on every change.
  --debug               Enable debug logging for troubleshooting.
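The option list above maps naturally onto `argparse`. The sketch below reconstructs a parser with the same flags, choices, and defaults so the interface can be explored from Python. It is an assumption about how such a parser could look, not Kratio's actual CLI code; `build_parser` is a hypothetical name, and parsing `--top_n` as an integer is inferred from its default.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="kratio")
    parser.add_argument("path", help="The path to the text file or directory to analyze.")
    parser.add_argument("--analysis_type", choices=["words", "noun_chunks"], default="words")
    # type=int is an assumption; the help text only shows the TOP_N placeholder.
    parser.add_argument("--top_n", type=int, default=10)
    parser.add_argument("--output", help="Output file path (CSV or JSON format).")
    parser.add_argument("--save-plot", help="Path to save the visualization plot.")
    parser.add_argument("--no-visualization", action="store_true")
    parser.add_argument("--format", choices=["json", "csv", "table"], default="table")
    parser.add_argument("--silent", action="store_true")
    parser.add_argument("--watch", action="store_true")
    parser.add_argument("--debug", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["example.txt", "--analysis_type", "noun_chunks", "--top_n", "20"]
    )
    print(args.analysis_type, args.top_n, args.format)
```

Note that `argparse` converts dashes in option names to underscores, so `--save-plot` is read back as `args.save_plot` and `--no-visualization` as `args.no_visualization`.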
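# Analyze word frequencies
kratio example.txt --analysis_type words
# Analyze noun chunks and show the top 20
kratio example.txt --analysis_type noun_chunks --top_n 20
# Save the results to a CSV file
kratio example.txt --output results.csv
# Save the visualization plot to a file
kratio example.txt --save-plot keyword_density.png
# Analyze the files in a directory
kratio ./content/ --analysis_type words
# Print the results as JSON
kratio example.txt --format json
# Re-run the analysis whenever the file changes
kratio example.txt --watch
# Watch a file with debug logging enabled
kratio example.txt --watch --debug
# Watch a directory
kratio ./content/ --watch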
Kratio supports multiple output formats:
- Table (default): Displays results in a formatted table in the terminal
- CSV: Exports results to a CSV file for spreadsheet analysis
- JSON: Exports results to a JSON file for programmatic use
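The three formats can be pictured as three renderings of the same rows. The sketch below uses only the standard library to keep it self-contained; Kratio itself lists pandas and Tabulate among its dependencies, so its real output will look different. The `render` function and the `keyword`/`count`/`density` column names are assumptions for illustration.

```python
import csv
import io
import json

Row = dict[str, object]


def render(rows: list[Row], fmt: str = "table") -> str:
    """Render analysis rows as 'table', 'csv', or 'json' text."""
    if fmt not in {"table", "csv", "json"}:
        raise ValueError(f"Unknown format: {fmt}")
    if fmt == "json":
        return json.dumps(rows, indent=2)
    if not rows:
        return ""
    columns = list(rows[0])
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    # Table: pad every cell to the widest value in its column.
    widths = {c: max(len(c), *(len(str(r[c])) for r in rows)) for c in columns}
    lines = ["  ".join(c.ljust(widths[c]) for c in columns)]
    lines += ["  ".join(str(r[c]).ljust(widths[c]) for c in columns) for r in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    rows = [
        {"keyword": "seo", "count": 3, "density": 0.5},
        {"keyword": "tools", "count": 2, "density": 0.25},
    ]
    for fmt in ("table", "csv", "json"):
        print(render(rows, fmt))
```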
kratio/
├── docs/                      # Documentation
├── src/                       # Source code
│   └── kratio/
│       ├── cli/               # Command-line interface
│       ├── core/              # Core analysis functionality
│       ├── io/                # Input/output operations
│       ├── utils/             # Utility functions
│       └── visualization/     # Visualization components
└── tests/                     # Test suite
    ├── integration/           # Integration tests
    └── unit/                  # Unit tests
# Install development dependencies
uv pip install -e ".[dev]"
# Run tests with coverage
coverage run -m pytest
coverage report
coverage html # Generates HTML report
Kratio is under active development. Planned features include:
- Competitive Analysis Module: Compare keyword densities across multiple documents
- SEO Optimization Integration: Connect with SEO APIs for actionable insights
- Content Quality Assessment: Add readability scoring and writing quality analysis
- Multi-language Support: Expand analysis capabilities to multiple languages
- Interactive Web Interface: Create a user-friendly web interface
See Feature Enhancement Ideas for more details on upcoming features.
This software uses the following open source packages:
- spaCy - Industrial-strength Natural Language Processing
- pandas - Data analysis and manipulation tool
- Seaborn - Statistical data visualization
- Matplotlib - Comprehensive library for creating visualizations
- Loguru - Python logging made simple
- Tabulate - Pretty-print tabular data
- Watchdog - API and shell utilities to monitor file system events
github.com/jspenaq · LinkedIn @jspenaq