An educational/research platform demonstrating intelligent cybersecurity incident triage through NLP. Features synthetic dataset generation with LLM enhancement, uncertainty-aware classification, interactive Streamlit UI, and production-grade monitoring tools for SOC automation research.
Try AlertSage in your browser:
👉 https://alertsage.streamlit.app/
The public demo uses hosted Hugging Face inference with usage limits. Local GGUF / llama.cpp support is available for developers.
⚠️ Not production incident response tooling
This repo is for learning, experimentation, and portfolio use. It is not intended to run unsupervised in a live SOC.
- Docs site: published with MkDocs Material (see docs/index.md locally or https://texasbe2trill.github.io/AlertSage/ after deployment)
- Getting started: docs/getting-started.md
- CLI usage: docs/cli.md
- Data & generator: docs/data-and-generator.md
- Development workflow: docs/development.md
- 🤖 LLM-Enhanced Dataset Generation with local llama.cpp models, intelligent rewriting, sanitization, and production monitoring scripts
- ⚡ GPU Acceleration: Automatic Metal/CUDA/Vulkan detection for ~10x faster LLM inference (6-7s vs 60s per incident on CPU)
- 🎯 Interactive Streamlit UI with real-time classification, bulk analysis, LLM second-opinion integration, and visual analytics
- 🚀 Smart LLM Mode: Two-pass optimization strategy for 60-80% faster batch processing (baseline analysis → selective LLM on uncertain cases)
- 📊 Synthetic SOC Dataset (100k incidents) with multi-perspective narratives, MITRE ATT&CK enrichment, and realistic label noise
- 🧠 Uncertainty-Aware Classification using TF-IDF + Logistic Regression with configurable thresholds and intelligent fallback handling
- 🔍 LLM Second-Opinion Engine for uncertain cases with JSON guardrails, SOC keyword validation, and hallucination prevention
- 📈 Production Monitoring Tools for dataset generation with progress tracking, ETA calculation, and resource efficiency metrics
- 🛠️ Rich CLI Experience with ASCII banners, probability tables, bulk processing, and JSON output modes
- 📚 Comprehensive Notebooks (01-10) covering exploration, training, interpretability, hybrid models, and operational decision support
- 🔬 Shared Preprocessing Pipeline ensuring training/inference alignment across CLI, UI, and notebooks
- 📦 Professional Packaging with pytest suite, MkDocs documentation, and CI/CD workflows
| Category | Feature | Description |
|---|---|---|
| 🎨 User Interface | Streamlit Web Application | Interactive triage with visual analytics and bulk processing |
| 🚀 Performance | Smart LLM Mode | Two-pass optimization: 60-80% faster than full LLM analysis |
| ⚡ GPU Acceleration | Automatic GPU Detection | Auto-enables Metal/CUDA/Vulkan for ~10x LLM speedup |
| 🤖 LLM Integration | Local llama.cpp Models | Privacy-first LLM for generation and second opinions |
| 📊 Dataset | Synthetic SOC Corpus (100k+) | Realistic incidents with noise, typos, conflicting signals |
| 🔄 Rewrite Engine | LLM-Enhanced Generation | Intelligent rewriting with sanitization & caching |
| 🛠️ CLI Tools | nlp-triage Command | Rich-formatted CLI with uncertainty logic and JSON modes |
| 📈 Monitoring | Production Generation Scripts | Real-time progress, ETA, resource tracking, auto-refresh |
| 🧠 Modeling | TF-IDF + Logistic Regression | Reproducible baseline with uncertainty-aware predictions |
| 🔍 Second Opinion | LLM Triage Assistant | Guardrails against hallucinations, SOC keyword intelligence |
| 📚 Analysis | 10 Jupyter Notebooks | End-to-end workflow from exploration to decision support |
| 🎯 MITRE Mapping | ATT&CK Framework Integration | Technique enrichment and adversary behavior modeling |
| 📖 Documentation | MkDocs Material Site | Comprehensive docs with API reference and tutorials |
| ✅ Quality | CI/CD + Testing | Automated tests, GitHub Actions, professional packaging |
┌─────────────────────────────────────────────────────────────────────┐
│ SYNTHETIC DATA GENERATION │
│ ┌────────────────────┐ ┌──────────────────┐ │
│ │ Generator Scripts │───▶│ LLM Rewriter │ │
│ │ • MITRE enrichment │ │ • llama.cpp │ │
│ │ • Label noise │ │ • Sanitization │ │
│ │ • Realistic typos │ │ • Caching │ │
│ └────────────────────┘ └──────────────────┘ │
│ │ │ │
│ └─────────┬───────────────┘ │
│ ▼ │
│ ┌──────────────────────┐ │
│ │ cyber_incidents.csv │ │
│ │ (100k+ incidents) │ │
│ └──────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ PREPROCESSING PIPELINE │
│ ┌─────────────────────────────────────────┐ │
│ │ clean_description (shared module) │ │
│ │ • Unicode normalization │ │
│ │ • Lowercase + punctuation cleanup │ │
│ │ • TF-IDF vectorization │ │
│ └─────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ BASELINE CLASSIFIER │
│ ┌─────────────────────────────┐ │
│ │ Logistic Regression Model │ │
│ │ • TF-IDF features │ │
│ │ • Class-balanced training │ │
│ │ • Uncertainty threshold │ │
│ └─────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
│
┌─────────────┴─────────────┐
▼ ▼
┌──────────────────────────┐ ┌────────────────────────────┐
│ HIGH CONFIDENCE PATH │ │ UNCERTAIN PATH │
│ • Direct classification │ │ • Low confidence score │
│ • MITRE mapping │ │ • Ambiguous signals │
│ • SOC rationale │ └────────────┬───────────────┘
└──────────────────────────┘ │
▲ ▼
│ ┌────────────────────────────┐
│ │ LLM SECOND OPINION │
│ │ • JSON parsing guardrails │
│ │ • SOC keyword validation │
│ │ • Label normalization │
│ │ • Rationale generation │
│ └────────────┬───────────────┘
│ │
└───────────┬───────────────┘
▼
┌──────────────────────────────────────────┐
│ INTERFACES & OUTPUT │
├──────────────────┬───────────────────────┤
│ Rich CLI │ Streamlit UI │
│ • Interactive │ • Visual analytics │
│ • Bulk mode │ • Bulk upload │
│ • JSON output │ • LLM integration │
└──────────────────┴───────────────────────┘
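Conceptually, the preprocessing stage in the diagram is a small deterministic normalization step followed by TF-IDF vectorization. The sketch below illustrates that kind of cleaning; it is not the project's actual `clean_description` implementation, and the exact rules may differ.

```python
import re
import unicodedata

def clean_text(text: str) -> str:
    """Illustrative stand-in for the shared cleaning step, not the project's module."""
    text = unicodedata.normalize("NFKC", text)   # Unicode normalization
    text = text.lower()                          # lowercase
    text = re.sub(r"[^\w\s]", " ", text)         # punctuation cleanup
    return re.sub(r"\s+", " ", text).strip()     # collapse whitespace

print(clean_text("User reported a suspicious pop-up on FINANCE-WS-07!"))
```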
Get AlertSage running in 5 minutes.
- Python 3.11+ (`python --version`)
- Git (`git --version`)
# Clone
git clone https://github.com/texasbe2trill/AlertSage.git
cd AlertSage
# Create venv
# macOS/Linux
python3.11 -m venv .venv && source .venv/bin/activate
# Windows
# python -m venv .venv && .venv\Scripts\activate
# Install dev extras (editable)
pip install -e ".[dev]"# Run tests (first run auto-downloads model artifacts ~10 MB)
pytest
# Optional: coverage (readable report)
pytest --cov=src/triage --cov-report=term-missing

streamlit run ui_premium.py
# If 8501 is busy:
streamlit run ui_premium.py --server.port 8502

# Basic classification
nlp-triage "User reported suspicious email with attachment"
# JSON output for scripting
nlp-triage --json "Multiple failed login attempts detected"
# Bulk processing with LLM assistance
nlp-triage --llm-second-opinion \
--input-file data/incidents.txt \
--output-file results.jsonl
# High-difficulty mode for ambiguous cases
nlp-triage --difficulty soc-hard "Website experiencing slowdowns"
Prefer not to download a GGUF model? Use Hugging Face Inference instead:
For deployment, the Streamlit demo uses hosted Hugging Face inference; local GGUF/llama.cpp support is available via `pip install -e ".[dev]"`.
LLM Assist Demo: the public Streamlit demo will auto-use Hugging Face hosted inference when HF_TOKEN (and optional HF_MODEL) are configured. No local llama.cpp install is required for demo users.
The demo now uses the Hugging Face Router endpoint with chat completions:
https://router.huggingface.co/v1/chat/completions. Models may require provider
suffixes (for example meta-llama/Llama-3.1-8B-Instruct:cerebras).
- Export `HF_TOKEN` (or `TRIAGE_HF_TOKEN`) with your personal token.
- Optional: set `TRIAGE_HF_MODEL` / `HF_MODEL` (default `meta-llama/Llama-3.1-8B-Instruct:cerebras`).
- The CLI automatically routes LLM second-opinion calls to Hugging Face when a token is present; no local model required.
- In the Streamlit UI, enable LLM Enhancement, choose Hugging Face Inference, and supply your token in the sidebar. The UI enforces a 5-requests/60s session limit, an 8k-character input cap, and a 512 max-token generation limit per request.
Example Streamlit secrets:
HF_TOKEN="your_token"
HF_MODEL="meta-llama/Llama-3.1-8B-Instruct:cerebras"
Tokens are never hardcoded; keep them in environment variables or a local secrets manager.
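For context, the Router endpoint speaks the standard OpenAI-style chat-completions protocol, so a direct call looks roughly like the sketch below (the prompt and parameters are illustrative; the CLI and UI build their own requests):

```python
import os
import requests

# Assumes HF_TOKEN is exported as described above.
token = os.environ["HF_TOKEN"]
model = os.environ.get("HF_MODEL", "meta-llama/Llama-3.1-8B-Instruct:cerebras")

response = requests.post(
    "https://router.huggingface.co/v1/chat/completions",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "model": model,
        "messages": [
            {"role": "user", "content": "Classify this incident: multiple failed logins followed by a success."}
        ],
        "max_tokens": 512,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```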
LLM-assisted features use local llama.cpp models in GGUF format. Download a model and point the app to it:
# 1) Create a models directory
mkdir -p models
# 2) Install llama-cpp-python with GPU acceleration (macOS Metal)
CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python
# Alternative: CPU-only
# pip install llama-cpp-python
# 3) Download a GGUF model via Hugging Face (choose ONE)
# Option A: Llama 3.1 8B Instruct (quality, larger)
huggingface-cli download TheBloke/Llama-3.1-8B-Instruct-GGUF \
Llama-3.1-8B-Instruct-Q6_K.gguf --local-dir models
# Option B: Mistral 7B Instruct v0.2 (solid, mid-size)
huggingface-cli download TheBloke/Mistral-7B-Instruct-v0.2-GGUF \
mistral-7b-instruct-v0.2.Q6_K.gguf --local-dir models
# Option C: TinyLlama 1.1B Chat (very small, CPU-friendly)
huggingface-cli download TinyLlama/TinyLlama-1.1B-Chat-v1.0-GGUF \
TinyLlama-1.1B-Chat-v1.0.Q6_K.gguf --local-dir models
# 4) Point AlertSage at your model path
export TRIAGE_LLM_MODEL="$(pwd)/models/Llama-3.1-8B-Instruct-Q6_K.gguf" # or whichever you downloaded
# 5) Test CLI LLM second opinion
nlp-triage --llm-second-opinion "Server began encrypting shared folders."
# Optional: Streamlit UI with LLM enabled (toggle in sidebar)
streamlit run ui_premium.py
# Debug/disable GPU
export TRIAGE_LLM_DEBUG=1
export LLAMA_N_GPU_LAYERS=0  # force CPU if needed

Notes:
- The default fallback path is models/Meta-Llama-3.1-8B-Instruct-Q6_K.gguf; using `TRIAGE_LLM_MODEL` is preferred.
- Some models (e.g., Llama 3.x) require license acceptance on Hugging Face; authenticate before download.
- GPU backends are auto-enabled (Metal/CUDA) when available; see docs for tuning.
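To sanity-check a downloaded model outside the CLI/UI, a minimal llama-cpp-python smoke test looks like this (parameters are illustrative; AlertSage's own loader handles GPU layer selection automatically):

```python
import os
from llama_cpp import Llama

# Assumes TRIAGE_LLM_MODEL points at the GGUF file downloaded above.
llm = Llama(
    model_path=os.environ["TRIAGE_LLM_MODEL"],
    n_ctx=4096,        # context window; adjust to taste
    n_gpu_layers=-1,   # offload all layers when Metal/CUDA is available; 0 forces CPU
    verbose=False,
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "One sentence: what is ransomware?"}],
    max_tokens=128,
)
print(result["choices"][0]["message"]["content"])
```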
### 5) Docs
```bash
mkdocs serve
```
- Canonical CSV is distributed as a compressed release asset: GitHub Releases.
- Tests and notebooks auto-download artifacts on first run.
- To generate locally:
# Quick generation (1000 incidents)
python generator/generate_cyber_incidents.py --n-events 1000

For larger runs and monitoring, see docs/data-and-generator.md.
- 🏠 Home: Docs Site
- 🚀 Getting Started: docs/getting-started.md
- 💻 CLI Usage: docs/cli.md
- 🎨 Streamlit UI Guide: docs/ui-guide.md
- 📊 Data & Generator: docs/data-and-generator.md
- 🏭 Production Generation: docs/production-generation.md
- 🔬 Development Workflow: docs/development.md
- 📚 Notebooks: docs/notebooks.md
- 🎯 Modeling & Eval: docs/modeling-and-eval.md
The project includes professional-grade bash orchestration scripts for large-scale dataset generation with LLM enhancement and real-time monitoring:
# Generate 100k incidents with default settings (LLM enhancement enabled)
./generator/launch_generator.sh
# Custom size and name
./generator/launch_generator.sh 50000 training_data
# Fresh start (delete existing files)
./generator/launch_generator.sh 100000 my_dataset --fresh

Features:
- 🤖 LLM Enhancement: Optional Llama-2-13B-Chat for realistic narrative rewrites (1% of events by default)
- 💾 Checkpointing: Automatic progress saving every 100 events - resume interrupted generations seamlessly
- 🔄 Background Processing: Uses `nohup` for SSH-safe, unattended operation (6-9 hours for 100K events)
- ⚙️ Environment Configuration: Automatically sets LLM model paths, rewrite probability, and temperature
# Single snapshot
./generator/monitor_generation.sh
# Auto-refresh every 30 seconds (recommended)
./generator/monitor_generation.sh --watch
# Custom refresh interval (10 seconds)
./generator/monitor_generation.sh --watch 10
# Simple mode for SSH/tmux
./generator/monitor_generation.sh --simple-color --watch

Dashboard features:
- 📊 Real-time progress tracking with visual progress bar and ETA
- 💻 CPU/memory usage with per-core utilization
- 🎮 GPU Acceleration Metrics (Apple Silicon): Metal detection, LLM throughput (~18.5 tokens/sec)
- 📈 Throughput trend analysis (accelerating/declining/steady)
- ⚡ Resource efficiency (events/CPU%, events/GB RAM)
- 🔄 Chunk timing and interval analysis
- 📝 Last 5 log entries tail
- 🎯 Quick action commands (kill process, check progress, view logs)
Performance:
- GPU-accelerated (auto-enabled): ~10x faster LLM inference (6-7s per rewrite vs 60s on CPU)
- With LLM (1% rewrite, GPU): ~3-5 events/sec (100K in 6-9 hours)
- Without LLM: ~50-100 events/sec (100K in 20-30 minutes)
GPU Acceleration:
All LLM features automatically detect and enable GPU acceleration (Metal for Apple Silicon, CUDA for NVIDIA, Vulkan fallback):
- ✅ No configuration required - auto-enabled in CLI, UI, and generator
- ✅ ~10x speedup: 6-7 seconds per incident vs 60+ seconds on CPU
- ✅ Debug mode shows GPU configuration: `export TRIAGE_LLM_DEBUG=1`
- ✅ Disable if needed: `export LLAMA_N_GPU_LAYERS=0`
See docs/production-generation.md for comprehensive guide including troubleshooting, performance optimization, and advanced workflows.
The interactive web interface (ui_premium.py) provides a comprehensive triage experience:
- Real-time classification with confidence visualization
- LLM second-opinion toggle for uncertain cases
- MITRE ATT&CK technique mapping
- Kill chain phase analysis
- Interactive probability distributions
- SOC triage recommendations
- CSV file upload and processing
- Smart LLM Mode (⚡ 60-80% faster): Two-pass strategy that runs baseline analysis on all incidents, then applies expensive LLM inference only to uncertain cases (confidence < threshold); see the sketch after these feature lists
- Session state caching: Results persist across downloads and UI interactions
- Batch LLM enhancement for uncertain predictions
- Comprehensive threat intelligence briefs
- Advanced analytics dashboard
- Export results to CSV/JSON/Markdown
- Interactive filtering and exploration
- Label distribution charts
- Confidence heatmaps
- MITRE technique coverage
- Time-based analysis
- Risk scoring
- Uncertainty analysis
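The Smart LLM Mode above boils down to a simple two-pass loop; a minimal sketch, assuming hypothetical `baseline_predict` and `llm_second_opinion` callables rather than the UI's actual internals:

```python
# Pass 1 scores everything with the cheap baseline; pass 2 sends only
# low-confidence incidents to the expensive LLM.
def smart_triage(incidents, baseline_predict, llm_second_opinion, threshold=0.5):
    results = []
    for text in incidents:
        label, confidence = baseline_predict(text)    # fast TF-IDF + LogReg pass
        if confidence < threshold:
            label = llm_second_opinion(text)          # slow LLM pass, uncertain cases only
        results.append({"text": text, "label": label, "confidence": confidence})
    return results
```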
nlp-triage is exposed via pyproject.toml entry points. Key capabilities:
- Shared text cleaning and TF-IDF vectorization.
- Uncertainty-aware predictions with configurable threshold (`--threshold`, default 0.50) and `uncertain` fallback.
- Rich formatting: ASCII banner, cleaned-text panel, probability tables, progress bar.
- JSON output (`--json`) for scripts and tests.
- Interactive loop when called without a positional argument.
- SOC Difficulty Mode: creates intentionally ambiguous test cases (`nlp-triage --difficulty soc-hard`) for scenario-based evaluation.
- Optional LLM second-opinion mode (`--llm-second-opinion`) with JSON parsing hardening, SOC keyword intelligence, and deterministic rationale generation.
Help menu:
nlp-triage -h

usage: nlp-triage [-h] [--json] [--threshold THRESHOLD] [--max-classes MAX_CLASSES] [--difficulty {default,soc-medium,soc-hard}]
[--input-file INPUT_FILE] [--output-file OUTPUT_FILE] [--llm-second-opinion]
[text]
Cybersecurity Incident NLP Triage CLI
positional arguments:
text Incident description
options:
-h, --help show this help message and exit
--json Return raw JSON output instead of formatted text
--threshold THRESHOLD
Uncertainty threshold (default=0.5)
--max-classes MAX_CLASSES
Maximum number of classes to display in the probability table
--difficulty {default,soc-medium,soc-hard}
Difficulty / strictness mode for uncertainty handling. Use 'soc-hard' to mark more cases as 'uncertain'.
--input-file INPUT_FILE
Optional path to a text file for bulk mode; each non-empty line is treated as an incident description.
--output-file OUTPUT_FILE
Optional path to write JSONL predictions for bulk mode. Each line will contain one JSON object.
--llm-second-opinion If set, call a local LLM (e.g., Llama-2-7B-GGUF via llama-cpp-python) to provide a second opinion when the baseline model is
uncertain.
nlp-triage "User reported a suspicious popup on FINANCE-WS-07."Outputs include:
- Cleaned text
- Probability table
- Final label (or `uncertain`)
- MITRE ATT&CK® mapping
- Deterministic SOC rationale
Machine-readable output:
nlp-triage --json "Large data transfer from laptop to Dropbox."Process multiple incidents from a file:
Process multiple incidents from a file:

nlp-triage --input-file data/incidents.txt

Write JSONL output:
nlp-triage --input-file data/incidents.txt --output-file data/results.jsonl

# More uncertain
nlp-triage --threshold 0.70 "Website slowdowns after hours."
# Less uncertain
nlp-triage --threshold 0.30 "Single phishing email blocked."

Limit displayed classes:
nlp-triage --max-classes 3 "Multiple failed logins."

nlp-triage --difficulty default "Employee copied files to USB."
nlp-triage --difficulty soc-medium "Odd login patterns during travel."
nlp-triage --difficulty soc-hard "Website slow after patching."Enable LLM review on uncertain predictions:
nlp-triage --llm-second-opinion "Server began encrypting shared folders."Bulk mode:
nlp-triage --llm-second-opinion \
--input-file data/incidents.txt \
--output-file data/triage_llm.jsonl

Debug logs (includes GPU configuration):
export TRIAGE_LLM_DEBUG=1
nlp-triage --llm-second-opinion --input-file data/incidents.txt

GPU acceleration is auto-enabled for ~10x speedup. Disable if needed:
export LLAMA_N_GPU_LAYERS=0 # Force CPU-only mode
nlp-triage --llm-second-opinion "Server encrypting files."One-off with pretty output:
nlp-triage "Multiple VPN credential failures followed by a successful login."Bulk triage with JSONL output:
nlp-triage \
--difficulty default \
--threshold 0.50 \
--input-file data/incidents.txt \
--output-file data/results.jsonl

SOC-hard + LLM second opinion:
nlp-triage \
--difficulty soc-hard \
--llm-second-opinion \
--input-file data/incidents.txt \
--output-file data/results_soc_hard_llm.jsonl

Script-friendly JSON mode:
nlp-triage --json --threshold 0.6 \
"Unusual data transfer to unfamiliar cloud provider from FINANCE-WS-07."- Auto-download: Dataset automatically downloads when running tests or notebooks if not present
- Location: `data/cyber_incidents_simulated.csv` (~107 MB, excluded from git)
- Size: 100k incidents by default, configurable up to millions
- Rich Schema: Multiple narrative perspectives (`description`, `description_short`, `description_user_report`, `short_log`)
- MITRE ATT&CK Integration: Realistic adversary behavior descriptions
- Label Noise: Configurable confusion between similar threat types
- LLM Enhancement: Optional rewriting with local llama.cpp models
- Typos & Abbreviations: Realistic SOC analyst writing patterns
- Checkpointing: Resume interrupted generations
- Monitoring: Real-time progress tracking and ETA calculation
# Basic generation (1000 incidents)
python generator/generate_cyber_incidents.py --n-events 1000
# With LLM enhancement
python generator/generate_cyber_incidents.py \
--n-events 5000 \
--use-llm \
--rewrite-report audit.json
# Production mode with monitoring
./generator/launch_generator.sh 50000 my_dataset
./generator/monitor_generation.sh my_dataset --watch

See docs/data-and-generator.md for schema details, customization options, and advanced configuration.
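Once the CSV exists (auto-downloaded or generated), a quick schema check with pandas looks like this (the narrative column names come from the schema notes above):

```python
import pandas as pd

df = pd.read_csv("data/cyber_incidents_simulated.csv")
print(df.shape)
print(df.columns.tolist())

# Peek at the multiple narrative perspectives for a few incidents
narrative_cols = ["description", "description_short", "description_user_report", "short_log"]
print(df[narrative_cols].head())
```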
- Text Representation: TF-IDF (unigrams + bigrams, ~5k features, stopword removal)
- Baseline Classifier: Logistic Regression with class balancing (`max_iter=2000`)
- Uncertainty Handling: Configurable confidence threshold with `uncertain` fallback
- Additional Experiments: Linear SVM, Random Forest, ensemble methods
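A minimal scikit-learn reconstruction of that baseline, using the hyperparameters described above (`texts` and `labels` are assumed to hold cleaned incident descriptions and their categories; the real training code lives in the notebooks):

```python
import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# texts / labels: cleaned incident descriptions and their incident categories (assumed inputs)
vectorizer = TfidfVectorizer(ngram_range=(1, 2), max_features=5000, stop_words="english")
X_train = vectorizer.fit_transform(texts)

clf = LogisticRegression(class_weight="balanced", max_iter=2000)
clf.fit(X_train, labels)

# Persist artifacts in the same spirit as the models/ layout shown below
joblib.dump(vectorizer, "models/vectorizer.joblib")
joblib.dump(clf, "models/baseline_logreg.joblib")
```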
models/
├── vectorizer.joblib # TF-IDF vectorizer
├── baseline_logreg.joblib # Trained classifier
├── X_train_tfidf.joblib # Cached train features
├── X_test_tfidf.joblib # Cached test features
├── y_train.joblib # Train labels
└── y_test.joblib # Test labels
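Loading those artifacts and applying the uncertainty threshold is equally small; a sketch using the file names above and the CLI's default threshold of 0.50 (input cleaning is omitted here):

```python
import joblib

vectorizer = joblib.load("models/vectorizer.joblib")
clf = joblib.load("models/baseline_logreg.joblib")

def classify(text: str, threshold: float = 0.50):
    """Return (label, confidence), falling back to 'uncertain' below the threshold."""
    probs = clf.predict_proba(vectorizer.transform([text]))[0]
    confidence = float(probs.max())
    label = clf.classes_[probs.argmax()] if confidence >= threshold else "uncertain"
    return label, confidence

print(classify("Server began encrypting shared folders."))
```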
The repository includes 10 comprehensive Jupyter notebooks covering the complete ML pipeline from exploration to production-ready hybrid models:
- 01_explore_dataset.ipynb - Dataset exploration & quality assessment (EDA with enhanced visualizations)
- 02_prepare_text_and_features.ipynb - Feature engineering & text preprocessing (TF-IDF pipeline)
- 03_baseline_model.ipynb - Logistic Regression baseline training (92-95% accuracy, dual confusion matrices)
- 04_model_interpretability.ipynb - Feature importance & coefficient analysis (model explainability)
- 05_inference_and_cli.ipynb - Prediction workflow & CLI testing (with Ctrl+C handling)
- 06_model_visualization_and_insights.ipynb - Performance analysis & comprehensive interpretation
- 07_scenario_based_evaluation.ipynb - Edge case & scenario testing (adversarial examples)
- 08_model_comparison.ipynb - Multi-model benchmarking (LogReg, SVM, Random Forest + comparative heatmaps)
- 09_operational_decision_support.ipynb - Uncertainty & threshold analysis (SOC workflow integration)
- 10_hybrid_model.ipynb - Text + metadata feature fusion (ColumnTransformer, 4-model comparison) NEW
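For a flavor of the text + metadata fusion in notebook 10, a ColumnTransformer can route the `description` column through TF-IDF while encoding categorical metadata alongside it; the `severity`, `source`, and `label` column names below are placeholders, not the dataset's exact schema:

```python
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# df is assumed to be the incidents DataFrame loaded from the generated CSV
features = ColumnTransformer([
    ("text", TfidfVectorizer(ngram_range=(1, 2), max_features=5000), "description"),
    ("meta", OneHotEncoder(handle_unknown="ignore"), ["severity", "source"]),  # placeholder metadata columns
])

hybrid = Pipeline([
    ("features", features),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=2000)),
])
hybrid.fit(df.drop(columns=["label"]), df["label"])
```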
All notebooks enhanced with:
- ✅ Professional visualizations (custom colormaps, dual confusion matrices, grouped bar charts)
- ✅ Comprehensive markdown analysis (results interpretation, deployment readiness, future enhancements)
- ✅ Bug fixes (alignment, duplicate output, parameter errors, preprocessor outputs)
- ✅ Code quality improvements (consistent styling, reproducible seeds, modular cells)
See docs/notebooks.md for detailed features, learning outcomes, and troubleshooting guide.
This project models the ambiguity and uncertainty inherent in real-world SOC (Security Operations Center) operations. Rather than producing "perfect" predictions, the system exhibits realistic behaviors that mirror how human analysts triage incidents under time pressure and incomplete information.
✅ Confident on Clear Cases
- Obvious malware infections
- Textbook phishing attempts
- Direct web attacks
- Clear policy violations
🤔 Uncertain on Ambiguous Edge Cases
- Impossible travel vs. legitimate VPN/sync behavior
- Large file transfers vs. normal business workflows
- Website slowdowns (DDoS vs. maintenance vs. load)
- Credential reuse vs. compromise vs. password manager
🧠 Human-Like Decision Patterns
- Probabilistic reasoning with confidence scores
- `uncertain` fallback when evidence is ambiguous
- Multiple plausible explanations for single symptoms
- Scenario-driven behavior matching SOC reality
The system correctly identifies high-confidence threats while marking ambiguous cases as "uncertain" for human review
Difficulty Mode Demonstrations

📸 More screenshots available in docs/images/
📖 Full CLI documentation: docs/cli.md
- Synthetic Training Only: Model trained exclusively on generated data; real-world accuracy unknown
- No Live Integrations: No SIEM/EDR/SOAR connections; designed for research and learning
- Edge Case Sensitivity: May misclassify novel or unusual incident patterns
- Confidence Calibration: Probability scores are relative, not absolute measures of certainty
- Decision Support Only: Outputs should inform, not replace, human analyst judgment
- Not Production-Ready: Requires evaluation on real data before operational deployment
- Privacy Awareness: LLM features use local models; no data sent to external APIs
- Resource Requirements: LLM generation/inference requires sufficient CPU/RAM
✅ Educational exploration of NLP in cybersecurity
✅ Research prototypes and experimentation
✅ Portfolio demonstrations and skill development
✅ Testing SOC automation concepts
❌ Unsupervised production incident response
❌ Replacing trained security analysts
❌ Critical decision-making without human oversight
More details: docs/limitations.md
This project references MITRE ATT&CK® techniques (T1078, T1190, T1486, T1566, etc.) and incorporates paraphrased descriptions from the public ATT&CK knowledge base for educational and research purposes.
MITRE ATT&CK® and ATT&CK® are registered trademarks of The MITRE Corporation.
ATT&CK data provided by The MITRE Corporation is licensed under:
Creative Commons Attribution-ShareAlike 4.0 International License
🔗 Official source: attack.mitre.org
pip install mkdocs-material
mkdocs serve

Visit http://127.0.0.1:8000 to preview the documentation site locally.

mkdocs gh-deploy --force

Automated deployment via .github/workflows/docs.yml runs on pushes to main and can be triggered manually via Run workflow in the Actions tab.
# Run all tests
pytest tests/ -v
# Run specific test file
pytest tests/test_cli.py -v
# With coverage
pytest tests/ --cov=src/triage --cov-report=html

# Format code
black src/ tests/
# Type checking (if configured)
mypy src/
# Lint
flake8 src/ tests/

- Notebook 10: Hybrid model with text + metadata features ✅ COMPLETED
- Transformer Models: BERT/RoBERTa variants for improved accuracy
- Deep Learning Pipeline: Fine-tuning with MITRE-enriched text
- Active Learning: Interactive labeling interface for model improvement
- Multi-language Support: Non-English incident descriptions
- API Service: REST API for programmatic access
- Docker Deployment: Containerized setup with docker-compose
- Expanded CI/CD: More linters, security scanning, automated releases
- Performance Optimization: Caching, batch processing improvements
- Real-world Evaluation: Testing on publicly available SOC datasets
- Video Tutorials: Setup and usage walkthroughs
- Blog Posts: Deep dives into specific features
- Contribution Guide: Detailed guidelines for contributors
- Example Integrations: Sample SIEM/SOAR connectors
Contributions Welcome! Issues, pull requests, and feedback are encouraged.
Licensed under the Apache License 2.0. See LICENSE for the full text and NOTICE for attribution.
- Primary authorship: Chris Campbell and contributors.
- Retain the project name “AlertSage” and include the NOTICE file in redistributed builds.
- Keep third-party licenses and notices with any vendored assets or code.
- MITRE Corporation for the ATT&CK® framework
- Streamlit for the excellent UI framework
- scikit-learn for machine learning tools
- llama.cpp for local LLM inference
- Open-source cybersecurity community for inspiration and knowledge sharing
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: https://texasbe2trill.github.io/AlertSage/
- Security: See SECURITY.md for reporting vulnerabilities
⭐ If you find this project useful, please consider giving it a star!


