Status: MVP Complete ✅ | Version: 0.1.0 | License: MIT
A desktop application for analyzing homicide data from the Murder Accountability Project (MAP), featuring custom clustering algorithms, advanced filtering, comprehensive testing, and a distinctive forensic-inspired interface.
This Electron + React + Python application enables researchers, journalists, and analysts to explore and analyze 894,636 homicide records spanning 1976-2023, identifying suspicious clusters of unsolved murders that may indicate serial killer activity.
MVP is complete with production-ready code, comprehensive test coverage (90%+ backend, 88% frontend), error handling, and complete documentation. Additional features (Map, Timeline, Statistics visualizations) are in active development.
Frontend:
- Electron 28 - Desktop application framework
- React 18 + TypeScript - UI framework
- Zustand - UI state management
- TanStack Query - Server state & caching
- TanStack Table - Data tables with virtualization
- Vite - Build tool
- Leaflet + React-Leaflet - Map visualization (Phase 2)
- Recharts - Statistical charts
Backend:
- Python 3.11 + FastAPI - REST API
- Pandas 2.1 + NumPy 1.26 - Data processing
- scikit-learn 1.3 - Clustering algorithms
- SQLite - Local database
- Uvicorn - ASGI server
redstring/
├── electron/ # Electron main process (window mgmt, Python bridge)
├── src/ # React renderer process (frontend UI)
│ ├── components/ # UI components
│ ├── stores/ # Zustand state stores
│ ├── services/ # API client layer
│ └── hooks/ # Custom React hooks
├── backend/ # Python FastAPI backend
│ ├── analysis/ # Clustering & similarity algorithms
│ ├── database/ # SQLite schema & queries
│ ├── routes/ # API endpoints
│ └── services/ # Business logic
├── resources/ # Bundled data files
│ ├── data/ # CSV datasets (tracked via Git LFS)
│ └── docs/ # PDF documentation
└── tests/ # Test suites (frontend + backend)
- Node.js: 18.x or higher
- Python: 3.11 or higher
- Git LFS: For large data files
git clone https://github.com/yourusername/redstring.git
cd redstring
npm install
cd backend
python3 -m venv venv
# macOS/Linux:
source venv/bin/activate
# Windows:
venv\Scripts\activate
pip install -r requirements.txt
pip install -r requirements-dev.txt
cd ..
npm run dev
This starts both the Python backend (port 5000) and the Electron frontend concurrently.
Or run separately:
# Terminal 1: Backend
cd backend && source venv/bin/activate
uvicorn main:app --reload --port 5000
# Terminal 2: Frontend
npm run dev:electron
# Lint frontend code
npm run lint
npm run lint:fix
# Format code
npm run format
# Format Python code (in backend/)
black .
isort .
# Run all tests (frontend + backend)
npm test
# Frontend tests only (Vitest + React Testing Library)
npm run test:frontend
# With coverage:
npm run test:frontend -- --coverage
# Backend tests only (pytest)
npm run test:backend
# With coverage:
cd backend && source venv/bin/activate
pytest tests/backend/ -v --cov=. --cov-report=html
Test Coverage:
- Backend: 150+ tests across 6 test files (90-95% coverage)
- Frontend: 182 tests across 10 test files (88% coverage)
npm run build
# macOS DMG
npm run package:mac
# Windows MSI/NSIS
npm run package:win
# Linux AppImage/deb
npm run package:linux
All data files are stored in resources/data/ and tracked via Git LFS:
- Murder Data SHR65 2023.csv (312MB) - Primary dataset, 894,636 homicide records
- UCR 2023 Data.csv (12MB) - Agency-level statistics
- State FIPS Lookout.csv - State FIPS code mapping
- County FIPS Lookout.csv - County FIPS code mapping
- US County Centroids.csv - Geographic coordinates for counties
✅ Data Pipeline
- Import 894,636 records into local SQLite database (<60 seconds)
- Transform and enrich data with FIPS codes and geographic coordinates
- First-run setup with real-time progress indicator
- Automated data validation and error handling
✅ Advanced Filtering (14 Filter Types)
- Primary filters: Case status, year range (1976-2023), 51 jurisdictions (50 states plus the District of Columbia)
- Demographics: Victim/offender age, sex, race, ethnicity
- Crime details: 18 weapon types, 28 relationship types, circumstances, situations
- Geographic: County and MSA filtering
- Search: Case ID exact match, agency name substring search
- Auto-apply with 300ms debounce for performance
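As a rough illustration of how filters like these can be combined into a single parameterized SQLite query, here is a minimal sketch. The cases table and its column names (solved, year, state, agency) are hypothetical stand-ins for the real schema under backend/database/, which may differ.

```python
from typing import Any


def build_case_query(filters: dict[str, Any]) -> tuple[str, list[Any]]:
    """Build a parameterized SELECT from a dict of optional filters.

    Table and column names here are hypothetical, not the project's schema.
    """
    clauses: list[str] = []
    params: list[Any] = []

    if "solved" in filters:  # case status
        clauses.append("solved = ?")
        params.append(1 if filters["solved"] else 0)
    if "year_range" in filters:  # inclusive year range
        start, end = filters["year_range"]
        clauses.append("year BETWEEN ? AND ?")
        params.extend([start, end])
    if filters.get("states"):  # multi-select list of states
        placeholders = ", ".join("?" for _ in filters["states"])
        clauses.append(f"state IN ({placeholders})")
        params.extend(filters["states"])
    if filters.get("agency"):  # agency name substring search
        clauses.append("agency LIKE ?")
        params.append(f"%{filters['agency']}%")

    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    return f"SELECT * FROM cases{where}", params
```

The returned SQL string and parameter list can be passed directly to sqlite3's cursor.execute(sql, params). Keeping user input in the parameter list rather than in the SQL text avoids injection problems.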
✅ Custom Clustering Algorithm
- County-based geographic clustering
- Multi-factor similarity scoring with 6 weighted factors:
  - Geographic proximity (35%)
  - Weapon match (25%)
  - Victim sex (20%)
  - Victim age (10%)
  - Temporal proximity (7%)
  - Victim race (3%)
- Dataset tier system: Tier 1 (under 10k cases, runs instantly), Tier 2 (10k-50k cases, runs with an estimate), Tier 3 (over 50k cases, requires filtering first)
- Configurable detection thresholds (min cluster size, max solve rate)
- Connected component detection using DFS
- Note: Cluster UI currently shows "Coming Soon" placeholder while being refined
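The pieces listed above (weighted scoring, tiers, thresholds, DFS components) can be sketched together as follows. This is an illustration under stated assumptions, not the project's implementation. The weights and tier boundaries come from the list above. The Case fields, the per-factor scoring rules, the 20-year and 10-year decay windows, and the default threshold, min_size, and max_solve_rate values are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations

# Weights from the feature list; they sum to 1.0.
WEIGHTS = {
    "geographic": 0.35,
    "weapon": 0.25,
    "victim_sex": 0.20,
    "victim_age": 0.10,
    "temporal": 0.07,
    "victim_race": 0.03,
}


@dataclass(frozen=True)
class Case:  # hypothetical record shape
    case_id: str
    county: str
    weapon: str
    victim_sex: str
    victim_age: int
    victim_race: str
    year: int
    solved: bool


def dataset_tier(n_cases: int) -> int:
    """Map a result-set size to the tier described above."""
    if n_cases < 10_000:
        return 1
    return 2 if n_cases <= 50_000 else 3


def similarity(a: Case, b: Case) -> float:
    """Weighted similarity in [0, 1]; per-factor rules are assumptions."""
    factors = {
        "geographic": 1.0 if a.county == b.county else 0.0,
        "weapon": 1.0 if a.weapon == b.weapon else 0.0,
        "victim_sex": 1.0 if a.victim_sex == b.victim_sex else 0.0,
        # Linear decay to 0 at a 20-year age gap (assumed window).
        "victim_age": max(0.0, 1.0 - abs(a.victim_age - b.victim_age) / 20),
        # Linear decay to 0 at a 10-year gap between cases (assumed window).
        "temporal": max(0.0, 1.0 - abs(a.year - b.year) / 10),
        "victim_race": 1.0 if a.victim_race == b.victim_race else 0.0,
    }
    return sum(WEIGHTS[name] * score for name, score in factors.items())


def find_clusters(cases, threshold=0.7, min_size=3, max_solve_rate=0.33):
    """Link similar cases, then extract connected components with iterative DFS."""
    graph = defaultdict(set)
    for a, b in combinations(cases, 2):
        if similarity(a, b) >= threshold:
            graph[a.case_id].add(b.case_id)
            graph[b.case_id].add(a.case_id)

    by_id = {c.case_id: c for c in cases}
    seen, clusters = set(), []
    for start in by_id:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(by_id[node])
            for neighbor in graph[node] - seen:
                seen.add(neighbor)
                stack.append(neighbor)
        solve_rate = sum(c.solved for c in component) / len(component)
        if len(component) >= min_size and solve_rate <= max_solve_rate:
            clusters.append(component)
    return clusters
```

Comparing every pair of cases grows quadratically with the number of cases, which is one plausible reason to gate large result sets behind a tier check before running the analysis.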
✅ Case Similarity ("Find Similar")
- 7-factor weighted similarity scoring:
  - Weapon type (30%)
  - Geographic proximity (25%)
  - Victim age (20%)
  - Temporal proximity (15%)
  - Victim race (5%)
  - Circumstance (3%)
  - Relationship (2%)
- Haversine distance calculation for geographic scoring
- Same victim sex matching as baseline filter
- Factor breakdown display in UI
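To make the scoring above concrete, here is a minimal sketch of a Haversine distance function and a per-factor breakdown. The weights are taken from the list above. The dictionary keys, the 500 km distance cutoff, and the age and year decay windows are assumptions made for illustration only.

```python
import math
from typing import Optional

EARTH_RADIUS_KM = 6371.0

# Weights from the feature list; they sum to 1.0.
SIMILARITY_WEIGHTS = {
    "weapon": 0.30,
    "geographic": 0.25,
    "victim_age": 0.20,
    "temporal": 0.15,
    "victim_race": 0.05,
    "circumstance": 0.03,
    "relationship": 0.02,
}


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def factor_breakdown(target: dict, candidate: dict, max_km: float = 500.0) -> Optional[dict]:
    """Return each factor's weighted contribution, or None if victim sex differs.

    Field names and decay windows are hypothetical.
    """
    if target["victim_sex"] != candidate["victim_sex"]:
        return None  # baseline filter: only same-sex victims are compared

    distance = haversine_km(target["lat"], target["lon"], candidate["lat"], candidate["lon"])
    raw = {
        "weapon": float(target["weapon"] == candidate["weapon"]),
        "geographic": max(0.0, 1.0 - distance / max_km),
        "victim_age": max(0.0, 1.0 - abs(target["victim_age"] - candidate["victim_age"]) / 20),
        "temporal": max(0.0, 1.0 - abs(target["year"] - candidate["year"]) / 10),
        "victim_race": float(target["victim_race"] == candidate["victim_race"]),
        "circumstance": float(target["circumstance"] == candidate["circumstance"]),
        "relationship": float(target["relationship"] == candidate["relationship"]),
    }
    return {name: SIMILARITY_WEIGHTS[name] * score for name, score in raw.items()}
```

Summing the values of the returned dictionary gives the overall similarity score, and the individual entries are the kind of data a factor breakdown display could show.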
✅ Export Capabilities
- Export filtered case results to CSV
- Export cluster analysis results with all case details
- Proper CSV escaping for all fields
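For reference, this is what proper CSV escaping looks like when delegated to Python's standard csv module. It is a sketch of the technique rather than the application's export code, and the column names in the usage note are hypothetical.

```python
import csv
import io


def cases_to_csv(rows: list[dict], columns: list[str]) -> str:
    """Serialize rows to CSV text, quoting fields that contain
    commas, double quotes, or line breaks."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys that are not in the column list;
    # keys missing from a row are written as empty strings.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, a row such as {"id": "1", "agency": 'Smith, "Joe" PD'} has its agency field wrapped in quotes, with the inner quotes doubled, so spreadsheet tools read it back as a single cell.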
✅ Forensic Minimalism Design
- Lab Mode (Light): Clean, clinical, high-contrast for daylight analysis
- Evidence Room (Dark): Low-light, focused environment for night work
- Distinctive theme toggle (no generic sun/moon icons)
- IBM Plex Mono typography for data-heavy content
- Smooth transitions and accessibility support
✅ Virtualized Data Tables
- TanStack Table with TanStack Virtual for 50k+ row support
- Infinite scroll with auto-fetch
- Smooth 60fps scrolling performance
- Row selection and sorting
✅ Comprehensive Error Handling
- React ErrorBoundary with user-friendly fallback UI
- Smart retry logic with exponential backoff
- User-friendly error messages (no technical jargon)
- Detailed logging to rotating log files
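The retry behavior described above can be illustrated with a small helper. This is a generic sketch of exponential backoff, not the application's code. The function name, default delays, and the choice of which exceptions count as retryable are all assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    jitter: float = 0.1,
    retry_on: tuple = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying transient failures with doubling delays.

    Exceptions not listed in retry_on propagate immediately.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # 0.5s, 1s, 2s, ... capped at max_delay
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # A little random jitter keeps retries from synchronizing.
            sleep(delay + random.uniform(0, delay * jitter))
    raise AssertionError("unreachable")
```

Passing sleep as a parameter makes the helper easy to test without real waiting, and restricting retry_on to transient errors keeps it from retrying failures that will never succeed, such as validation errors.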
✅ Comprehensive Testing
- Backend: 150+ tests, 90-95% coverage (pytest)
  - Database schema and query tests
  - Data transformation and loading tests
  - Clustering algorithm tests
  - API endpoint tests (cases, clusters, setup)
- Frontend: 182 tests, 88% coverage (Vitest + React Testing Library)
  - Component tests (filters, tables, modals)
  - Store tests (Zustand state management)
  - Hook tests (TanStack Query)
  - Utility tests (CSV export, error handling)
✅ Code Quality Automation
- Pre-commit hooks with husky + lint-staged
- Auto-fix TypeScript/React with ESLint + Prettier
- Auto-format Python with Black + isort
- TypeScript strict mode enabled
- Zero compilation errors
✅ Complete Documentation
- DEVELOPMENT.md: Complete developer guide with setup, architecture, testing, and contributing guidelines
- API.md: Full REST API reference with request/response examples
🚧 Map Visualization
- Interactive geographic visualization using Leaflet + React-Leaflet
- County aggregation with choropleth layers
- Case markers with clustering
- Color coding by solve rate, case count, or other metrics
- Backend: /api/map/* endpoints implemented
- Frontend: Components in src/components/map/
🚧 Timeline Visualization
- Temporal analysis of cases over time using Recharts
- Year/month/decade aggregation
- Trend analysis with moving averages
- Backend: /api/timeline/* endpoints implemented
- Frontend: Components in src/components/timeline/
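As an illustration of the yearly aggregation and moving-average trend line mentioned above, here is a minimal sketch. The function names and the trailing-window choice are assumptions, and the timeline endpoints may compute this differently.

```python
from collections import Counter
from typing import Iterable, Optional


def yearly_counts(years: Iterable[int]) -> dict[int, int]:
    """Count cases per year, filling years with no cases with 0."""
    counts = Counter(years)
    if not counts:
        return {}
    return {y: counts.get(y, 0) for y in range(min(counts), max(counts) + 1)}


def moving_average(values: list[float], window: int = 5) -> list[Optional[float]]:
    """Trailing moving average; positions without a full window are None."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averages: list[Optional[float]] = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        averages.append(running / window if i >= window - 1 else None)
    return averages
```

Filling empty years with zero before averaging matters here, because otherwise the window would silently span more calendar years than intended.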
🚧 Statistics Dashboard
- Comprehensive metrics and charts
- Demographic breakdowns (victim sex, race, age)
- Weapon distribution analysis
- Seasonal patterns
- Geographic distribution
- Backend: /api/statistics/* endpoints implemented
- Frontend: Components in src/components/statistics/
🔧 Cluster UI Refinement
- Currently showing "Coming Soon" placeholder
- Algorithm implemented and working
- UI being refined for better user experience
| Operation | Target | Status |
|---|---|---|
| Database setup (894,636 records) | < 60 seconds | ✅ Met |
| Single filter query | < 500ms | ✅ Met |
| Multi-filter query (3-5 filters) | < 2 seconds | ✅ Met |
| Cluster analysis | < 5 seconds | ✅ Met |
| Table rendering (50k+ rows) | Smooth 60fps | ✅ Met |
| Map aggregation | < 2 seconds | 🚧 Phase 2 |
| Timeline aggregation | < 1 second | 🚧 Phase 2 |
| Statistics dashboard | < 2 seconds | 🚧 Phase 2 |
- ✅ Frontend test coverage: 88% (target: 80%+)
- ✅ Backend test coverage: 90-95% (target: 90%+)
- ✅ TypeScript strict mode enabled
- ✅ All code linted and formatted
- ✅ Zero ESLint/TypeScript errors
- ✅ Pre-commit hooks enforcing quality
Comprehensive documentation is available:
| Document | Description |
|---|---|
| CLAUDE.md | Quick reference for AI assistants and project overview |
| docs/DEVELOPMENT.md | Complete developer guide (setup, architecture, testing) |
| docs/API.md | Full REST API reference with examples |
| redstring PRD.md | Complete product requirements document |
✅ MVP (COMPLETE)
- Foundation and project setup
- Electron + Python bridge
- Database and data pipeline
- API and frontend
- Clustering algorithm with tier system
- Case similarity "Find Similar" feature
- Testing, theming, error handling, and documentation
🚧 Visualization Features (IN PROGRESS)
- Map visualization with Leaflet
- Timeline visualization with Recharts
- Statistics dashboard
- Cluster UI refinement
🔮 Future Features
- Radius-based clustering
- Custom weight configuration UI
- Saved analyses
The backend provides the following API endpoint groups:
| Endpoint Group | Description | Status |
|---|---|---|
| /health | Backend health check | ✅ Complete |
| /api/setup/* | Database initialization | ✅ Complete |
| /api/cases/* | Case queries and details | ✅ Complete |
| /api/clusters/* | Cluster analysis (with preflight) | ✅ Complete |
| /api/similarity/* | Case similarity search | ✅ Complete |
| /api/map/* | Map aggregation data | 🚧 In Progress |
| /api/timeline/* | Timeline aggregation | 🚧 In Progress |
| /api/statistics/* | Statistics dashboard | 🚧 In Progress |
See docs/API.md for complete API documentation.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Run tests (npm test)
- Commit with conventional commits (git commit -m 'feat: add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
See docs/DEVELOPMENT.md for detailed contributing guidelines.
MIT
Data provided by the Murder Accountability Project.
Built with:
- Electron 28 + React 18 + TypeScript
- Python 3.11 + FastAPI + SQLite
- TanStack Query + Zustand + TanStack Table
- Leaflet + React-Leaflet (maps)
- Recharts (charts)
- IBM Plex Mono typography
- Forensic Minimalism design aesthetic