A modern Python 3.13+ microservices platform for transforming the complete Discogs music database into powerful, queryable knowledge graphs and analytics engines.
π Quick Start | π Documentation | π― Features | π¬ Community | π Emoji Guide
Discogsography transforms monthly Discogs data dumps (50GB+ compressed XML) into:
- π Neo4j Graph Database: Navigate complex music industry relationships
- π PostgreSQL Database: High-performance queries and full-text search
- π€ AI Discovery Engine: Intelligent recommendations and analytics
- π Real-time Dashboard: Monitor system health and processing metrics
Perfect for music researchers, data scientists, developers, and music enthusiasts who want to explore the world's largest music database.
| Service | Purpose | Key Technologies |
|---|---|---|
| π₯ Python Extractor | Downloads & processes Discogs XML dumps | asyncio, orjson, aio-pika |
| ⚡ Rust Extractor | High-throughput XML processing in Rust | tokio, quick-xml, lapin |
| π Graphinator | Builds Neo4j knowledge graphs | neo4j-driver, graph algorithms |
| π Tableinator | Creates PostgreSQL analytics tables | psycopg3, JSONB, full-text search |
| π΅ Discovery | AI-powered music intelligence | sentence-transformers, plotly, networkx |
| π Dashboard | Real-time system monitoring | FastAPI, WebSocket, reactive UI |
graph TD
S3[("π Discogs S3<br/>Monthly Data Dumps<br/>~50GB XML")]
PYEXT[["π₯ Python Extractor<br/>XML β JSON<br/>Deduplication"]]
RSEXT[["⚡ Rust Extractor<br/>High-Performance<br/>XML Processing"]]
RMQ{{"π° RabbitMQ<br/>Message Broker<br/>4 Queues"}}
NEO4J[("π Neo4j<br/>Graph Database<br/>Relationships")]
PG[("π PostgreSQL<br/>Analytics DB<br/>Full-text Search")]
GRAPH[["π Graphinator<br/>Graph Builder"]]
TABLE[["π Tableinator<br/>Table Builder"]]
DASH[["π Dashboard<br/>Real-time Monitor<br/>WebSocket"]]
DISCO[["π΅ Discovery<br/>AI Engine<br/>ML Models"]]
S3 -->|1a. Download & Parse| PYEXT
S3 -->|1b. Download & Parse| RSEXT
PYEXT -->|2. Publish Messages| RMQ
RSEXT -->|2. Publish Messages| RMQ
RMQ -->|3a. Artists/Labels/Releases/Masters| GRAPH
RMQ -->|3b. Artists/Labels/Releases/Masters| TABLE
GRAPH -->|4a. Build Graph| NEO4J
TABLE -->|4b. Store Data| PG
DISCO -.->|Query| NEO4J
DISCO -.->|Query| PG
DISCO -.->|Analyze| DISCO
DASH -.->|Monitor| PYEXT
DASH -.->|Monitor| RSEXT
DASH -.->|Monitor| GRAPH
DASH -.->|Monitor| TABLE
DASH -.->|Monitor| DISCO
DASH -.->|Stats| RMQ
DASH -.->|Stats| NEO4J
DASH -.->|Stats| PG
style S3 fill:#e1f5fe,stroke:#01579b,stroke-width:2px
style PYEXT fill:#fff9c4,stroke:#f57c00,stroke-width:2px
style RSEXT fill:#ffccbc,stroke:#d84315,stroke-width:2px
style RMQ fill:#fff3e0,stroke:#e65100,stroke-width:2px
style NEO4J fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
style PG fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
style DASH fill:#fce4ec,stroke:#880e4f,stroke-width:2px
style DISCO fill:#e3f2fd,stroke:#0d47a1,stroke-width:2px
- ⚡ High-Speed Processing: 5,000-10,000 records/second XML parsing
- π Smart Deduplication: SHA256 hash-based change detection prevents reprocessing
- π Handles Big Data: Processes 15M+ releases, 2M+ artists efficiently
- π― Concurrent Processing: Multi-threaded parsing with async message handling
- π Auto-Recovery: Automatic retries with exponential backoff
- πΎ Message Durability: RabbitMQ persistence with dead letter queues
- π₯ Health Monitoring: HTTP health checks for all services
- π Real-time Metrics: WebSocket dashboard with live updates
- π Container Security: Non-root users, read-only filesystems, dropped capabilities
- π Code Security: Bandit scanning, secure defaults, parameterized queries
- π Type Safety: Full type hints with strict mypy validation
- ✅ Comprehensive Testing: Unit, integration, and E2E tests with Playwright
- π§ ML-Powered Discovery: Semantic search using sentence transformers
- π Industry Analytics: Genre trends, label insights, market analysis
- π Graph Algorithms: PageRank, community detection, path finding
- π¨ Interactive Visualizations: Plotly charts, vis.js network graphs
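The SHA256-based change detection called out above can be sketched as follows. This is a simplified illustration, not the services' actual code: `record_hash`, `should_process`, and the in-memory `seen` map are hypothetical names, and the real services persist hashes alongside the stored data.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """Stable SHA256 over a canonical JSON encoding of the record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def should_process(record: dict, seen: dict[str, str]) -> bool:
    """Skip records whose content hash is unchanged since the last dump."""
    h = record_hash(record)
    if seen.get(record["id"]) == h:
        return False  # unchanged content -> skip reprocessing
    seen[record["id"]] = h
    return True


seen: dict[str, str] = {}
artist = {"id": "123", "name": "Pink Floyd"}
print(should_process(artist, seen))  # new record -> True
print(should_process(artist, seen))  # identical content -> False
```

Because the hash covers a canonical encoding, key order in the source XML-to-JSON conversion does not cause spurious reprocessing.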
| Document | Purpose |
|---|---|
| CLAUDE.md | π€ Claude Code integration guide & development standards |
| Documentation Index | π Complete documentation directory with all guides |
| GitHub Actions Guide | π CI/CD workflows, automation & best practices |
| Task Automation | ⚡ Complete taskipy command reference |
| Document | Purpose |
|---|---|
| Monorepo Guide | π¦ Managing Python monorepo with shared dependencies |
| Testing Guide | π§ͺ Comprehensive testing strategies and patterns |
| Logging Guide | π Structured logging standards and practices |
| Python Version Management | π Managing Python 3.13+ across the project |
| Document | Purpose |
|---|---|
| Docker Security | π Container hardening & security practices |
| Dockerfile Standards | π Best practices for writing Dockerfiles |
| Database Resilience | πΎ Database connection patterns & error handling |
| Performance Guide | ⚡ Performance optimization strategies |
| Document | Purpose |
|---|---|
| Consumer Cancellation | π File completion and consumer lifecycle |
| Platform Targeting | π― Cross-platform compatibility |
| Emoji Guide | π Standardized emoji usage |
| Recent Improvements | π Latest platform enhancements |
| Service Guides | π Individual README for each service |
| Requirement | Minimum | Recommended | Notes |
|---|---|---|---|
| Python | 3.13+ | Latest | Install via uv |
| Docker | 20.10+ | Latest | With Docker Compose v2 |
| Storage | 100GB | 200GB SSD | For data + processing |
| Memory | 8GB | 16GB+ | More RAM = faster processing |
| Network | 10 Mbps | 100 Mbps+ | Initial download ~50GB |
# 1. Clone and navigate to the repository
git clone https://github.com/SimplicityGuy/discogsography.git
cd discogsography
# 2. Copy environment template (optional - has sensible defaults)
cp .env.example .env
# 3. Start all services (default: Python Extractor)
docker-compose up -d
# 3b. (Optional) Use high-performance Rust Extractor instead
./scripts/switch-extractor.sh rust
# To switch back to Python Extractor: ./scripts/switch-extractor.sh python
# 4. Watch the magic happen!
docker-compose logs -f
# 5. Access the dashboard
open http://localhost:8003

| Service | URL | Default Credentials | Purpose |
|---|---|---|---|
| π Dashboard | http://localhost:8003 | None | System monitoring |
| π΅ Discovery | http://localhost:8005 | None | AI music discovery |
| π° RabbitMQ | http://localhost:15672 | discogsography / discogsography | Queue management |
| π Neo4j | http://localhost:7474 | neo4j / discogsography | Graph exploration |
| π PostgreSQL | localhost:5433 | discogsography / discogsography | Database access |
# 1. Install uv (10-100x faster than pip)
curl -LsSf https://astral.sh/uv/install.sh | sh
# 2. Install just (task runner)
brew install just # macOS
# or: cargo install just
# or: https://just.systems/install.sh
# 3. Install all dependencies
just install
# 4. Set up pre-commit hooks
just init
# 5. Run any service
just dashboard # Monitoring UI
just discovery # AI discovery
just pyextractor # Python data ingestion
just rustextractor-run # Rust data ingestion (requires cargo)
just graphinator # Neo4j builder
just tableinator # PostgreSQL builder

Create a .env file or export variables:
# Core connections
export AMQP_CONNECTION="amqp://guest:guest@localhost:5672/"
# Neo4j settings
export NEO4J_ADDRESS="bolt://localhost:7687"
export NEO4J_USERNAME="neo4j"
export NEO4J_PASSWORD="password"
# PostgreSQL settings
export POSTGRES_ADDRESS="localhost:5433"
export POSTGRES_USERNAME="postgres"
export POSTGRES_PASSWORD="password"
export POSTGRES_DATABASE="discogsography"

All configuration is managed through environment variables. Copy .env.example to .env:

cp .env.example .env

| Variable | Description | Default | Used By |
|---|---|---|---|
| `AMQP_CONNECTION` | RabbitMQ URL | `amqp://guest:guest@localhost:5672/` | All services |
| `DISCOGS_ROOT` | Data storage path | `/discogs-data` | Python/Rust Extractors |
| `PERIODIC_CHECK_DAYS` | Update check interval | `15` | Python/Rust Extractors |
| `PYTHON_VERSION` | Python version for builds | `3.13` | Docker, CI/CD |
| Variable | Description | Default | Used By |
|---|---|---|---|
| `NEO4J_ADDRESS` | Neo4j bolt URL | `bolt://localhost:7687` | Graphinator, Dashboard, Discovery |
| `NEO4J_USERNAME` | Neo4j username | `neo4j` | Graphinator, Dashboard, Discovery |
| `NEO4J_PASSWORD` | Neo4j password | Required | Graphinator, Dashboard, Discovery |
| `POSTGRES_ADDRESS` | PostgreSQL host:port | `localhost:5432` | Tableinator, Dashboard, Discovery |
| `POSTGRES_USERNAME` | PostgreSQL username | `postgres` | Tableinator, Dashboard, Discovery |
| `POSTGRES_PASSWORD` | PostgreSQL password | Required | Tableinator, Dashboard, Discovery |
| `POSTGRES_DATABASE` | Database name | `discogsography` | Tableinator, Dashboard, Discovery |
| Variable | Description | Default | Used By |
|---|---|---|---|
| `CONSUMER_CANCEL_DELAY` | Seconds before canceling idle consumers after file completion | `300` (5 min) | Graphinator, Tableinator |
| `QUEUE_CHECK_INTERVAL` | Seconds between queue checks when all consumers are idle | `3600` (1 hr) | Graphinator, Tableinator |
π Note: The consumer management system implements smart connection lifecycle management:
- Automatic Idle Detection: When all consumers complete processing, RabbitMQ connections are automatically closed to conserve resources
- Periodic Queue Checking: Every `QUEUE_CHECK_INTERVAL` seconds, the service briefly connects to check for new messages in all queues
- Auto-Reconnection: When new messages are detected, connections are re-established and consumers restart automatically
- Silent When Idle: Progress logging stops when all queues are complete to reduce log noise
This ensures efficient resource usage while maintaining automatic responsiveness to new data.
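The lifecycle described above can be sketched with asyncio. This is an illustrative simulation only: `queue_has_messages` stands in for a real broker check, the loop is capped at a few cycles, and the actual services manage real RabbitMQ connections.

```python
import asyncio
from typing import Callable


async def consumer_lifecycle(
    queue_has_messages: Callable[[], bool],
    check_interval: float,
    events: list[str],
) -> None:
    """Stay disconnected while queues are empty; reconnect when messages appear."""
    for _ in range(3):  # a few cycles for the sketch; real services loop forever
        if queue_has_messages():
            events.append("connect")     # re-establish connection, start consumers
            events.append("consume")     # process until all queues are drained
            events.append("disconnect")  # idle detected -> close to save resources
        else:
            events.append("idle")        # silent when idle; check again later
        await asyncio.sleep(check_interval)


events: list[str] = []
flags = iter([False, True, False])  # simulated queue states per check
asyncio.run(consumer_lifecycle(lambda: next(flags), 0.01, events))
print(events)  # ['idle', 'connect', 'consume', 'disconnect', 'idle']
```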
| Data Type | Record Count | XML Size | Processing Time |
|---|---|---|---|
| π Releases | ~15 million | ~40GB | 1-3 hours |
| π€ Artists | ~2 million | ~5GB | 15-30 mins |
| π΅ Masters | ~2 million | ~3GB | 10-20 mins |
| π’ Labels | ~1.5 million | ~2GB | 10-15 mins |
π Total: ~20 million records • 50GB compressed • 100GB processed
Once your data is loaded, explore the music universe through powerful queries and AI-driven insights.
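The Cypher examples in this section can also be run programmatically. Below is a minimal sketch: `releases_by_artist_query` is a hypothetical helper that builds a parameterized query, and the commented driver calls show how it would execute with the official `neo4j` Python driver (credentials as in the configuration tables).

```python
def releases_by_artist_query(artist_name: str, limit: int = 10) -> tuple[str, dict]:
    """Build a parameterized Cypher query; parameters avoid injection issues."""
    query = (
        "MATCH (a:Artist {name: $name})-[:BY]-(r:Release) "
        "RETURN r.title AS title, r.year AS year "
        "ORDER BY r.year LIMIT $limit"
    )
    return query, {"name": artist_name, "limit": limit}


query, params = releases_by_artist_query("Pink Floyd")
print(params["name"])  # Pink Floyd

# With a running Neo4j instance this would execute as (not run here):
# from neo4j import GraphDatabase
# driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
# with driver.session() as session:
#     for record in session.run(query, params):
#         print(record["title"], record["year"])
```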
Navigate the interconnected world of music with Cypher queries:
MATCH (a:Artist {name: "Pink Floyd"})-[:BY]-(r:Release)
RETURN r.title, r.year
ORDER BY r.year
LIMIT 10

MATCH (member:Artist)-[:MEMBER_OF]->(band:Artist {name: "The Beatles"})
RETURN member.name, member.real_name

MATCH (r:Release)-[:ON]->(l:Label {name: "Blue Note"})
WHERE r.year >= 1950 AND r.year <= 1970
RETURN r.title, r.artist, r.year
ORDER BY r.year

MATCH (a1:Artist {name: "Miles Davis"})-[:COLLABORATED_WITH]-(a2:Artist)
RETURN DISTINCT a2.name
ORDER BY a2.name

Fast structured queries on denormalized data:
SELECT
data->>'title' as title,
data->>'artist' as artist,
data->>'year' as year
FROM releases
WHERE data->>'title' ILIKE '%dark side%'
ORDER BY (data->>'year')::int DESC
LIMIT 10;

SELECT
data->>'title' as title,
data->>'year' as year,
data->'genres' as genres
FROM releases
WHERE data->>'artist' = 'Miles Davis'
AND (data->>'year')::int BETWEEN 1950 AND 1960
ORDER BY (data->>'year')::int;

SELECT
genre,
COUNT(*) as release_count,
MIN((data->>'year')::int) as first_release,
MAX((data->>'year')::int) as last_release
FROM releases,
jsonb_array_elements_text(data->'genres') as genre
GROUP BY genre
ORDER BY release_count DESC
LIMIT 20;

Access the real-time monitoring dashboard at http://localhost:8003:
- Service Health: Live status of all microservices
- Queue Metrics: Message rates, depths, and consumer counts
- Database Stats: Connection pools and storage usage
- Activity Log: Recent system events and processing updates
- WebSocket Updates: Real-time data without page refresh
Monitor and debug your system with built-in tools:
# Check service logs for errors
uv run task check-errors
# Monitor RabbitMQ queues in real-time
uv run task monitor
# Comprehensive system health dashboard
uv run task system-monitor
# View logs for all services
uv run task logs

Each service provides detailed telemetry:
- Processing Rates: Records/second for each data type
- Queue Health: Depth, consumer count, throughput
- Error Tracking: Failed messages, retry counts
- Performance: Processing time, memory usage
- Stall Detection: Alerts when processing stops
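A processing-rate and stall check of the kind listed above might look like this. It is an illustrative sketch, not the services' actual telemetry code; `RateTracker` is a hypothetical name, and the real monitoring tracks far more dimensions.

```python
import time


class RateTracker:
    """Track records/second since startup and flag stalled processing."""

    def __init__(self, stall_after: float = 30.0) -> None:
        self.count = 0
        self.start = time.monotonic()
        self.last_record = self.start
        self.stall_after = stall_after  # seconds of silence before alerting

    def record(self, n: int = 1) -> None:
        """Register n processed records."""
        self.count += n
        self.last_record = time.monotonic()

    def rate(self) -> float:
        """Average records/second since startup."""
        elapsed = time.monotonic() - self.start
        return self.count / elapsed if elapsed > 0 else 0.0

    def stalled(self) -> bool:
        """True when no record has arrived within the stall window."""
        return time.monotonic() - self.last_record > self.stall_after


tracker = RateTracker(stall_after=30.0)
tracker.record(500)
print(tracker.stalled())  # just recorded -> False
```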
The project leverages cutting-edge Python tooling:
| Tool | Purpose | Configuration |
|---|---|---|
| uv | 10-100x faster package management | pyproject.toml |
| ruff | Lightning-fast linting & formatting | pyproject.toml |
| mypy | Strict static type checking | pyproject.toml |
| bandit | Security vulnerability scanning | pyproject.toml |
| pre-commit | Git hooks for code quality | .pre-commit-config.yaml |
Comprehensive test coverage with multiple test types:
# Run all tests (excluding E2E)
uv run task test
# Run with coverage report
uv run task test-cov
# Run specific test suites
uv run pytest tests/extractor/ # Extractor tests (Python)
uv run pytest tests/graphinator/ # Graphinator tests
uv run pytest tests/tableinator/ # Tableinator tests
uv run pytest tests/dashboard/   # Dashboard tests

# One-time browser setup
uv run playwright install chromium
uv run playwright install-deps chromium
# Run E2E tests (automatic server management)
uv run task test-e2e
# Run with specific browser
uv run pytest tests/dashboard/test_dashboard_ui.py -m e2e --browser firefox

# Setup development environment
uv sync --all-extras
uv run task init # Install pre-commit hooks
# Before committing
just lint # Run linting
just format # Format code
uv run task test # Run tests
just security # Security scan
# Or run everything at once
uv run pre-commit run --all-files

discogsography/
├── π¦ common/               # Shared utilities and configuration
│   ├── config.py             # Centralized configuration management
│   └── health_server.py      # Health check endpoint server
├── π dashboard/             # Real-time monitoring dashboard
│   ├── dashboard.py          # FastAPI backend with WebSocket
│   └── static/               # Frontend HTML/CSS/JS
├── π₯ extractor/            # Data extraction services
│   ├── pyextractor/          # Python-based Discogs data ingestion
│   │   ├── extractor.py      # Main processing logic
│   │   └── discogs.py        # S3 download and validation
│   └── rustextractor/        # Rust-based high-performance extractor
│       ├── src/
│       │   └── main.rs       # Rust processing logic
│       └── Cargo.toml        # Rust dependencies
├── π graphinator/           # Neo4j graph database service
│   └── graphinator.py        # Graph relationship builder
├── π tableinator/           # PostgreSQL storage service
│   └── tableinator.py        # Relational data management
├── π§ utilities/            # Operational tools
│   ├── check_errors.py       # Log analysis
│   ├── monitor_queues.py     # Real-time queue monitoring
│   └── system_monitor.py     # System health dashboard
├── π§ͺ tests/               # Comprehensive test suite
├── π docs/                  # Additional documentation
├── π docker-compose.yml     # Container orchestration
└── π¦ pyproject.toml         # Project configuration
All logger calls (logger.info, logger.warning, logger.error) in this project follow a consistent emoji pattern for visual clarity. Each message starts with an emoji followed by exactly one space before the message text.
| Emoji | Usage | Example |
|---|---|---|
| π | Startup messages | logger.info("π Starting service...") |
| ✅ | Success/completion messages | logger.info("✅ Operation completed successfully") |
| ❌ | Errors | logger.error("❌ Failed to connect to database") |
| ⚠️ | Warnings | logger.warning("⚠️ Connection timeout, retrying...") |
| π | Shutdown/stop messages | logger.info("π Shutting down gracefully") |
| π | Progress/statistics | logger.info("π Processed 1000 records") |
| π₯ | Downloads | logger.info("π₯ Starting download of data") |
| ⬇️ | Downloading files | logger.info("⬇️ Downloading file.xml") |
| π | Processing operations | logger.info("π Processing batch of messages") |
| ⏳ | Waiting/pending | logger.info("⏳ Waiting for messages...") |
| π | Metadata operations | logger.info("π Loaded metadata from cache") |
| π | Checking/searching | logger.info("π Checking for updates...") |
| π | File operations | logger.info("π File created successfully") |
| π | New versions | logger.info("π Found newer version available") |
| ⏰ | Periodic operations | logger.info("⏰ Running periodic check") |
| π§ | Setup/configuration | logger.info("π§ Creating database indexes") |
| π° | RabbitMQ connections | logger.info("π° Connected to RabbitMQ") |
| π | Neo4j connections | logger.info("π Connected to Neo4j") |
| π | PostgreSQL operations | logger.info("π Connected to PostgreSQL") |
| πΎ | Database save operations | logger.info("πΎ Updated artist ID=123 in Neo4j") |
| π₯ | Health server | logger.info("π₯ Health server started on port 8001") |
| ⏩ | Skipping operations | logger.info("⏩ Skipped artist ID=123 (no changes)") |
logger.info("π Starting Discogs data extractor")
logger.error("❌ Failed to connect to Neo4j: connection refused")
logger.warning("⚠️ Slow consumer detected, processing delayed")
logger.info("✅ All files processed successfully")

The graph database models complex music industry relationships:
| Node | Description | Key Properties |
|---|---|---|
| `Artist` | Musicians, bands, producers | id, name, real_name, profile |
| `Label` | Record labels and imprints | id, name, profile, parent_label |
| `Master` | Master recordings | id, title, year, main_release |
| `Release` | Physical/digital releases | id, title, year, country, format |
| `Genre` | Musical genres | name |
| `Style` | Sub-genres and styles | name |
π€ Artist Relationships:
├── MEMBER_OF ────────→ Artist (band membership)
├── ALIAS_OF ─────────→ Artist (alternative names)
├── COLLABORATED_WITH → Artist (collaborations)
└── PERFORMED_ON ─────→ Release (credits)

π Release Relationships:
├── BY ───────────────→ Artist (performer credits)
├── ON ───────────────→ Label (release label)
├── DERIVED_FROM ─────→ Master (master recording)
├── IS ───────────────→ Genre (genre classification)
└── IS ───────────────→ Style (style classification)

π’ Label Relationships:
└── SUBLABEL_OF ──────→ Label (parent/child labels)

π΅ Classification:
└── Style -[:PART_OF]→ Genre (hierarchy)
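A simplified version of the idempotent upsert a graphinator-style consumer might issue for this model is sketched below. `artist_upsert` is a hypothetical helper, not the service's actual code; the real service handles many more relationship types and batches its writes.

```python
def artist_upsert(artist: dict) -> tuple[str, dict]:
    """Build an idempotent Cypher MERGE for an Artist node plus MEMBER_OF edges."""
    query = (
        "MERGE (a:Artist {id: $id}) "
        "SET a.name = $name, a.real_name = $real_name "
        "WITH a UNWIND $members AS member_id "
        "MERGE (m:Artist {id: member_id}) "
        "MERGE (m)-[:MEMBER_OF]->(a)"
    )
    params = {
        "id": artist["id"],
        "name": artist.get("name"),
        "real_name": artist.get("real_name"),
        "members": artist.get("members", []),
    }
    return query, params


query, params = artist_upsert(
    {"id": "82730", "name": "The Beatles", "members": ["1", "2"]}
)
print(len(params["members"]))  # 2
```

Using `MERGE` on the stable Discogs ID keeps re-deliveries and monthly re-runs from creating duplicate nodes or edges.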
Optimized for fast queries and full-text search:
-- Artists table with JSONB for flexible schema
CREATE TABLE artists (
data_id VARCHAR PRIMARY KEY,
hash VARCHAR NOT NULL UNIQUE,
data JSONB NOT NULL
);
CREATE INDEX idx_artists_name ON artists ((data->>'name'));
CREATE INDEX idx_artists_gin ON artists USING GIN (data);
-- Labels table
CREATE TABLE labels (
data_id VARCHAR PRIMARY KEY,
hash VARCHAR NOT NULL UNIQUE,
data JSONB NOT NULL
);
CREATE INDEX idx_labels_name ON labels ((data->>'name'));
-- Masters table
CREATE TABLE masters (
data_id VARCHAR PRIMARY KEY,
hash VARCHAR NOT NULL UNIQUE,
data JSONB NOT NULL
);
CREATE INDEX idx_masters_title ON masters ((data->>'title'));
CREATE INDEX idx_masters_year ON masters ((data->>'year'));
-- Releases table with extensive indexing
CREATE TABLE releases (
data_id VARCHAR PRIMARY KEY,
hash VARCHAR NOT NULL UNIQUE,
data JSONB NOT NULL
);
CREATE INDEX idx_releases_title ON releases ((data->>'title'));
CREATE INDEX idx_releases_artist ON releases ((data->>'artist'));
CREATE INDEX idx_releases_year ON releases ((data->>'year'));
CREATE INDEX idx_releases_gin ON releases USING GIN (data);

Typical processing rates on modern hardware:
| Service | Records/Second | Bottleneck |
|---|---|---|
| π₯ Python Extractor | 5,000-10,000 | XML parsing, I/O |
| ⚡ Rust Extractor | 20,000-400,000+ | Network I/O |
| π Graphinator | 1,000-2,000 | Neo4j transactions |
| π Tableinator | 3,000-5,000 | PostgreSQL inserts |
- CPU: 4 cores
- RAM: 8GB
- Storage: 200GB HDD
- Network: 10 Mbps
- CPU: 8+ cores
- RAM: 16GB+
- Storage: 200GB+ SSD (NVMe preferred)
- Network: 100 Mbps+
Neo4j Configuration:
# neo4j.conf
dbms.memory.heap.initial_size=4g
dbms.memory.heap.max_size=4g
dbms.memory.pagecache.size=2g

PostgreSQL Configuration:
-- postgresql.conf
shared_buffers = 4GB
work_mem = 256MB
maintenance_work_mem = 1GB
effective_cache_size = 12GB

# RabbitMQ prefetch for consumers
PREFETCH_COUNT: 100  # Adjust based on processing speed

- Use SSD/NVMe for the /discogs-data directory
- Enable compression for PostgreSQL tables
- Configure Neo4j for SSD optimization
- Use separate disks for databases if possible
# Check connectivity
curl -I https://discogs-data-dumps.s3.us-west-2.amazonaws.com
# Verify disk space
df -h /discogs-data
# Check permissions
ls -la /discogs-data

Solutions:
- ✅ Ensure internet connectivity
- ✅ Verify 100GB+ free space
- ✅ Check directory permissions
# Check RabbitMQ status
docker-compose ps rabbitmq
docker-compose logs rabbitmq
# Test connection
curl -u discogsography:discogsography http://localhost:15672/api/overview

Solutions:
- ✅ Wait for RabbitMQ startup (30-60s)
- ✅ Check firewall settings
- ✅ Verify credentials in .env
Neo4j:
# Check Neo4j status
docker-compose logs neo4j
curl http://localhost:7474
# Test bolt connection
echo "MATCH (n) RETURN count(n);" | cypher-shell -u neo4j -p discogsography

PostgreSQL:
# Check PostgreSQL status
docker-compose logs postgres
# Test connection
PGPASSWORD=discogsography psql -h localhost -U discogsography -d discogsography -c "SELECT 1;"

1. π Check Service Health

   curl http://localhost:8000/health  # Python/Rust Extractor
   curl http://localhost:8001/health  # Graphinator
   curl http://localhost:8002/health  # Tableinator
   curl http://localhost:8003/health  # Dashboard
   curl http://localhost:8004/health  # Discovery

2. π Monitor Real-time Logs

   # All services
   uv run task logs

   # Specific service
   docker-compose logs -f extractor-python  # For Python Extractor
   docker-compose logs -f extractor-rust    # For Rust Extractor

3. π Analyze Errors

   # Check for errors across all services
   uv run task check-errors

   # Monitor queue health
   uv run task monitor

4. ποΈ Verify Data Storage

   -- Neo4j: Check node counts
   MATCH (n) RETURN labels(n)[0] as type, count(n) as count;

   -- PostgreSQL: Check table counts
   SELECT 'artists' as table_name, COUNT(*) FROM artists
   UNION ALL SELECT 'releases', COUNT(*) FROM releases
   UNION ALL SELECT 'labels', COUNT(*) FROM labels
   UNION ALL SELECT 'masters', COUNT(*) FROM masters;
We welcome contributions! Please follow these guidelines:
1. Fork & Clone

   git clone https://github.com/YOUR_USERNAME/discogsography.git
   cd discogsography

2. Setup Development Environment

   uv sync --all-extras
   uv run task init  # Install pre-commit hooks

3. Create Feature Branch

   git checkout -b feature/amazing-feature

4. Make Changes

   - Write clean, documented code
   - Add comprehensive tests
   - Update relevant documentation

5. Validate Changes

   just lint      # Fix any linting issues
   just test      # Ensure tests pass
   just security  # Check for vulnerabilities

6. Commit with Conventional Commits

   git commit -m "feat: add amazing feature"
   # Types: feat, fix, docs, style, refactor, test, chore

7. Push & Create PR

   git push origin feature/amazing-feature
- Code Style: Follow ruff formatting and linting
- Type Hints: Required for all functions
- Tests: Maintain >80% coverage
- Docs: Update README and docstrings
- Logging: Use emoji conventions (see above)
- Security: Pass bandit checks
Keep dependencies up-to-date with the provided upgrade script:
# Safely upgrade all dependencies (minor/patch versions)
./scripts/upgrade-packages.sh
# Preview what would be upgraded
./scripts/upgrade-packages.sh --dry-run
# Include major version upgrades
./scripts/upgrade-packages.sh --major

The script includes:
- π Automatic backups before upgrades
- ✅ Git safety checks (requires clean working directory)
- π§ͺ Automatic testing after upgrades
- π¦ Comprehensive dependency management across all services
See scripts/README.md for more maintenance scripts.
This project is licensed under the MIT License - see the LICENSE file for details.
- π΅ Discogs for providing the monthly data dumps
- π The Python community for excellent libraries and tools
- π All contributors who help improve this project
- π uv for blazing-fast package management
- π₯ Ruff for lightning-fast linting
- π Bug Reports: GitHub Issues
- π‘ Feature Requests: GitHub Discussions
- π¬ Questions: Discussions Q&A
- π Complete Documentation Index - All guides and references
- π€ CLAUDE.md - AI development guide
- π¦ Service Documentation - README in each service directory
This project is actively maintained. We welcome contributions, bug reports, and feature requests!