PulmoScan: High-Performance Medical Image Analysis

PulmoScan is a production-grade, asynchronous image classification system designed to detect COVID-19 from Chest X-Ray images.

Unlike typical ML demos, PulmoScan focuses on systems engineering and scale. Its pipeline combines Redis caching, Celery batch processing, and ONNX Runtime acceleration to achieve high throughput and low latency.

Key Features

Ultra-Fast Inference: Uses ONNX Runtime to deliver ~4x faster inference (about 10 ms per image) than standard PyTorch CPU execution.
Smart Batch Processing: Automatically groups incoming requests into batches, achieving a ~2x speedup for bulk workloads.
Intelligent Caching: Hashes each uploaded image with SHA-256 and stores the result in Redis, so byte-identical duplicates are served from cache (about 2 ms) without re-running the model.
Asynchronous Architecture: Decoupled architecture using Celery workers ensures the API remains responsive even under heavy load.
S3-Compatible Storage: Integrated with MinIO for scalable object storage of medical imagery.
Data Export: Built-in capability to export batch classification results to CSV for analysis.
Mobile-Ready: Includes quantized (INT8) and standard ONNX models optimized for edge deployment.
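
The caching feature above can be sketched roughly as follows. This is a minimal illustration, not the project's actual service code: the key prefix, the `classify_with_cache` helper, and the result fields are assumptions, and a plain dict stands in for Redis so the snippet runs without a server.

```python
import hashlib
import json
from typing import Callable, MutableMapping


def cache_key(image_bytes: bytes) -> str:
    """Derive a cache key from the exact bytes of the uploaded file."""
    # SHA-256 is a content hash: only byte-identical files share a key.
    # The "pulmoscan:result:" prefix is a hypothetical naming choice.
    return "pulmoscan:result:" + hashlib.sha256(image_bytes).hexdigest()


def classify_with_cache(
    image_bytes: bytes,
    cache: MutableMapping[str, str],
    classify: Callable[[bytes], dict],
) -> dict:
    """Return a cached result if present, otherwise classify and store it."""
    key = cache_key(image_bytes)
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: the model is not called
    result = classify(image_bytes)
    cache[key] = json.dumps(result)  # with Redis this would be a SET call
    return result
```

Because the key is derived from raw bytes, a re-encoded or resized copy of the same X-ray would miss the cache; catching those would need a perceptual hash, which is a different technique.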

System Architecture

The system follows a microservices pattern orchestrated via Docker Compose:

graph LR
    Client([User/Client]) -->|"Upload ZIP/Image"| API[FastAPI Server]
    API -->|"Save File"| MinIO[("MinIO S3")]
    API -->|"Dispatch Task"| Redis[("Redis Broker")]
    
    subgraph Worker Node
        Worker[Celery Worker] -->|"Fetch Task"| Redis
        Worker -->|"Download Image"| MinIO
        Worker -->|"Inference (ONNX)"| Model["MobileNetV3"]
    end
    
    Worker -->|"Save Result"| DB[("PostgreSQL")]
    API -->|"Poll Status"| DB
    API -->|"Get Cache"| Redis
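
To make the decoupling in the diagram concrete, here is a small in-process sketch of the same submit, process, and poll flow. It assumes nothing about the real code: a `queue.Queue` stands in for the Redis broker, a dict for the PostgreSQL jobs table, a thread for the Celery worker, and the status names (`pending`, `completed`, `failed`) are hypothetical.

```python
import queue
import threading
import uuid
from typing import Callable


class InMemoryJobSystem:
    """Toy stand-in for the API -> broker -> worker -> database flow."""

    def __init__(self, classify: Callable[[bytes], dict]) -> None:
        self._classify = classify
        self._tasks: queue.Queue = queue.Queue()  # stands in for the Redis broker
        self._jobs: dict = {}                     # stands in for the jobs table
        self._lock = threading.Lock()
        worker = threading.Thread(target=self._run_worker, daemon=True)
        worker.start()

    def submit(self, image_bytes: bytes) -> str:
        """API side: record the job, enqueue it, and return immediately."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"status": "pending", "result": None}
        self._tasks.put((job_id, image_bytes))
        return job_id

    def status(self, job_id: str) -> dict:
        """API side: poll the stored job record."""
        with self._lock:
            return dict(self._jobs[job_id])

    def wait_idle(self) -> None:
        """Block until every queued task has been processed."""
        self._tasks.join()

    def _run_worker(self) -> None:
        """Worker side: pull tasks, run inference, and save the outcome."""
        while True:
            job_id, image_bytes = self._tasks.get()
            try:
                record = {"status": "completed", "result": self._classify(image_bytes)}
            except Exception as exc:
                record = {"status": "failed", "result": str(exc)}
            with self._lock:
                self._jobs[job_id] = record
            self._tasks.task_done()
```

The point of the pattern is that `submit` returns as soon as the task is queued, so slow inference never blocks the request handler.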

Performance Benchmarks

Tested on a standard local CPU environment (Windows, 4 threads):

Metric              Target      Achieved    Status
Single Inference    < 100 ms    10.30 ms    EXCEEDED
Batch Speedup       2.0x        1.97x       NEAR TARGET
ONNX Speedup        -           4.02x       FAST
Cache Latency       < 5 ms      2.36 ms     INSTANT
Model Load          < 3 s       < 2 s       READY
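
For readers who want to reproduce numbers like these on their own hardware, a timing harness might look like the sketch below. It is not the repository's benchmark script; `median_latency_ms` and `speedup` are hypothetical helpers, and the callables being timed are left to the reader.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(fn: Callable[[], object], runs: int = 50, warmup: int = 5) -> float:
    """Time repeated calls to fn and return the median latency in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not measured (caches, lazy initialization)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if candidate_ms <= 0:
        raise ValueError("candidate_ms must be positive")
    return baseline_ms / candidate_ms
```

Timing one PyTorch inference callable and one ONNX Runtime callable with `median_latency_ms`, then passing both medians to `speedup`, yields a figure comparable to the ONNX Speedup row. The median is used because a single slow run skews it less than it would skew the mean.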

Tech Stack

Framework: Python 3.12, FastAPI
ML Core: PyTorch, Torchvision, ONNX Runtime
Async/Queues: Celery, Redis
Database: PostgreSQL (SQLAlchemy + Pydantic)
Storage: MinIO (AWS S3 Compatible)
Environment: UV (Package Manager), Docker

Getting Started

Prerequisites

Docker & Docker Compose
Python 3.12+ (if running locally without Docker)
uv (recommended)

1. Clone & Setup

git clone https://github.com/nice-bills/pulmoscan.git
cd pulmoscan

# Install dependencies (local dev)
uv sync

2. Start Infrastructure (Docker)

Start the Database, Redis, and MinIO services:

docker-compose up -d

3. Run the System

You can run the API and Worker separately for easier debugging:

Terminal 1 (API):

uv run uvicorn app.main:app --reload --port 8000

Terminal 2 (Celery Worker):

# Windows
uv run celery -A app.workers.celery_app worker --loglevel=info -P threads --concurrency=4

# Linux/Mac
uv run celery -A app.workers.celery_app worker --loglevel=info --concurrency=4

Usage Guide

1. Classify a Single Image

Endpoint: POST /api/v1/jobs/classify

curl -X POST "http://localhost:8000/api/v1/jobs/classify" \
     -H "accept: application/json" \
     -H "Content-Type: multipart/form-data" \
     -F "file=@/path/to/xray.png"
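
The same request can be made from Python using only the standard library. This is a sketch under assumptions: `encode_multipart` and `classify_image` are hypothetical helper names, and since the response schema is not documented here, the parsed JSON is returned as-is.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple


def encode_multipart(field: str, filename: str, payload: bytes) -> Tuple[bytes, str]:
    """Build a single-file multipart/form-data body and its Content-Type header."""
    boundary = uuid.uuid4().hex
    part_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {part_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def classify_image(path: str, base_url: str = "http://localhost:8000") -> dict:
    """POST one image to the classify endpoint and return the parsed JSON reply."""
    image = Path(path)
    body, content_type = encode_multipart("file", image.name, image.read_bytes())
    request = urllib.request.Request(
        f"{base_url}/api/v1/jobs/classify",
        data=body,
        headers={"Content-Type": content_type, "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With the API running locally, `classify_image("/path/to/xray.png")` should be equivalent to the curl command above.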

2. Batch Processing (ZIP Upload)

Upload a ZIP file containing multiple images for background processing.

Endpoint: POST /api/v1/jobs/batch

curl -X POST "http://localhost:8000/api/v1/jobs/batch" \
     -F "file=@/path/to/batch_images.zip"

Returns a job_id to track progress.
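
On the worker side, handling a ZIP upload plausibly involves two steps: pulling the image files out of the archive and grouping them into fixed-size batches for inference. The sketch below illustrates both with the standard library; the accepted extensions, the helper names, and the batch size are assumptions rather than the project's actual values.

```python
import io
import zipfile
from typing import Iterable, Iterator, List, Tuple

# Hypothetical allow-list; the real service may accept other formats.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def iter_zip_images(zip_bytes: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (filename, content) for every image file inside a ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries
            if info.filename.lower().endswith(IMAGE_EXTENSIONS):
                yield info.filename, archive.read(info)


def chunked(items: Iterable, batch_size: int) -> Iterator[List]:
    """Group items into lists of at most batch_size elements, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch
```

Each batch from `chunked(iter_zip_images(data), batch_size=8)` could then be stacked into one tensor and passed through the model in a single call, which is where a batch speedup would come from.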

3. Export Results

Download the results of a completed batch job as a CSV file.

Endpoint: GET /api/v1/jobs/{job_id}/results/download
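
A downloaded export can be summarized with a few lines of Python. This is only a sketch: the CSV column names are not documented here, so the column to count is passed in by the caller, and `fetch_results_csv` and `count_by_column` are hypothetical helper names.

```python
import csv
import io
import urllib.request
from collections import Counter


def fetch_results_csv(job_id: str, base_url: str = "http://localhost:8000") -> str:
    """Download the CSV export for a completed batch job as text."""
    url = f"{base_url}/api/v1/jobs/{job_id}/results/download"
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def count_by_column(csv_text: str, column: str) -> Counter:
    """Count how often each value appears in the given CSV column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)
```

For example, if the export turns out to have a column holding the predicted class, passing that column name to `count_by_column` gives a quick breakdown of predictions per class.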

Project Structure

pulmoscan/
├── app/
│   ├── api/            # API Routes (Jobs, Health)
│   ├── services/       # Core Logic (Model, Cache, Storage)
│   ├── workers/        # Celery Tasks
│   ├── models.py       # DB Models
│   └── main.py         # App Entrypoint
├── data/               # Local data storage (Gitignored)
├── docker/             # Docker configs
├── models/             # Trained .pth and .onnx models
├── scripts/            # Utils (Train, Export, Benchmark)
├── tests/              # Pytest suite
├── docker-compose.yml  # Dev Infrastructure
├── pyproject.toml      # Dependencies
└── README.md           # You are here

License

This project is licensed under the MIT License.
