[Feature Request] MONAI + FastAPI Inference Deployment Tutorial #2048

@engmohamedsalah

Summary

This issue proposes a tutorial that demonstrates how to deploy MONAI inference models as REST APIs using FastAPI.

Motivation

The MONAI tutorials repository currently includes deployment examples for BentoML, Ray, and Triton, but none for FastAPI, one of the most popular modern Python web frameworks for building APIs.

Why FastAPI?

  • Fast, modern, and intuitive Python web framework
  • Automatic API documentation generation (Swagger UI)
  • Async/await support for efficient I/O operations
  • Type hints and validation with Pydantic
  • Widely adopted in the ML deployment community
  • Lower barrier to entry than specialized ML serving frameworks

Use Case:
Many users need to:

  • Deploy MONAI models in production environments
  • Integrate MONAI models into web applications
  • Expose models via REST APIs for remote inference
  • Create accessible endpoints for medical imaging applications

This tutorial would fill a significant gap and provide a practical, production-ready deployment pattern.


Proposed Tutorial Content

What It Will Cover

  1. Model Setup

    • Load a pre-trained MONAI model bundle (spleen_ct_segmentation)
    • Understand bundle structure and components
  2. FastAPI Application

    • Create REST API with multiple endpoints:
      • GET /health - Health check
      • POST /predict - Inference endpoint
      • GET /docs - Auto-generated API documentation
    • Handle medical image uploads (NIfTI, DICOM formats)
    • Implement proper error handling and validation
    • Return predictions in standardized JSON format
  3. Best Practices

    • Singleton pattern for model loading (efficiency)
    • Async/await for I/O operations
    • Request/response validation with Pydantic
    • CORS configuration
    • Logging and monitoring
  4. Docker Deployment

    • Dockerfile with optimization (layer caching, etc.)
    • docker-compose for local development
    • Container best practices
  5. Testing & Usage

    • Unit tests for API endpoints
    • Example requests (curl, Python client)
    • Integration testing
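Two of the items above, loading the model once per process and rejecting unsupported uploads early, can be sketched without any web framework. This is a minimal illustration, not the proposed implementation: `_load_model` is a hypothetical placeholder for the real MONAI bundle loading step, and the accepted suffixes are an assumption based on the NIfTI and DICOM formats mentioned in the list.

```python
from functools import lru_cache
from typing import Any, Callable, Dict

# Counts how often the expensive load actually runs (for illustration only).
LOAD_CALLS: Dict[str, int] = {"count": 0}

# Assumed suffixes for NIfTI and DICOM uploads; adjust to the formats supported.
ALLOWED_SUFFIXES = (".nii", ".nii.gz", ".dcm")


def _load_model() -> Callable[[Any], Dict[str, Any]]:
    """Hypothetical placeholder for loading the MONAI bundle and its weights."""
    LOAD_CALLS["count"] += 1
    return lambda image: {"status": "ok", "received": image}


@lru_cache(maxsize=1)
def get_model() -> Callable[[Any], Dict[str, Any]]:
    """Return the shared model instance, loading it on first use only."""
    return _load_model()


def is_supported_upload(filename: str) -> bool:
    """Check the file name suffix before reading or parsing the upload."""
    return filename.lower().endswith(ALLOWED_SUFFIXES)


if __name__ == "__main__":
    first = get_model()
    second = get_model()
    print(first is second, LOAD_CALLS["count"])
    print(is_supported_upload("scan.nii.gz"), is_supported_upload("scan.png"))
```

In a FastAPI application, a function like `get_model` could be passed to `Depends` so that every request handler shares the same instance. Note that `lru_cache` may run the loader more than once if several threads make the first call at the same time; loading the model at application startup avoids that.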

Deliverables

The tutorial will include:

📁 File Structure

```
tutorials/deployment/fastapi_inference/
├── README.md                    # Complete tutorial guide
├── requirements.txt             # All dependencies
├── app/
│   ├── __init__.py
│   ├── main.py                  # FastAPI application
│   ├── model_loader.py          # MONAI bundle loading (singleton)
│   ├── inference.py             # Inference pipeline
│   └── schemas.py               # Pydantic models for validation
├── tests/
│   ├── test_api.py              # Unit tests
│   └── sample_image.nii.gz      # Test data
├── docker/
│   ├── Dockerfile               # Optimized container
│   └── docker-compose.yml       # Development setup
├── notebooks/
│   └── fastapi_tutorial.ipynb   # Interactive walkthrough
└── examples/
    ├── sample_requests.http     # Example API calls
    └── client.py                # Python client example
```

✅ Specific Deliverables

  • Complete, working FastAPI application
  • Jupyter notebook with step-by-step walkthrough
  • Comprehensive README with setup instructions
  • Docker deployment configuration
  • Unit tests with pytest
  • Example client code (Python + curl)
  • Sample test images and expected outputs
  • API usage documentation

Target Audience

  • ML engineers deploying MONAI models to production
  • Backend developers integrating medical AI into applications
  • DevOps teams packaging ML services
  • Researchers sharing models as accessible APIs
  • Healthcare application developers

Tutorial Learning Objectives

After completing this tutorial, users will be able to:

  1. Deploy a MONAI model bundle as a REST API
  2. Handle medical image uploads via HTTP
  3. Implement proper error handling and validation
  4. Containerize the application with Docker
  5. Test the API with various clients
  6. Understand production deployment considerations
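For objectives 2 and 5, a client has to send the image as a `multipart/form-data` upload. The sketch below builds such a request with the standard library only, so it runs without extra packages. The URL `http://localhost:8000/predict` and the form field name `file` are assumptions for illustration; the real values would depend on how the tutorial defines the endpoint.

```python
import urllib.request
import uuid
from typing import Tuple


def build_multipart(field: str, filename: str, payload: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_predict_request(url: str, filename: str, payload: bytes) -> urllib.request.Request:
    """Prepare (but do not send) a POST request that uploads one image file."""
    # "file" is an assumed form field name, not one defined by the proposal.
    body, content_type = build_multipart("file", filename, payload)
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder bytes stand in for the contents of a real NIfTI file.
    request = build_predict_request(
        "http://localhost:8000/predict", "sample_image.nii.gz", b"\x00\x01\x02"
    )
    print(request.get_method(), request.full_url, len(request.data))
```

Once a server is running, passing the prepared request to `urllib.request.urlopen` would send it; third-party HTTP clients handle the multipart encoding automatically and would usually be the more convenient choice in the tutorial itself.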

Relation to Existing Tutorials

Complements, doesn't duplicate:

  • MONAI Deploy SDK: Different focus (packaged deployment vs. simple REST API)
  • BentoML/Ray/Triton: Different frameworks (FastAPI is more general-purpose)
  • Fills gap: No current tutorial for lightweight REST API deployment

Similar to:

  • Deployment tutorials in the PyTorch and TensorFlow Serving documentation
  • Follows pattern of other ML framework API deployment guides

Implementation Plan

I'm willing to contribute this tutorial with the following timeline:

Weeks 1-2: Core implementation

  • FastAPI application structure
  • Model loading and inference pipeline
  • Basic endpoints with testing
  • Docker configuration

Week 3: Documentation & polish

  • Jupyter notebook creation
  • README with detailed instructions
  • Code comments and docstrings
  • Example usage documentation

Week 4: Review & iteration

  • Address maintainer feedback
  • Add requested features
  • Final testing across platforms
  • PR submission

Total estimated time: 20-30 hours over 3-4 weeks


Questions for Maintainers

Before starting, I'd appreciate guidance on:

  1. Model Selection: Is spleen_ct_segmentation the best choice, or would you recommend a different bundle?

  2. Folder Location: Should this go under tutorials/deployment/fastapi_inference/ or a different location?

  3. Scope: Any specific features you'd like included beyond what's proposed?

  4. Style: Any MONAI-specific patterns or conventions I should follow for FastAPI implementations?

  5. Testing: What level of test coverage do you expect for tutorials?


Technical Details

Dependencies:

  • FastAPI (latest stable)
  • Uvicorn (ASGI server)
  • MONAI (latest stable)
  • PyTorch
  • python-multipart (for file uploads)
  • Pydantic (for validation)
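Since no versions are pinned here, a small check like the following could help readers confirm that the listed packages are installed before starting. It is only a sketch: the distribution names are the usual PyPI names for these projects (for example `torch` for PyTorch), which is an assumption because the proposal does not spell them out.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Assumed PyPI distribution names for the dependencies listed above.
REQUIRED = ("fastapi", "uvicorn", "monai", "torch", "python-multipart", "pydantic")


def installed_versions(names: Iterable[str] = REQUIRED) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'MISSING'}")
```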

To be tested on:

  • Python 3.9+
  • Linux/macOS/Windows (via Docker)

License:

  • Apache 2.0 (matching MONAI)
  • All code will include proper copyright headers

Benefits to MONAI Community

  1. Accessibility: Makes MONAI models more accessible via standard REST APIs
  2. Best Practices: Demonstrates modern Python API development patterns
  3. Production-Ready: Provides deployment-ready example code
  4. Popular Framework: FastAPI is widely known and adopted
  5. Documentation: Auto-generated API docs (Swagger UI)
  6. Easy Integration: Standard HTTP interface works with any client

Conclusion

This tutorial would provide significant value to the MONAI community by:

  • Filling an existing gap in deployment options
  • Demonstrating a popular, modern deployment pattern
  • Providing production-ready, tested code
  • Making MONAI models more accessible

I'm committed to delivering a high-quality tutorial that meets MONAI standards and am happy to iterate based on maintainer feedback.

Please let me know if this proposal is acceptable and if you have any suggestions or requirements!

Thank you for considering this contribution!
