Whisper_Azure is a production-grade setup for deploying WhisperX (OpenAI Whisper + word alignment) using Azure Container Instances (ACI) with GPU acceleration.
The project exposes fast, on-demand speech-to-text transcription through a REST API. It is packaged with Docker, so the same image runs locally and scales on Azure.
✅ Ideal for teams needing scalable audio transcription without maintaining persistent servers.
- 🎙️ High-quality transcription with WhisperX
- 📁 Upload audio files (MP3, WAV, etc.) for transcription
- 🧠 Word-level alignment and diarization support
- ☁️ Hosted on Azure Container Instances with GPU
- 🔐 Secured with Azure Identity + Blob Storage
- 🐳 Docker-based for reproducible local/remote deployment
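
To make the transcription and word-alignment features concrete, here is a minimal sketch of what the WhisperX step could look like. It is not the project's actual `transcribe.py`: the function names `transcribe_with_alignment` and `flatten_words`, the `large-v2` model, the `float16` compute type, and the batch size are all assumptions, and diarization is left out. The result shape (segments that each carry a `words` list) is assumed from WhisperX's aligned output and may vary between versions.

```python
from typing import Any


def transcribe_with_alignment(audio_path: str, device: str = "cuda") -> dict[str, Any]:
    """Transcribe one file, then align it to get word-level timestamps."""
    # Imported lazily so flatten_words() below works without WhisperX installed.
    import whisperx

    # Model size, compute type, and batch size are illustrative choices.
    model = whisperx.load_model("large-v2", device, compute_type="float16")
    audio = whisperx.load_audio(audio_path)
    result = model.transcribe(audio, batch_size=16)

    # Alignment uses a separate, language-specific model.
    align_model, metadata = whisperx.load_align_model(
        language_code=result["language"], device=device
    )
    return whisperx.align(result["segments"], align_model, metadata, audio, device)


def flatten_words(result: dict[str, Any]) -> list[tuple[str, float, float]]:
    """Collect (word, start, end) tuples from an aligned result."""
    words = []
    for segment in result.get("segments", []):
        for word in segment.get("words", []):
            # Skip tokens that carry no timestamps (assumed possible for
            # tokens the aligner could not place, such as numerals).
            if "start" in word and "end" in word:
                words.append((word["word"], word["start"], word["end"]))
    return words
```

`flatten_words` is a pure helper, so it can be unit-tested on a hand-written dictionary without a GPU or a model download.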
| Layer | Technology |
|---|---|
| Transcription | WhisperX (Python) |
| API | FastAPI |
| Deployment | Docker + Azure CLI |
| Storage | Azure Blob Storage |
| Auth | Azure Managed Identity |
| Trigger | HTTP endpoint (REST API) |
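
The storage and auth rows of the stack fit together roughly as sketched below. This is an illustration under assumptions, not the project's `utils.py`: the helper names `blob_account_url` and `upload_transcript` are hypothetical, and `DefaultAzureCredential` is used here because it picks up a managed identity inside Azure and a developer login (for example `az login`) on a workstation. The identity is assumed to hold a data role on the storage account, such as Storage Blob Data Contributor.

```python
def blob_account_url(account_name: str) -> str:
    """Build the public-cloud Blob endpoint for a storage account."""
    return f"https://{account_name}.blob.core.windows.net"


def upload_transcript(account_name: str, container: str, blob_name: str, text: str) -> None:
    """Upload a transcript using whatever identity the environment provides."""
    # Imported lazily so this module loads without the Azure SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    # No connection string or account key: the credential object obtains
    # tokens from the managed identity (or a local developer login).
    credential = DefaultAzureCredential()
    service = BlobServiceClient(
        account_url=blob_account_url(account_name), credential=credential
    )
    blob = service.get_blob_client(container=container, blob=blob_name)
    blob.upload_blob(text.encode("utf-8"), overwrite=True)
```

A call such as `upload_transcript("mystorage", "transcripts", "call-001.json", payload)` would then write the blob; the account, container, and blob names in that call are placeholders.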
whisper_api/
├── app/
│ ├── main.py # FastAPI entrypoint
│ ├── transcribe.py # WhisperX logic
│ └── utils.py # Helpers (storage, config)
├── Dockerfile
├── requirements.txt
├── azure/
│ └── deploy.sh # ACI deployment script
└── README.md

- Azure CLI (`az`)
- Docker
- An active Azure subscription
- Python 3.10+
- WhisperX installed locally (for dev mode)
cd whisper_api
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload
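
Once the development server is running, a small client can send it an audio file. The sketch below uses only the standard library. The `/transcribe` route and the `file` form field are assumptions, since the actual route names are defined in `app/main.py`; the helper names `build_multipart` and `transcribe_file` are likewise hypothetical.

```python
import json
import urllib.request
import uuid
from pathlib import Path

# Hypothetical route; check app/main.py for the real path.
API_URL = "http://127.0.0.1:8000/transcribe"


def build_multipart(field: str, filename: str, data: bytes, boundary: str) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content type)."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe_file(path: str, url: str = API_URL) -> dict:
    """POST an audio file and return the decoded JSON response."""
    audio = Path(path)
    body, content_type = build_multipart(
        "file", audio.name, audio.read_bytes(), uuid.uuid4().hex
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

By default `uvicorn` listens on `127.0.0.1:8000`, and FastAPI serves interactive documentation at `/docs`, which is the quickest way to confirm the real route and field names before calling `transcribe_file("sample.wav")`.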