Cache-aware streaming ASR on Modal with H100.

Deploy:
    modal deploy nemo_asr_modal.py

Protocol:
- Connect to wss://<modal-url>/ws
- Wait for READY
- Send audio chunks (16 kHz, 16-bit PCM)
- Receive JSON: {"text": "...", "is_final": false}
- Send END when done
- Receive final response with is_final: true
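
The protocol steps above can be sketched as a small client. This is a minimal illustration, not the deployed service's reference client. It assumes that READY and END travel as text frames, that audio is mono and sent as binary frames, and that a 100 ms chunk size is acceptable; none of these are stated above. The names CHUNK_BYTES, pcm_chunks, and stream_audio are hypothetical.

```python
import asyncio
import json

# Assumption: 100 ms of 16 kHz, 16-bit mono PCM = 16000 * 2 * 0.1 bytes.
CHUNK_BYTES = 3200


def pcm_chunks(pcm: bytes, chunk_bytes: int = CHUNK_BYTES):
    """Yield consecutive slices of raw PCM bytes; the last may be shorter."""
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


async def stream_audio(ws, pcm: bytes) -> str:
    """Run the protocol over any object with async send() and recv().

    Returns the text of the first message whose is_final field is true.
    """
    # Step 1: wait for the server's READY signal (assumed to be a text frame).
    greeting = await ws.recv()
    if greeting != "READY":
        raise RuntimeError(f"expected READY, got {greeting!r}")

    # Step 2: send the audio as binary chunks.
    for chunk in pcm_chunks(pcm):
        await ws.send(chunk)

    # Step 3: signal end of audio (assumed to be the text frame "END").
    await ws.send("END")

    # Step 4: read JSON responses, skipping partials, until the final one.
    while True:
        message = json.loads(await ws.recv())
        if message.get("is_final"):
            return message["text"]
```

With the third-party websockets package, the function could be driven by opening the connection with `async with websockets.connect("wss://<modal-url>/ws") as ws:` and awaiting `stream_audio(ws, pcm)` inside that block. The sketch sends all audio before reading any responses, which keeps it short. A live client would more likely read partial results concurrently with sending.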