Realtime Multimodal Agent Service
The project consists of three main components:
- AI SDK (WebRTC): Runs on the client. It captures audio and video streams, transports them over WebRTC, and handles client-side processing such as audio/video encoding and preliminary inference.
- WebRTC Gateway: Manages signaling, handles NAT/firewall traversal, and forwards media streams. It also load-balances traffic to the AI Service.
- AI Service: Provides real-time inference and data processing, including speech recognition, image recognition, real-time subtitle generation, speech synthesis, and real-time interaction with large models.
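
The way the three components hand data to one another can be sketched in a few lines of Python. This is an illustrative model only, not the project's code: the names `MediaPacket`, `AIService`, `Gateway`, and `client_capture` are hypothetical, inference is replaced by stub handlers, and signaling and NAT traversal are not modeled. The load-balancing policy shown (round-robin, with each session pinned to one backend) is an assumption; the project may balance load differently.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Callable, Dict, Iterator, List


@dataclass
class MediaPacket:
    """An encoded audio or video chunk from the client SDK (hypothetical shape)."""
    session_id: str
    kind: str        # "audio" or "video"
    payload: bytes


class AIService:
    """Stand-in for one AI Service instance; real inference is replaced by stubs."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Map each media kind to a placeholder handler.
        self._handlers: Dict[str, Callable[[bytes], str]] = {
            "audio": lambda data: f"transcript({len(data)} bytes)",
            "video": lambda data: f"labels({len(data)} bytes)",
        }

    def infer(self, packet: MediaPacket) -> str:
        handler = self._handlers[packet.kind]
        return f"{self.name}:{handler(packet.payload)}"


class Gateway:
    """Stand-in for the WebRTC Gateway: forwards packets to AI Service backends."""

    def __init__(self, backends: List[AIService]) -> None:
        self._next_backend: Iterator[AIService] = cycle(backends)
        self._sessions: Dict[str, AIService] = {}

    def forward(self, packet: MediaPacket) -> str:
        # Assumed policy: pick a backend round-robin the first time a session
        # is seen, then keep sending that session to the same backend.
        if packet.session_id not in self._sessions:
            self._sessions[packet.session_id] = next(self._next_backend)
        return self._sessions[packet.session_id].infer(packet)


def client_capture(session_id: str) -> List[MediaPacket]:
    """Stand-in for the AI SDK: returns pre-encoded packets instead of real capture."""
    return [
        MediaPacket(session_id, "audio", b"\x00" * 160),
        MediaPacket(session_id, "video", b"\x00" * 1200),
    ]


if __name__ == "__main__":
    gateway = Gateway([AIService("ai-0"), AIService("ai-1")])
    for sid in ("s1", "s2"):
        for packet in client_capture(sid):
            print(sid, packet.kind, "->", gateway.forward(packet))
```

Pinning a session to one backend keeps any per-session state (for example, a running transcript) on a single instance, at the cost of less even load when session sizes differ.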