A secure, enterprise-grade voice assistant solution powered by Amazon Nova Sonic, with WebRTC connectivity through Daily, and fully hosted on AWS infrastructure.
- Voice AI Assistant
The Voice AI Assistant is a comprehensive speech-to-speech conversational AI system designed specifically for enterprise environments. It leverages cutting-edge technologies to provide a secure, scalable, and feature-rich voice interaction experience.
- Amazon Nova Sonic Integration: Uses Amazon's speech-to-speech foundation model for natural, contextual conversations
- Real-time Communication: WebRTC connectivity through Daily's transport layer enables high-quality, low-latency audio streaming
- Enterprise Security: All data remains within your AWS environment, with no third-party data processing
- Secure Recording Storage: Voice recordings stored in designated S3 buckets with comprehensive access controls
- Containerized Deployment: Fully containerized solution for consistent deployment across environments
- Scalable Infrastructure: AWS-native architecture designed to scale with your organization's needs
The Voice AI Assistant is built on a modern, distributed architecture that enables real-time bidirectional voice communication between users and an AI assistant. The system consists of three main components:
This project includes the following diagrams to help understand the system architecture:
- High-Level AWS Architecture: Shows the overall AWS services used in the solution including Fargate, CloudFront, S3, and Amazon Bedrock.
- Component Diagram: Illustrates the relationships between client components, Daily.co rooms, and backend services.
- Data Flow Diagram: Demonstrates how information flows through the system, from user input through WebRTC to backend processing and back to the user.
The server implementation provides the foundation for the speech-to-speech system:
- Built with FastAPI for high-performance API endpoints
- Manages client connections and authentication
- Creates Daily rooms for WebRTC communication
- Integrates with Amazon Nova Sonic for AI processing
- Handles recording management and storage
Key server endpoints:
- /connect: Creates a Daily room and returns connection credentials
- /health: Health check endpoint for monitoring
- /status/{pid}: Gets the status of a specific bot process
For detailed implementation, see the Backend Architecture section.
The client implementation establishes and manages the WebRTC connection:
- Uses the RTVI client library for standardized communication
- Connects to the server's /connect endpoint to obtain room credentials
- Establishes a WebRTC connection via Daily transport
- Handles connection state changes and reconnection logic
- Provides a simple but effective user interface
For detailed implementation, see the Frontend Architecture section.
The entire solution is deployed on AWS, leveraging various services:
- Compute: AWS Fargate for containerized applications
- Backend Service: X86_64 architecture with 2048 MB memory and 1024 CPU units (1 vCPU)
- Frontend Service: ARM64 architecture with 512 MB memory and 256 CPU units (0.25 vCPU)
- Networking: Application Load Balancers, VPC, security groups
- Storage: S3 buckets for recording storage
- Security: IAM roles, Parameter Store for secrets
- AI Processing: Amazon Bedrock with Nova Sonic model
- S3 Direct Access: Secure access to recordings stored in S3 buckets
The architecture choice of X86_64 for the backend ensures compatibility with specialized audio processing libraries, while the ARM64 architecture for the frontend optimizes for cost-efficiency. This hybrid approach balances performance requirements with operational costs.
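Fargate task sizes are expressed in CPU units, where 1024 units correspond to one vCPU. The following sketch converts the task sizes listed above; the helper name and the `TASK_SIZES` dictionary are illustrative and are not part of the project's CDK code.

```python
# Fargate task definitions express CPU in units: 1024 units = 1 vCPU.
CPU_UNITS_PER_VCPU = 1024


def cpu_units_to_vcpu(cpu_units: int) -> float:
    """Convert Fargate CPU units to vCPUs."""
    return cpu_units / CPU_UNITS_PER_VCPU


# Task sizes taken from the list above (illustrative structure).
TASK_SIZES = {
    "backend": {"arch": "X86_64", "cpu_units": 1024, "memory_mb": 2048},
    "frontend": {"arch": "ARM64", "cpu_units": 256, "memory_mb": 512},
}

for name, size in TASK_SIZES.items():
    vcpu = cpu_units_to_vcpu(size["cpu_units"])
    print(f"{name}: {vcpu} vCPU, {size['memory_mb']} MB, {size['arch']}")
```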
For deployment instructions, see the Deployment section.
The Voice AI Assistant is designed with enterprise security requirements in mind:
- All data processing occurs within your AWS environment
- No data is sent to third-party services outside of AWS
- End-to-end encryption for all communications
- Secure storage of recordings in designated S3 buckets
- Fine-grained IAM permissions for all components
- Role-based access control for recordings and system management
- API Gateway with proper authentication for accessing recordings
- Secure parameter storage for sensitive configuration
- All components deployed within private subnets
- Traffic secured via HTTPS/TLS
- Web application firewall protection
- Network ACLs and security groups to restrict traffic
- Architecture designed to support compliance requirements
- Audit logging for all system activities
- Data retention policies configurable to meet regulatory needs
- Encryption at rest and in transit for all sensitive data
Voice recordings are a critical component of the system, with comprehensive management features:
- Storage: All recordings stored in designated S3 buckets using Daily's cloud recording service
- Access: Direct access to S3 with appropriate IAM permissions and policies
- Organization: Recordings organized by session, date, and user with hierarchical folder structure
- Retention: Configurable retention policies to meet compliance requirements
- Playback: Web interface for browsing and playing recordings with streaming capabilities
- Download: Secure download capabilities for authorized users
The recordings functionality allows users to:
- Access recordings stored in the S3 bucket using AWS SDK or pre-signed URLs
- Browse and manage recordings with appropriate authentication and authorization
- Stream audio and video files directly from S3
- Download files with proper content-type headers for different file types
This implementation maintains security through IAM policies and role-based access control while allowing direct access to recordings stored in S3 buckets.
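As a rough sketch of the organization and download behavior described above, the helpers below build a hierarchical object key and pick a Content-Type for a file. The key layout (`<date>/<user>/<session>/<file>`) and both function names are assumptions made for illustration; the actual layout used by the project is not specified here.

```python
import mimetypes
from datetime import date


def recording_key(session_id: str, day: date, username: str, filename: str) -> str:
    """Build a hierarchical S3 key (hypothetical layout: date/user/session/file)."""
    return f"{day.isoformat()}/{username}/{session_id}/{filename}"


def content_type_for(filename: str) -> str:
    """Guess the Content-Type for a download, falling back to a binary type."""
    guessed, _ = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


key = recording_key("session-123", date(2025, 6, 30), "alice", "recording.mp4")
print(key, content_type_for(key))
```

A key built this way could then be handed to the AWS SDK to fetch the object or to create a pre-signed URL.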
The backend is implemented as a FastAPI server that manages the core functionality through a sophisticated pipeline architecture. The server directory contains the following key components:
The main FastAPI server implementation provides the foundation for the speech-to-speech system:
- FastAPI Framework: High-performance API endpoints with automatic OpenAPI documentation
- Daily Room Management: Creates and configures Daily rooms with appropriate parameters
- Bot Process Management: Spawns and monitors Amazon Nova Sonic bot instances as separate processes
- Connection Credentials: Generates and provides secure tokens for WebRTC connections
- CORS Configuration: Configured to allow cross-origin requests for frontend integration
- Lifespan Management: Handles startup and shutdown tasks, including resource cleanup
Key server endpoints:
@app.post("/connect")
async def rtvi_connect(request: Request) -> Dict[Any, Any]:
    """RTVI connect endpoint that creates a room and returns connection credentials."""
    # Creates a Daily room with recording capabilities
    # Starts the bot process
    # Returns authentication bundle with room_url and token

@app.get("/status/{pid}")
def get_status(pid: int):
    """Get the status of a specific bot process."""
    # Returns running status of the specified bot process

@app.get("/health")
async def health_check():
    """Health check endpoint."""
    # Returns server status and timestamp
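One way the process tracking behind `/status/{pid}` could work is sketched below with the standard library only. The registry `bot_procs` and the functions `start_bot` and `bot_status` are hypothetical names; the server's actual bookkeeping may differ.

```python
import subprocess
import sys
from typing import Dict, List

# Maps a process ID to its Popen handle (illustrative in-memory registry).
bot_procs: Dict[int, subprocess.Popen] = {}


def start_bot(args: List[str]) -> int:
    """Spawn a bot as a separate process and remember its handle."""
    proc = subprocess.Popen(args)
    bot_procs[proc.pid] = proc
    return proc.pid


def bot_status(pid: int) -> str:
    """Return 'running', 'finished', or 'unknown' for a tracked process."""
    proc = bot_procs.get(pid)
    if proc is None:
        return "unknown"
    # poll() returns None while the process is still alive.
    return "running" if proc.poll() is None else "finished"


# Stand-in for a real bot script: a Python process that exits immediately.
pid = start_bot([sys.executable, "-c", "pass"])
bot_procs[pid].wait()
print(bot_status(pid))
```

Keeping the `Popen` handle rather than only the PID lets the server terminate remaining bots cleanly on shutdown.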
The bot implementation handles the core AI functionality through a pipeline architecture:
- Pipecat Framework: Modular pipeline for audio processing and AI integration
- Daily Transport: Establishes bidirectional WebRTC connections
- Voice Activity Detection: Uses SileroVADAnalyzer for precise speech detection
- Animation Processing: Visual feedback through sprite animation during conversation
- Event Handling: Comprehensive event system for client connections and recording management
# Pipeline architecture for audio processing and AI integration
pipeline = Pipeline(
    [
        transport.input(),               # WebRTC input stream
        rtvi,                            # RTVI event processor
        context_aggregator.user(),       # User context management
        llm,                             # Amazon Nova Sonic LLM service
        ta,                              # Talking animation processor
        transport.output(),              # WebRTC output stream
        context_aggregator.assistant(),  # Assistant context management
    ]
)
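The ordering of the list matters: each frame passes through the processors from top to bottom. The toy model below illustrates only that ordering. It is not the Pipecat API (real processors are asynchronous), and `run_pipeline`, `tag`, and the stage names are invented for this sketch.

```python
from typing import Callable, List

Frame = dict  # a frame is modeled as a plain dict in this sketch
Processor = Callable[[Frame], Frame]


def run_pipeline(processors: List[Processor], frame: Frame) -> Frame:
    """Pass a frame through each processor in list order."""
    for process in processors:
        frame = process(frame)
    return frame


def tag(stage: str) -> Processor:
    """Create a processor that records the stage name it passed through."""
    def process(frame: Frame) -> Frame:
        return {**frame, "path": frame.get("path", []) + [stage]}
    return process


stages = ["input", "rtvi", "user_context", "llm",
          "animation", "output", "assistant_context"]
result = run_pipeline([tag(s) for s in stages], {"audio": b""})
print(result["path"])
```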
The backend integrates with Amazon Nova Sonic for advanced voice AI capabilities:
- AWSNovaSonicLLMService: Custom service implementation for Nova Sonic integration
- Voice Configuration: Configurable voice ID selection (default: "tiffany")
- AWS Authentication: Secure access using AWS credentials
- Bidirectional Audio Streaming: Real-time audio processing for conversation
- Context Management: Maintains conversation context for natural interactions
# Amazon Nova Sonic LLM service initialization
llm = AWSNovaSonicLLMService(
    secret_access_key=NOVA_AWS_SECRET_ACCESS_KEY,
    access_key_id=NOVA_AWS_ACCESS_KEY_ID,
    region=os.getenv("NOVA_AWS_REGION", "us-east-1"),
    voice_id=os.getenv("NOVA_VOICE_ID", "tiffany"),  # Options: matthew, tiffany, amy
    send_transcription_frames=True
)
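Because the voice ID and region come from environment variables, a typo would only surface when the bot starts. A small validation step, sketched here, could catch it earlier. `load_nova_settings` and `SUPPORTED_VOICES` are hypothetical names; the voice list is the one given in the comment above.

```python
from typing import Mapping

# Voice options as listed in the configuration comment above.
SUPPORTED_VOICES = ("matthew", "tiffany", "amy")


def load_nova_settings(env: Mapping[str, str]) -> dict:
    """Read region and voice ID from an environment mapping, applying the documented defaults."""
    region = env.get("NOVA_AWS_REGION", "us-east-1")
    voice_id = env.get("NOVA_VOICE_ID", "tiffany")
    if voice_id not in SUPPORTED_VOICES:
        raise ValueError(
            f"Unsupported NOVA_VOICE_ID {voice_id!r}; expected one of {SUPPORTED_VOICES}"
        )
    return {"region": region, "voice_id": voice_id}


print(load_nova_settings({"NOVA_VOICE_ID": "amy"}))
```

In the server this would typically be called with `os.environ`.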
The backend uses Daily's WebRTC transport for high-quality, low-latency audio streaming:
- DailyTransport: Handles WebRTC connection establishment and management
- Room Creation: Dynamically creates Daily rooms with appropriate configurations
- Token Generation: Secure token generation for authenticated connections
- Audio/Video Parameters: Configurable parameters for audio/video quality
- Event Handlers: Comprehensive event system for connection state management
# Daily transport configuration
transport = DailyTransport(
    room_url,
    token,
    "Chatbot",
    DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
        video_in_enabled=True,
        video_out_enabled=True,
        video_out_width=1024,
        video_out_height=576,
        vad_analyzer=SileroVADAnalyzer(),
        transcription_enabled=True,
    ),
)
The backend includes a comprehensive recording management system:
- S3 Integration: Recordings stored in designated S3 buckets
- Daily Recording Configuration: Configured during room creation
roomParams = DailyRoomParams(
    properties=DailyRoomProperties(
        enable_recording="cloud",  # Daily's cloud recording service
        recordings_bucket=RecordingsBucketConfig(
            bucket_name=os.getenv("RECORDING_S3_BUCKETNAME", ""),
            bucket_region=os.getenv("RECORDING_S3_REGION", ""),
            assume_role_arn=os.getenv("RECORDING_ASSUME_ROLE_ARN", ""),
        ),
        geo="ap-southeast-1"
    )
)
- Bot Process Management:
  - Bot processes run as separate Python processes for isolation
  - Process IDs are tracked for monitoring and management
  - Status endpoint (/status/{pid}) allows checking if a bot process is still running
  - Clean termination of bot processes on server shutdown
- Recording Control: Start/stop recording functionality tied to participant events
- Transcript Storage: Optional DynamoDB integration for transcript storage
- Conversation Tracking: Timestamps and user identification for all conversations
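The room configuration falls back to empty strings when the recording variables are unset, so a missing value would only show up when a recording fails. A startup check along these lines could report the problem earlier; `missing_recording_vars` is a hypothetical helper, not part of the server code.

```python
from typing import List, Mapping

# Environment variables read by the room configuration.
RECORDING_VARS = (
    "RECORDING_S3_BUCKETNAME",
    "RECORDING_S3_REGION",
    "RECORDING_ASSUME_ROLE_ARN",
)


def missing_recording_vars(env: Mapping[str, str]) -> List[str]:
    """Return the recording variables that are unset or blank."""
    return [name for name in RECORDING_VARS if not env.get(name, "").strip()]


example_env = {"RECORDING_S3_BUCKETNAME": "my-bucket", "RECORDING_S3_REGION": " "}
print(missing_recording_vars(example_env))
```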
The backend includes a sophisticated transcript handling system:
- Real-time Transcription: Captures and processes conversation transcripts
- DynamoDB Integration: Optional storage of transcripts in DynamoDB
- Frontend Communication: Sends transcript updates to the frontend via WebRTC
- Conversation Organization: Timestamps and role identification for all messages
# Store conversation in DynamoDB
async def store_conversation(self, message, username=None):
    """Store conversation in DynamoDB."""
    if not self.dynamodb_client:
        return
    try:
        username = username or self.username
        timestamp = datetime.now().isoformat()
        conversation_id = f"{timestamp}"
        item = {
            "username": username,
            "conversation_id": conversation_id,
            "timestamp": timestamp,
            "conversation": message
        }
        self.dynamodb_client.put_item(Item=item)
    except Exception as e:
        logger.error(f"Error storing conversation in DynamoDB: {e}")
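The item layout can be exercised without AWS access by separating item construction from the write. In this sketch, `build_conversation_item` and `InMemoryTable` are hypothetical stand-ins, and the timestamp is taken in UTC, which is an assumption that differs from the local-time call in the snippet above.

```python
from datetime import datetime, timezone
from typing import Optional


def build_conversation_item(username: str, message: str,
                            now: Optional[datetime] = None) -> dict:
    """Build the item to store; the timestamp doubles as the conversation ID."""
    now = now or datetime.now(timezone.utc)
    timestamp = now.isoformat()
    return {
        "username": username,
        "conversation_id": timestamp,
        "timestamp": timestamp,
        "conversation": message,
    }


class InMemoryTable:
    """Stand-in for a table object that exposes put_item(Item=...)."""

    def __init__(self) -> None:
        self.items = []

    def put_item(self, Item: dict) -> None:
        self.items.append(Item)


table = InMemoryTable()
fixed_time = datetime(2025, 6, 30, tzinfo=timezone.utc)
table.put_item(Item=build_conversation_item("alice", "Hello", fixed_time))
print(table.items[0]["conversation_id"])
```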
The backend provides API functions that can be called by the LLM:
- Function Schema: Defines available functions and their parameters
- Job Questions: Retrieves interview questions for specific job positions
- Date/Time Functions: Provides current date and time information
- Function Registration: Registers functions with the LLM service
# Function schema definition
job_questions_function = FunctionSchema(
name="get_job_questions",
description="Get a specific interview question for a job position.",
properties={
"position": {
"type": "string",
"description": "The job position title to get interview questions for",
},
},
required=["position"],
)
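The schema only declares the function; the lookup behind it is not shown in this document. A minimal sketch of what such a lookup might look like follows. The question bank `JOB_QUESTIONS` and its contents are invented for illustration, and the handler is a plain function rather than the callback signature the LLM service expects.

```python
from typing import Dict, List

# Hypothetical question bank; the real data source is not shown in this document.
JOB_QUESTIONS: Dict[str, List[str]] = {
    "software engineer": ["Describe a bug that was hard to track down."],
}


def get_job_questions(position: str) -> dict:
    """Return questions for a position, matching case-insensitively."""
    questions = JOB_QUESTIONS.get(position.strip().lower(), [])
    return {"position": position, "questions": questions}


print(get_job_questions("Software Engineer"))
```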
The frontend is a modern Vite-based web application that provides a responsive and intuitive user interface for interacting with the Voice AI Assistant. It leverages WebRTC technology through the Daily transport layer to establish real-time audio and video communication with the backend services.
The main client implementation establishes and manages the WebRTC connection:
- RTVIClient Class: Core client that handles the connection lifecycle

    this.rtviClient = new RTVIClient({
      transport: new DailyTransport(),
      params: {
        baseUrl: baseUrl,
        endpoints: {
          connect: '/connect',
        },
      },
      enableMic: true,
      enableCam: true,
      // Callbacks for connection events...
    });

- Connection Management: Handles connection establishment, state changes, and disconnection

    async connect() {
      // Initialize devices and establish connection
      await this.rtviClient.initDevices();
      await this.rtviClient.connect();
      this.videoManager.setupMediaTracks(this.botAudio);
    }

- Event Handling System: Comprehensive event listeners for connection states, bot interactions, and transcripts

    callbacks: {
      onConnected: () => { /* ... */ },
      onDisconnected: () => { /* ... */ },
      onTransportStateChanged: (state) => { /* ... */ },
      onBotConnected: (participant) => { /* ... */ },
      onUserTranscript: (data) => { /* ... */ },
      onBotTranscript: (data) => { /* ... */ },
    }
Handles all video-related functionality:
- Media Stream Management: Creates and manages video elements for both local user and bot
- Camera Controls: Provides UI controls for enabling/disabling the camera
- Track Handling: Sets up and manages WebRTC media tracks
    setupMediaTracks(botAudio) {
      const tracks = this.rtviClient.tracks();
      if (tracks.bot?.audio) {
        this.setupBotAudioTrack(tracks.bot.audio, botAudio);
      }
      if (tracks.bot?.video) {
        this.setupBotVideoTrack(tracks.bot.video);
      }
    }
- Error Handling: Comprehensive error handling for various failure scenarios
    // Handle track failures
    this.rtviClient.on(RTVIEvent.TrackError, (error, track) => {
      this.log(`Track error for ${track?.kind || 'unknown'}: ${error.message}`);
      // Implement recovery strategies based on error type
      if (error.name === 'NotAllowedError') {
        // Handle permission errors
        this.displayPermissionError();
      } else if (error.name === 'NotFoundError') {
        // Handle device not found errors
        this.handleDeviceNotFound(track?.kind);
      }
    });
- Audio Optimization: Low-latency audio configuration for responsive conversations
    // Audio setup for low-latency playback
    setupBotAudioTrack(track, audioElement) {
      // Route playback to the default output device where setSinkId is supported
      if (typeof audioElement.setSinkId === 'function') {
        audioElement.setSinkId('default').catch((err) => this.log(`setSinkId failed: ${err.message}`));
      }
      audioElement.preload = "auto"; // Preload audio data
      audioElement.defaultPlaybackRate = 1.0;
      audioElement.autoplay = true; // Start playing as soon as data is available
      // Attach the track to the audio element through a MediaStream
      audioElement.srcObject = new MediaStream([track]);
    }
Provides UI and functionality for camera control:
- Camera State Management: Toggles camera on/off and manages state
- Permission Handling: Manages camera permissions and displays appropriate error messages
- Stream Management: Creates and manages local video streams
The frontend integrates with the backend services through several mechanisms:
- WebRTC Connection via Daily Transport:
  - Establishes a WebRTC connection to the backend using the Daily transport layer
  - Connects to the /connect endpoint to obtain room credentials
  - Handles audio and video streaming in real-time

    // Initialize the RTVI client with a DailyTransport
    this.rtviClient = new RTVIClient({
      transport: new DailyTransport(),
      params: {
        baseUrl: baseUrl,
        endpoints: {
          connect: '/connect',
        },
      },
      // Additional configuration...
    });

- Amazon Nova Sonic Integration:
  - Receives audio streams from the Nova Sonic service via the backend
  - Handles real-time transcription updates from the LLM
  - Processes and displays bot responses
- Event-Based Communication:
  - Uses an event-driven architecture to handle various states and data flows
  - Processes transcription events for both user and bot speech

    onUserTranscript: (data) => {
      if (data.final) {
        this.log(`User: ${data.text}`);
      }
    },
    onBotTranscript: (data) => {
      this.log(`Bot: ${data.text}`);
    },
The frontend provides a clean and intuitive user interface:
- Video Containers:
  - Dual video display showing both the user and the AI assistant
  - Responsive layout that adapts to different screen sizes
- Control Panel:
  - Connect/Disconnect buttons to manage the session
  - Camera toggle controls for enabling/disabling video
  - Status indicators showing connection state
- Debug Panel:
  - Real-time logging of events and transcriptions
  - Connection status monitoring
  - Audio level statistics
- Recording Access Interface (Planned):
  - Interface for browsing and accessing recordings stored in S3
  - Media playback capabilities for audio and video recordings
  - Folder navigation and file download functionality
The frontend implements WebRTC connections via Daily transport with the following features:
- Media Track Management:
  - Automatic handling of audio and video tracks
  - Track event listeners for starting and stopping media streams

    this.rtviClient.on(RTVIEvent.TrackStarted, (track, participant) => {
      if (!participant?.local) {
        if (track.kind === 'audio') {
          this.setupBotAudioTrack(track, botAudio);
        } else if (track.kind === 'video') {
          this.setupBotVideoTrack(track);
        }
      }
    });

- Connection State Management:
  - Robust handling of connection state changes
  - Automatic reconnection logic
  - Graceful resource cleanup on disconnection
- Audio Optimization:
  - Low-latency audio configuration

    this.botAudio.setSinkId('default'); // Route playback to the default output device, where supported
    this.botAudio.preload = "auto";
    this.botAudio.defaultPlaybackRate = 1.0;

  - Real-time audio level monitoring
  - Voice activity detection integration
The frontend includes several important configuration settings:
- Environment Variables:
  - VITE_BASE_URL: Base URL for the backend API (default: "http://localhost:8000")
  - VITE_S3_PROXY_API_ENDPOINT: API Gateway endpoint for accessing recordings
- Docker Containerization:
  - Multi-stage build process for optimized production deployment
  - NGINX configuration for serving static assets with proper caching and security headers
  - Environment variable substitution at runtime via env.sh
- Build Configuration:
  - Vite-based build system for fast development and optimized production builds
  - PNPM package management for efficient dependency handling
- NGINX Configuration:
  - Optimized static asset serving with appropriate cache headers
  - Gzip compression for improved performance
  - Security headers for enhanced protection
  - Client-side routing support
The frontend is designed to be deployed as a containerized application, with the container image built using a multi-stage Dockerfile that optimizes for production performance and security.
This section provides detailed instructions for deploying all components of the Voice AI Assistant system.
- AWS Account with appropriate permissions for:
  - AWS CDK deployment
  - Amazon Bedrock access for Nova Sonic
  - S3, API Gateway, ECS/Fargate, and other required services
- AWS CLI installed and configured with access credentials

    aws configure
- Node.js (v16+) and npm for CDK deployment
- Python (v3.9+) for backend development
- Docker installed for containerized deployments
- Daily.co account with API access for WebRTC functionality
- Git for repository management
1. Clone the Repository

    git clone <repository-url>
    cd voice-ai-assistant

2. Create Environment Files

   Create a .env file in the server directory:

    cd server
    cp .env.example .env

   Edit the .env file with your specific configuration:

    # Daily API Configuration
    DAILY_API_KEY=<your-daily-api-key>
    DAILY_API_URL=https://api.daily.co/v1

    # AWS Nova Sonic Configuration
    NOVA_AWS_ACCESS_KEY_ID=<your-aws-access-key>
    NOVA_AWS_SECRET_ACCESS_KEY=<your-aws-secret-key>
    NOVA_AWS_REGION=us-east-1
    NOVA_VOICE_ID=tiffany

    # Recording Configuration
    RECORDING_S3_BUCKETNAME=<your-s3-bucket-name>
    RECORDING_S3_REGION=<your-s3-bucket-region>
    RECORDING_ASSUME_ROLE_ARN=<your-assume-role-arn>

3. Create Frontend Environment File

   Create a .env file in the vite-client directory:

    cd ../vite-client
    cp .env.example .env

   Edit the .env file with your specific configuration:

    VITE_BASE_URL=http://localhost:8000
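Before starting the server it can help to confirm that no `<your-...>` placeholder was left in a `.env` file. The sketch below parses simple `KEY=VALUE` lines and lists the keys that are empty or still hold a placeholder. `parse_env` and `unfilled_keys` are hypothetical helpers, and the parser deliberately ignores quoting and inline comments.

```python
import re
from typing import Dict, List

# Matches values that are still a template placeholder such as <your-daily-api-key>.
PLACEHOLDER = re.compile(r"^<.*>$")


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and full-line comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unfilled_keys(values: Dict[str, str]) -> List[str]:
    """Return keys whose value is empty or still a placeholder."""
    return [k for k, v in values.items() if not v or PLACEHOLDER.match(v)]


sample = """\
# Daily API Configuration
DAILY_API_KEY=<your-daily-api-key>
DAILY_API_URL=https://api.daily.co/v1
NOVA_AWS_REGION=us-east-1
"""
print(unfilled_keys(parse_env(sample)))
```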
1. Navigate to the CDK Directory

    cd ../cdk

2. Install Dependencies

    npm install

3. Build the CDK Project

    npm run build

4. Bootstrap CDK (First-time only)

    cdk bootstrap aws://<your-aws-account-id>/<your-aws-region>

5. Deploy the CDK Stack

    cdk deploy

   This will deploy:
   - VPC and networking components
   - ECS Fargate cluster for containerized applications
   - S3 buckets for frontend assets and recordings
   - IAM roles and policies
   - CloudFront distribution for frontend

6. Note the Outputs

   After deployment completes, note the following outputs from the CDK deployment:
   - FrontendURL: URL for accessing the frontend application
   - BackendURL: URL for accessing the backend API
1. Navigate to the Server Directory

    cd ../server

2. Build and Push the Docker Image

   Option 1: Using AWS ECR (recommended for production):

    aws ecr get-login-password --region <your-region> | docker login --username AWS --password-stdin <your-aws-account-id>.dkr.ecr.<your-region>.amazonaws.com
    docker build -t <your-aws-account-id>.dkr.ecr.<your-region>.amazonaws.com/voice-ai-assistant-backend:latest .
    docker push <your-aws-account-id>.dkr.ecr.<your-region>.amazonaws.com/voice-ai-assistant-backend:latest

   Option 2: Local development:

    docker build -t voice-ai-assistant-backend:latest .
    docker run -p 8000:8000 --env-file .env voice-ai-assistant-backend:latest

3. Update ECS Service (if needed)

   If you've made changes to the backend and pushed a new image:

    aws ecs update-service --cluster VoiceAIAssistantCluster --service BackendService --force-new-deployment
1. Navigate to the Vite Client Directory

    cd ../vite-client

2. Install Dependencies

    npm install  # or pnpm install

3. Build the Frontend

    npm run build  # or pnpm build

4. Deploy Frontend Assets to S3

   Option 1: Using AWS CLI:

    aws s3 sync dist/ s3://<your-frontend-bucket-name>/ --delete

   Option 2: Using Docker:

    docker build -t voice-ai-assistant-frontend:latest .
    docker run -p 80:80 -e SONIC_APP_S3_PROXY_API_ENDPOINT=<your-api-gateway-endpoint> voice-ai-assistant-frontend:latest

5. Invalidate CloudFront Cache (if using CloudFront)

    aws cloudfront create-invalidation --distribution-id <your-cloudfront-distribution-id> --paths "/*"
1. Daily Recordings Bucket CDK Stack (Recommended)

   Daily provides an official AWS CDK stack to simplify the setup process for recordings. You can use the repository at daily-co/daily-recordings-bucket to streamline the creation of all necessary AWS resources.

   a. Clone the Repository

    git clone https://github.com/daily-co/daily-recordings-bucket.git
    cd daily-recordings-bucket

   b. Install Dependencies and Deploy

    npm install
    npx cdk bootstrap  # If you haven't bootstrapped CDK before
    npx cdk deploy

   c. Note the Outputs

   After deployment, the CDK stack will output the bucket name and IAM role ARN to use in your application. These values will be used for RECORDING_S3_BUCKETNAME and RECORDING_ASSUME_ROLE_ARN in your environment configuration.

   The CDK stack automatically:
   - Creates an S3 bucket with proper versioning and configurations
   - Sets up the correct IAM policies and cross-account trust relationships
   - Configures all required permissions for Daily's recording service

   Note: If you're integrating this with the Voice AI Assistant, you can incorporate this CDK stack into your existing CDK deployment or deploy it separately.
2. Manual Setup (Alternative)

   If you prefer to set up the resources manually or need custom configurations, follow these steps:

   a. Create S3 Bucket for Daily Recordings

   This step is handled by the CDK deployment, but if you need to create it manually:

    aws s3 mb s3://<your-recording-bucket-name> --region <your-region>

   Important: Ensure that versioning is enabled for your S3 bucket. This is required by Daily for reliable recording storage.

   b. Set up Cross-Account Trust for Daily

   Daily requires cross-account access to write recordings to your S3 bucket. Follow these steps to configure the trust relationship:

   i. Create an IAM Policy for Daily Access

   Create a policy with the necessary permissions:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:GetObject",
            "s3:ListBucket",
            "s3:DeleteObject"
          ],
          "Resource": [
            "arn:aws:s3:::<your-recording-bucket-name>",
            "arn:aws:s3:::<your-recording-bucket-name>/*"
          ]
        }
      ]
    }

   ii. Create a Cross-Account IAM Role

   Create an IAM role with the following trust relationship to allow Daily's AWS account to assume the role:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::291871421085:root"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "<your-daily-api-key>"
            }
          }
        }
      ]
    }

   Note: The ExternalId condition is critical for security. Use your Daily API key as the external ID to prevent confused deputy attacks.

   iii. Attach the IAM Policy to the Role

   Attach the policy created in step 2.b.i to the IAM role created in step 2.b.ii.
3. Configure Application Environment Variables

   Update your .env file with the following variables:

    RECORDING_S3_BUCKETNAME=<your-recording-bucket-name>
    RECORDING_S3_REGION=<your-bucket-region>
    RECORDING_ASSUME_ROLE_ARN=<your-iam-role-arn>

   The RECORDING_ASSUME_ROLE_ARN should be the ARN of the role you created in step 2.b.ii or obtained from the CDK stack outputs.

4. Configure Daily Room Parameters

   The server application configures Daily rooms with the appropriate recording parameters:

    roomParams = DailyRoomParams(
        properties=DailyRoomProperties(
            enable_recording="cloud",  # Use "cloud" for managed recordings
            recordings_bucket=RecordingsBucketConfig(
                bucket_name=os.getenv("RECORDING_S3_BUCKETNAME", ""),
                bucket_region=os.getenv("RECORDING_S3_REGION", ""),
                assume_role_arn=os.getenv("RECORDING_ASSUME_ROLE_ARN", ""),
            ),
            geo="ap-southeast-1"  # Set appropriate region
        )
    )
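For the manual setup path, the two IAM documents can be generated rather than hand-edited, which avoids mismatched bucket names between the two resource ARNs. This is a sketch under the assumption that the policies match the JSON shown in the manual setup steps; the function names are hypothetical and the output still has to be applied through the IAM console, CLI, or CDK.

```python
import json

# Daily's AWS account ID, as given in the trust relationship in the manual setup steps.
DAILY_AWS_ACCOUNT_ID = "291871421085"


def build_permissions_policy(bucket_name: str) -> dict:
    """Permissions policy granting object and list access on the recordings bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket", "s3:DeleteObject"],
            "Resource": [bucket_arn, f"{bucket_arn}/*"],
        }],
    }


def build_trust_policy(external_id: str) -> dict:
    """Trust policy letting Daily's account assume the role with an external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{DAILY_AWS_ACCOUNT_ID}:root"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }


print(json.dumps(build_trust_policy("<your-external-id>"), indent=2))
```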
1. Enable Amazon Bedrock Access

   a. Navigate to the Amazon Bedrock console in your AWS account.

   b. Request access to the Amazon Nova Sonic model if you haven't already.

   c. Create a model access role with appropriate permissions.

2. Configure Nova Sonic in the Backend

   Update the .env file in the server directory with your Nova Sonic configuration:

    NOVA_AWS_ACCESS_KEY_ID=<your-aws-access-key>
    NOVA_AWS_SECRET_ACCESS_KEY=<your-aws-secret-key>
    NOVA_AWS_REGION=us-east-1
    # Options: matthew, tiffany, amy
    NOVA_VOICE_ID=tiffany

3. Customize Voice Settings (Optional)

   To customize the Nova Sonic voice settings, modify the bot_bedrock_nova.py file:

    llm = AWSNovaSonicLLMService(
        secret_access_key=NOVA_AWS_SECRET_ACCESS_KEY,
        access_key_id=NOVA_AWS_ACCESS_KEY_ID,
        region=os.getenv("NOVA_AWS_REGION", "us-east-1"),
        voice_id=os.getenv("NOVA_VOICE_ID", "tiffany"),
        send_transcription_frames=True
    )
1. Verify Backend Deployment

   Test the backend health endpoint:

    curl <your-backend-url>/health

   Expected response:

    {"status": "ok", "timestamp": "2025-06-30T07:18:47.123456"}

2. Verify Frontend Deployment

   Open the frontend URL in a web browser:

    https://<your-frontend-url>

   Verify that the UI loads correctly and you can see the connection controls.

3. Test WebRTC Connection

   a. Click the "Connect" button on the frontend.

   b. Allow microphone and camera permissions when prompted.

   c. Verify that the connection is established and you can see your video feed.

   d. Speak into the microphone and verify that the AI assistant responds.

4. Verify Recording Functionality

   a. Complete a conversation with the AI assistant.

   b. Navigate to the recordings section at the bottom of the page.

   c. Verify that your recording appears in the list.

   d. Click on the recording to play it.
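The backend health check can also be scripted, for example in a deployment smoke test. The sketch below validates a response body shaped like the expected response shown for the health endpoint. `is_healthy` is a hypothetical helper, and fetching the body (with `curl` or `urllib.request`) is left out so the check stays independent of network access.

```python
import json
from datetime import datetime


def is_healthy(body: str) -> bool:
    """Return True if the body is JSON with status 'ok' and an ISO 8601 timestamp."""
    try:
        payload = json.loads(body)
        datetime.fromisoformat(payload["timestamp"])
    except (ValueError, KeyError, TypeError):
        # Covers invalid JSON, a missing or malformed timestamp, and non-object payloads.
        return False
    return payload.get("status") == "ok"


sample_body = '{"status": "ok", "timestamp": "2025-06-30T07:18:47.123456"}'
print(is_healthy(sample_body))
```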