Real-Time Call Service Agent Assist Demo
Overview
This project demonstrates a real-time, AI-powered agent assist workflow for customer support calls. It captures audio from a loopback device (BlackHole) or microphone, transcribes in near real-time via AssemblyAI, and runs lightweight agents (powered by Cerebras) to label speakers, match the caller to CRM data, and generate actionable guidance for the human agent.
Key Features
- BlackHole audio loopback for clean capture (recommended)
- Live microphone capture (PyAudio)
- Near real-time transcription (AssemblyAI Universal-Streaming)
- Speaker role identification (STAFF vs CUSTOMER)
- Caller-to-CRM matching using files in `crm/`
- Agent recommendations using `knowledge_base/` + the current transcript
- Continuous logging to `logs/`
Architecture
- `transcription.py` — streams audio (file or live) to AssemblyAI and appends formatted turns to `logs/assemblyai.log`
- `agents.py` — tails `assemblyai.log` (see the sketch after this list) and maintains:
  - `transcripts.log` (speaker role labels)
  - `customer.log` (best-match CRM record)
  - `recommendations.log` (actionable guidance for STAFF)
- `rtaa.py` — convenience launcher to run both the transcriber and agents together
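At a high level, `agents.py` simply follows `assemblyai.log` and re-runs its agents whenever a new turn arrives. A minimal sketch of that tail-and-dispatch loop is shown below; it is illustrative only, and `handle_turn` is a hypothetical placeholder for the real agent calls.

```python
# Illustrative tail-and-dispatch loop; not the actual agents.py implementation.
import time
from pathlib import Path

LOG = Path("logs/assemblyai.log")

def follow(path: Path):
    """Yield lines appended to the log, similar to `tail -f`."""
    with path.open("r") as f:
        f.seek(0, 2)                 # start at the end of the file
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.2)      # nothing new yet; poll again shortly
                continue
            yield line.rstrip("\n")

def handle_turn(turn: str) -> None:
    # Hypothetical placeholder: label the speaker, refresh the CRM match,
    # and regenerate recommendations for this turn.
    print("new turn:", turn)

if __name__ == "__main__":
    for turn in follow(LOG):
        handle_turn(turn)
```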
Mock Data
- `knowledge_base/`: fictitious policies, product details, and technical guides
- `crm/`: fictitious customer profiles
- `sample_call.wav`: a stereo sample call aligned to the mock data
Prerequisites
- Python 3.9+ (tested with 3.13 via venv)
- macOS or Linux (tested on macOS)
- For loopback capture: BlackHole installed and selected as input
Install
- Create and activate a virtual environment
    python -m venv venv
    source venv/bin/activate   # Windows: venv\Scripts\activate

- Install dependencies

    pip install -r requirements.txt

Configure
Create a `.env` in the project root with your API keys (or copy from the sample):

    cp .env.sample .env

Then edit `.env`:

    ASSEMBLYAI_API_KEY=your_assemblyai_key
    CEREBRAS_API_KEY=your_cerebras_key
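If the keys do not seem to be picked up, it can help to see how they are typically read. The sketch below assumes `python-dotenv` is used to load `.env` into the environment; the project's actual loading code may differ.

```python
# Sketch of loading and validating the keys at startup.
# Assumes python-dotenv; the project may load .env differently.
import os
from dotenv import load_dotenv

load_dotenv()  # loads variables from .env into os.environ

ASSEMBLYAI_API_KEY = os.environ.get("ASSEMBLYAI_API_KEY")
CEREBRAS_API_KEY = os.environ.get("CEREBRAS_API_KEY")

if not ASSEMBLYAI_API_KEY or not CEREBRAS_API_KEY:
    raise SystemExit("Set ASSEMBLYAI_API_KEY and CEREBRAS_API_KEY in .env")
```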
Run
- File mode (default): streams both channels of `sample_call.wav`

    python rtaa.py --mode file -f ./sample_call.wav

- Live mode: stream from BlackHole or the microphone

    # BlackHole input (preferred if you want to play audio locally and capture it)
    python rtaa.py --mode live -i blackhole

    # Microphone input
    python rtaa.py --mode live -i microphone

Flags
- `--mode {file,live}`: file streams a stereo WAV; live streams mono at 16 kHz
- `-f, --filepath`: WAV path for file mode (defaults to `./sample_call.wav`)
- `-i, --input {blackhole,microphone}`: input device for live mode
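These flags map onto a straightforward `argparse` setup. The sketch below is an equivalent parser, not a copy of `rtaa.py`'s; in particular, the default for `-i/--input` and the help text are assumptions.

```python
# Sketch of a parser equivalent to the flags above (defaults and help text
# are assumptions, not copied from rtaa.py).
import argparse

parser = argparse.ArgumentParser(description="Real-time agent assist demo")
parser.add_argument("--mode", choices=["file", "live"], default="file",
                    help="file streams a stereo WAV; live streams 16 kHz mono")
parser.add_argument("-f", "--filepath", default="./sample_call.wav",
                    help="WAV path for file mode")
parser.add_argument("-i", "--input", choices=["blackhole", "microphone"],
                    default="blackhole",  # assumed default for illustration
                    help="input device for live mode")
args = parser.parse_args()
print(args)
```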
Expected Console Output (abridged)
✅ Log reset: ./logs/assemblyai.log
🔊 Playing audio: ./sample_call.wav
🚀 Realtime agent started. Press Ctrl+C to stop.
[Speaker 1] 🎬 Session Begin: id=...
[agents] 🧹 Reset transcripts.log, customer.log, recommendations.log
[agents] 📡 Monitoring assemblyai.log ... (Ctrl+C to stop)
Speaker 1: Thank you for calling...
Speaker 2: Hi, this is John...
What Gets Written
- `logs/assemblyai.log`: raw, formatted turns from AssemblyAI
- `logs/transcripts.log`: speaker roles labeled per line (STAFF/CUSTOMER)
- `logs/customer.log`: the currently selected CRM record (overwritten on change)
- `logs/recommendations.log`: short list of prioritized actions for STAFF
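For a sense of how the speaker labels in `transcripts.log` could be produced, here is a hedged sketch of a single labeling call through the Cerebras SDK. The prompt, model name, and function are illustrative and not taken from `agents.py`.

```python
# Illustrative speaker-labeling call; prompt and model choice are assumptions.
import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(api_key=os.environ["CEREBRAS_API_KEY"])

def label_speaker(turn: str) -> str:
    """Return 'STAFF' or 'CUSTOMER' for one transcript turn."""
    resp = client.chat.completions.create(
        model="llama3.1-8b",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Label this call-center turn as STAFF or CUSTOMER. Reply with one word."},
            {"role": "user", "content": turn},
        ],
    )
    return resp.choices[0].message.content.strip()

print(label_speaker("Thank you for calling, how can I help you today?"))
```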
Notes on Audio Sources
- File mode expects a stereo WAV. Each channel is streamed separately to improve speaker separation.
- Live mode uses 16 kHz mono streaming. If `-i blackhole` is chosen, the app also attempts local pass-through audio.
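To see what the per-channel streaming in file mode amounts to, here is a small sketch that splits a 16-bit stereo WAV into two mono buffers with `wave` and `numpy`; `transcription.py` may implement this differently.

```python
# Split a 16-bit stereo WAV into per-channel mono buffers (illustrative only).
import wave
import numpy as np

with wave.open("./sample_call.wav", "rb") as wf:
    assert wf.getnchannels() == 2, "file mode expects a stereo WAV"
    assert wf.getsampwidth() == 2, "this sketch assumes 16-bit PCM"
    frames = wf.readframes(wf.getnframes())

samples = np.frombuffer(frames, dtype=np.int16).reshape(-1, 2)
left = samples[:, 0].tobytes()   # one side of the call
right = samples[:, 1].tobytes()  # the other side of the call
```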
Troubleshooting
- Missing `ASSEMBLYAI_API_KEY`: ensure your `.env` is loaded or the environment variable is set.
- No audio captured in live mode:
  - macOS: System Settings › Privacy & Security › Microphone → allow Terminal/IDE
  - BlackHole not found: install and select BlackHole as the input device; the app falls back to the default mic if not found
- Port or device busy: if your driver conflicts, disable local playback by setting `ENABLE_LOCAL_PLAYBACK = False` in `transcription.py`
- `cerebras_cloud_sdk` import errors: verify the installation from `requirements.txt` and the `CEREBRAS_API_KEY` in `.env`
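When live mode silently falls back to the default microphone, listing the available input devices usually explains why. The snippet below uses PyAudio to look for a BlackHole input; the name check is a simple heuristic.

```python
# List input devices and flag anything that looks like BlackHole.
import pyaudio

pa = pyaudio.PyAudio()
for i in range(pa.get_device_count()):
    info = pa.get_device_info_by_index(i)
    if info.get("maxInputChannels", 0) > 0:
        marker = "  <-- BlackHole" if "blackhole" in info["name"].lower() else ""
        print(f"{i}: {info['name']} ({int(info['maxInputChannels'])} in){marker}")
pa.terminate()
```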
Development Tips
- Run `python transcription.py --mode file -f ./sample_call.wav` to test transcription alone
- Run `python agents.py` separately to tail `assemblyai.log` and regenerate outputs
- Logs are plain text; delete or edit them freely between runs
Limitations
- Speaker role classification and CRM matching rely on LLM outputs and can be imperfect
- The demo uses small local text files as its “CRM” and “KB”; it is not connected to a real backend