- Table of Contents
- Course Overview
- Who is this course for?
- Course Breakdown: Week by Week
- Getting Started
- Lesson 0: Project Overview and Architecture
- Lesson 1: Building Realtime Voice Agents with FastRTC
- Lesson 2: The Missing Layer in Modern AI Retrieval
- The tech stack
- Contributors
- License
This isn't your typical plug-and-play tutorial where you spin up a demo in five minutes and call it a day.
Instead, we're building a real estate company, but with a twist … the employees will be realtime voice agents!
By the end of this course, you'll have a system that can:
- ☎️ Receive inbound calls with Twilio
- 📞 Make outbound calls through Twilio
- 🏠 Search live property data using Superlinked
- ⚡ Run realtime conversations powered by FastRTC
- 🗣️ Transcribe speech instantly with Moonshine + Fast Whisper
- 🎙️ Generate lifelike voices using Kokoro + Orpheus 3B
- 🚀 Deploy open-source models on Runpod for GPU acceleration
Excited? Let's get started!
Join The Neural Maze and learn to build AI Systems that actually work, from principles to production. Every Wednesday, directly to your inbox. Don't miss out!

Join Jesús Copado on YouTube to explore how to build real AI projects, from voice agents to creative tools. Weekly videos with code, demos, and ideas that push what's possible with AI. Don't miss the next drop!
This course is for Software Engineers, ML Engineers, and AI Engineers who want to level up by building complex end-to-end apps. It's not just a basic "Hello World" tutorial—it's a deep dive into making production-ready voice agents.
Each week, you'll unlock a new chapter of the journey. You'll get:
- 🧾 A Substack article that walks through the concepts and code in detail
- 💻 A new batch of code pushed directly to this repo
- 🎥 A Live Session where we explore everything together
Here’s what the upcoming weeks look like 👇
| Lesson Number | Title | Article | Code | Live Session |
|---|---|---|---|---|
| 0 | Project overview and architecture | | Week 0 | |
| 1 | Building Realtime Voice Agents with FastRTC | | Week 1 | |
| 2 | The Missing Layer in Modern AI Retrieval | | Week 2 | November 30 |
| 3 | Improving STT and TTS Systems | December 3 | December 3 | December 7 |
| 4 | Deployment, monitoring and Twilio Integration | December 10 | December 10 | December 14 |
Before diving into the lessons, make sure you have everything set up properly:
- 📋 Initial Setup: Follow the instructions in `docs/GETTINGS_STARTED.md` to configure your environment and install dependencies.
- 📚 Learn Lesson by Lesson: Once setup is complete, come back here and follow the lessons in order.
Each lesson builds on the previous one, so it's important to follow them sequentially!
Goal: Understand the big picture and architecture of the realtime phone agent system.
- 📖 Read the Substack article to understand the overall architecture
- 🎥 Watch the Live Session recording for a deeper dive
This lesson sets the foundation for everything that follows!
Goal: Build your first working voice agent using FastRTC and integrate it with Twilio.
- 📖 Read the Article: Start with the Substack article to understand FastRTC fundamentals
- 📓 Work Through the Notebook: Open and run through `notebooks/lesson_1_fastrtc_agents.ipynb` to get hands-on experience
- 💻 Explore the Code: Dive into the repository code to see how everything is implemented
- 🚀 Run the Applications: Try both deployment options:
Run the Gradio interface (check out demo videos in the Substack article):
`make start-gradio-application`
This starts an interactive web interface where you can test the voice agent locally.
NOTE: If you get the error `No such file or directory: 'ffprobe'`, install ffmpeg on your system to resolve it.
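If you'd like to catch the `ffprobe` problem before launching the interface, a small check along these lines may help. This is a minimal sketch that is not part of the repository: the `missing_tools` helper is a hypothetical name, and it only reports whether the binaries are on your `PATH`.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print(f"Missing from PATH: {', '.join(missing)}. Install ffmpeg and retry.")
    else:
        print("ffmpeg and ffprobe are available.")
```

`ffprobe` normally ships with ffmpeg, so installing ffmpeg through your system's package manager should clear both entries.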
For a production-ready setup that can receive real phone calls:
Step 1: Start the call center application
`make start-call-center`
This starts a FastAPI application using Docker Compose on port 8000.
Step 2: Expose your local server to the internet
`make start-ngrok-tunnel`
Or manually:
`ngrok http 8000`
Step 3: Connect to Twilio
Follow the instructions in the article to:
- Configure your Twilio account
- Connect your ngrok URL to Twilio
- Start receiving real phone calls!
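If calls never reach the agent, one thing worth ruling out is that nothing is listening on port 8000 when the tunnel starts. The sketch below is an illustration rather than repository code: `is_port_open` is a hypothetical helper, and it only confirms that a TCP connection succeeds, not that the FastAPI application is healthy.

```python
import socket


def is_port_open(host="127.0.0.1", port=8000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on port 8000; safe to start the tunnel.")
    else:
        print("Nothing is listening on port 8000; start the call center first.")
```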
🎥 Join the Live Session: Join Premium, and you'll receive a notification when we are live on Sunday, November 23rd at 5PM CET for a complete walkthrough and Q&A!
Goal: Learn how to implement advanced search capabilities for realtime voice agents using Superlinked to handle complex, multi-attribute queries.
- 📖 Read the Article: Start with the Substack article to understand:
  - Why traditional vector search isn't enough for multi-attribute queries
  - How Superlinked combines different data types (text, numbers, categories) into a unified search space
  - The limitations of metadata filters, multiple searches, and re-ranking approaches
- 📓 Work Through the Notebook: Open and run through `notebooks/lesson_2_superlinked_property_search.ipynb` to learn:
  - How to define different Space types (`TextSimilaritySpace`, `NumberSpace`, `CategoricalSimilaritySpace`)
  - How to combine spaces into a single searchable index
  - How to dynamically adjust weights at query time
- 💻 Explore the Code: Dive into the repository to see how Superlinked integrates with our voice agent:
  - Check out `src/realtime_phone_agents/infrastructure/superlinked/` for the implementation
  - Review `src/realtime_phone_agents/agent/tools/property_search.py` to see how the search tool is exposed to the agent

  We'll explore the code in detail during the Live Session!
- 🚀 Test the Complete System: Now it's time to see everything work together!
Step 1: Start the call center application
`make start-call-center`
Step 2: Expose your local server (if not already running)
`make start-ngrok-tunnel`
Step 3: Call your Twilio number and test the property search
Try asking the agent:
"Do you have apartments in Barrio de Salamanca of at most 900,000 euros?"
Wait for the response. The agent should find and return information about the only apartment in the dataset (`data/properties.csv`) that meets these criteria!
This demonstrates how the voice agent can now handle complex queries combining location (Barrio de Salamanca) and price constraints (≤ €900,000) in real time.
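To build some intuition for why weighting attributes at query time matters, here is a toy sketch in plain Python. It does not use Superlinked and is only an analogy for the idea of combining per-attribute scores: the `Property` class, the scoring functions, and the three listings are all made up for illustration and are not taken from `data/properties.csv`.

```python
from dataclasses import dataclass


@dataclass
class Property:
    name: str
    neighborhood: str
    price_eur: float


def location_score(prop, wanted):
    """1.0 for a case-insensitive neighborhood match, else 0.0."""
    return 1.0 if prop.neighborhood.lower() == wanted.lower() else 0.0


def price_score(prop, max_price):
    """1.0 at or under budget, falling linearly to 0.0 at twice the budget."""
    if prop.price_eur <= max_price:
        return 1.0
    return max(0.0, 1.0 - (prop.price_eur - max_price) / max_price)


def rank(props, wanted, max_price, w_location=1.0, w_price=1.0):
    """Sort properties by a weighted sum of per-attribute scores, best first."""
    def total(p):
        return (w_location * location_score(p, wanted)
                + w_price * price_score(p, max_price))
    return sorted(props, key=total, reverse=True)


# Hypothetical listings, for illustration only.
listings = [
    Property("A", "Barrio de Salamanca", 850_000),
    Property("B", "Barrio de Salamanca", 1_200_000),
    Property("C", "Chamberí", 600_000),
]

# Equal weights: location and price count the same.
print([p.name for p in rank(listings, "Barrio de Salamanca", 900_000)])
# Down-weight location: a cheaper flat elsewhere now outranks an over-budget match.
print([p.name for p in rank(listings, "Barrio de Salamanca", 900_000, w_location=0.2)])
```

Unlike a hard metadata filter, the over-budget listing is not discarded outright; it is only ranked lower, and changing the weights reorders the results without rebuilding anything.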
🎥 Join the Live Session: Join Premium for the live walkthrough on Sunday, November 30th, where we'll dive deep into the code and answer your questions!
- Miguel Otero Pedrido, Senior ML / AI Engineer. Founder of The Neural Maze. Rick and Morty fan. (YouTube, The Neural Maze Newsletter)
- Jesús Copado, Senior ML / AI Engineer. Equal parts cinema fan and AI enthusiast. (YouTube)
This project is licensed under the MIT License - see the LICENSE file for details.









