Latency doubles in telephony contexts #3685

@Rickaym

Hi,

First of all, LiveKit is great. I have been using it for over a year to build web-based voice AI agents. More recently, I started applying LiveKit Agents to telephony, for instance to answer an inbound call at a Twilio number. This is where latency becomes quite unbearable. My question has two parts: how can I find the bottleneck and reduce it, and is this LiveKit-specific? Below is a comparison I made between the Egress recording and the microphone recording of the call. I know this is extremely rough, but it was the easiest thing to do, and the latency difference is quite apparent. The code I used here is the starter example with a modified prompt.

Based on the image below, what seems apparent to me is that the extra latency comes from the telephony leg: after the STT->LLM->TTS pipeline has finished, getting the audio back to the phone (and vice versa) is what causes the delay. When I compared the latency graphs (web vs. telephony) of something like Vapi, they were absolutely on point. This was confusing to me.

I appreciate any thoughts here.

(The first is the Egress recording, the second is the microphone recording.)
Image
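To put rough numbers on this comparison, here is a minimal sketch of the arithmetic, assuming the end of each user utterance and the start of each agent reply are marked by hand in both recordings. The function names and the timestamps are hypothetical and are not measurements from this call. The idea is that the Egress recording only sees the pipeline delay, while the microphone recording also includes the trip to and from the phone, so the difference between the two gaps approximates the round-trip delay added by the telephony leg.

```python
def response_gaps(turns):
    """Return the silence between user end and agent start for each turn, in seconds.

    `turns` is a list of (user_end, agent_start) timestamps taken from one recording.
    """
    return [agent_start - user_end for user_end, agent_start in turns]


def telephony_overhead(egress_turns, mic_turns):
    """Per-turn extra delay heard on the phone compared with the Egress recording."""
    if len(egress_turns) != len(mic_turns):
        raise ValueError("both recordings must be annotated with the same turns")
    egress = response_gaps(egress_turns)
    mic = response_gaps(mic_turns)
    return [m - e for e, m in zip(egress, mic)]


# Hypothetical hand-annotated timestamps in seconds (made up for illustration).
egress_turns = [(2.0, 3.0), (8.0, 9.5)]
mic_turns = [(2.5, 4.5), (9.0, 11.5)]

overhead = telephony_overhead(egress_turns, mic_turns)
print("extra delay per turn (s):", overhead)
print("average extra delay (s):", sum(overhead) / len(overhead))
```

If the per-turn overhead is roughly constant, that would point at the transport between the phone and the room rather than at the STT, LLM, or TTS stages. This is only as accurate as the hand-placed timestamps.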
