title: "Listen, Control, Language Detection"
3
3
sidebarTitle: "Live Call Features"
4
4
---

Vapi offers two main features that provide enhanced control over live calls:

1. **Call Control**: Inject conversation elements dynamically during an ongoing call.
2. **Call Listen**: Stream real-time audio data over a WebSocket connection.

This page also covers automatic language detection, which lets an assistant switch languages during the conversation.

To use these features, you first need to obtain the URLs specific to the live call. These can be retrieved by triggering the `/call` endpoint, which returns the `listenUrl` and `controlUrl` within the `monitor` object.

## Obtaining URLs for Call Control and Listen

To initiate a call and retrieve the `listenUrl` and `controlUrl`, send a POST request to the `/call` endpoint.

### Sample Request
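
The request body depends on your setup; the following is a minimal sketch for an outbound phone call, where `<YOUR_API_KEY>`, `<ASSISTANT_ID>`, and `<PHONE_NUMBER_ID>` are placeholders for your own values.

```bash
# Sketch: create a call and capture the monitor URLs from the response.
curl -X POST 'https://api.vapi.ai/call' \
  -H 'Authorization: Bearer <YOUR_API_KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "assistantId": "<ASSISTANT_ID>",
    "phoneNumberId": "<PHONE_NUMBER_ID>",
    "customer": { "number": "+11234567890" }
  }'
```

The response contains a `monitor` object with both URLs. An abridged sketch is shown below, with the `listenUrl` inferred from the `controlUrl` pattern used later on this page:

```json
{
  "id": "7420f27a-30fd-4f49-a995-5549ae7cc00d",
  "monitor": {
    "listenUrl": "wss://aws-us-west-2-production1-phone-call-websocket.vapi.ai/7420f27a-30fd-4f49-a995-5549ae7cc00d/listen",
    "controlUrl": "https://aws-us-west-2-production1-phone-call-websocket.vapi.ai/7420f27a-30fd-4f49-a995-5549ae7cc00d/control"
  }
}
```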

## Call Control Feature

Call Control allows you to inject conversation elements dynamically during a live call via HTTP POST requests. Currently, injecting messages in real time is supported; more operations will be added in the future.

Once you have the `controlUrl`, you can inject a message into the live call by sending a JSON payload to it.

### Example: Injecting a Message
```bash
curl -X POST 'https://aws-us-west-2-production1-phone-call-websocket.vapi.ai/7420f27a-30fd-4f49-a995-5549ae7cc00d/control' \
  -H 'Content-Type: application/json' \
  --data-raw '{
    "type": "say",
    "message": "Welcome to Vapi, this message was injected during the call."
  }'
```

The message will be spoken in real-time during the ongoing call.

## Call Listen Feature

The `listenUrl` allows you to connect to a WebSocket and stream the audio data in real-time. You can either process the audio directly or save the binary data to analyze or replay later.

### Example: Saving Audio Data from a Live Call

Here is a simple implementation for saving the audio buffer from a live call using Node.js:
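The sketch below assumes the `ws` npm package (`npm install ws`) and that binary WebSocket frames carry the raw audio while text frames carry JSON metadata; adapt the handling to your needs.

```javascript
const WebSocket = require("ws");
const fs = require("fs");

// Use the `listenUrl` returned in the `monitor` object for your call.
const listenUrl =
  "wss://aws-us-west-2-production1-phone-call-websocket.vapi.ai/7420f27a-30fd-4f49-a995-5549ae7cc00d/listen";

const ws = new WebSocket(listenUrl);
let audioBuffer = Buffer.alloc(0);

ws.on("open", () => console.log("WebSocket connection established"));

ws.on("message", (data, isBinary) => {
  if (isBinary) {
    // Binary frames carry raw audio; append them to the buffer.
    audioBuffer = Buffer.concat([audioBuffer, data]);
  } else {
    // Text frames carry JSON metadata about the stream.
    console.log("Received message:", data.toString());
  }
});

ws.on("close", () => {
  // Persist the collected audio once the call ends.
  fs.writeFileSync("audio.pcm", audioBuffer);
  console.log("Audio data saved to audio.pcm");
});

ws.on("error", (error) => console.error("WebSocket error:", error));
```

## Automatic Language Detection
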
This feature allows you to automatically switch between languages during a call. It is currently supported only on Deepgram and supports the following languages:

* `ar`: Arabic
* `bn`: Bengali
* `yue`: Cantonese
* `zh`: Chinese
* `en`: English
* `fr`: French
* `de`: German
* `hi`: Hindi
* `it`: Italian
* `ja`: Japanese
* `ko`: Korean
* `pt`: Portuguese
* `ru`: Russian
* `es`: Spanish
* `th`: Thai
* `vi`: Vietnamese

To enable automatic language detection for multilingual calls, set `transcriber.languageDetectionEnabled: true` through the `/assistant` API endpoint, or use an assistant override.
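
For example, here is a sketch of enabling it on an existing assistant; it assumes the `PATCH /assistant/:id` endpoint and a Deepgram transcriber, with placeholder ID and key:

```bash
# Sketch: enable language detection on an existing assistant.
curl -X PATCH 'https://api.vapi.ai/assistant/<ASSISTANT_ID>' \
  -H 'Authorization: Bearer <YOUR_API_KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "transcriber": {
      "provider": "deepgram",
      "model": "nova-2",
      "languageDetectionEnabled": true
    }
  }'
```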

### Requirements for Multilingual Support

To make multilingual support work, you need to choose the following models:

* **Transcriber**:
  * **Deepgram**: `nova-2` or `nova-2-general`
* **Voice Providers**:
  * **11labs**: multilingual model or Turbo v2.5
  * **Cartesia**: `sonic-multilingual` model

By using these models and enabling automatic language detection, your application will be able to handle multilingual conversations seamlessly.
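
Putting these together, the relevant assistant fields might look like the sketch below. The `voiceId` is a placeholder, and the 11labs `model` identifier is an assumption for illustration; check your provider settings for the exact values.

```json
{
  "transcriber": {
    "provider": "deepgram",
    "model": "nova-2",
    "languageDetectionEnabled": true
  },
  "voice": {
    "provider": "11labs",
    "voiceId": "<VOICE_ID>",
    "model": "eleven_turbo_v2_5"
  }
}
```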