Commit bfcfb95

call control, listen and automated language detection docs
Signed-off-by: sahil suman <sahilsuman933@gmail.com>
1 parent 3f1607e commit bfcfb95

2 files changed: +101 -1 lines changed

calls/call-features.mdx

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
---
title: "Listen, Control, Language Detection"
sidebarTitle: "Live Call Features"
---

This page introduces three new features and shows how to use them:

1. **Call Control**: Enables dynamic injection of conversation elements during a live call.
2. **Call Listen**: Provides real-time audio streaming and processing during a call.
3. **Automatic Language Detection**: Detects the language spoken in real time and continues the conversation in that language.

## Call Control and Call Listen Feature

When you start a call through the `/call` endpoint, you receive a call ID. You can then listen to the call with Call Listen and inject operations into it with Call Control.

### Call Control

Call Control lets you inject conversation elements into a live call through HTTP POST requests. Currently, only real-time message injection is supported; more operations will be supported in the future.

To inject a message, send a POST request in this format:

```bash
curl -X POST https://aws-us-west-2-production3-phone-call-websocket.vapi.ai/{call_id}/control \
  -H "Content-Type: application/json" \
  -d '{
    "type": "say",
    "message": "Welcome to Vapi, this message was injected during the call."
  }'
```
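
The curl request above can also be sent from Node.js. The following is a minimal sketch, not an official client: it assumes Node.js 18 or later (for the built-in `fetch`), reuses the host from the curl example (which may differ for your call), and the helper names `buildSayRequest` and `injectMessage` are hypothetical.

```javascript
// Base URL copied from the curl example above; the host may differ per call or deployment.
const CONTROL_BASE_URL =
  'https://aws-us-west-2-production3-phone-call-websocket.vapi.ai';

// Build the URL and fetch options for a "say" injection (pure, no network I/O).
function buildSayRequest(callId, message) {
  return {
    url: `${CONTROL_BASE_URL}/${encodeURIComponent(callId)}/control`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ type: 'say', message }),
    },
  };
}

// Send the request with the global fetch available in Node.js 18+.
async function injectMessage(callId, message) {
  const { url, options } = buildSayRequest(callId, message);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Control request failed with status ${response.status}`);
  }
  return response;
}
```

Calling `injectMessage(callId, 'One moment, please.')` with the call ID of a live call sends the same payload as the curl command. Building the request in a separate function keeps the payload easy to inspect and test without touching the network.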

### Call Listen

Call Listen streams call audio in real time over a WebSocket connection. The following example receives audio packets, buffers them, and writes them to a file when the connection closes; adapt the processing to your needs:

```javascript
const WebSocket = require('ws');
const fs = require('fs');

// Placeholder: set this to the listen URL for your call.
// LISTEN_URL is a hypothetical environment variable name used for illustration.
const listenUrl = process.env.LISTEN_URL;

let pcmBuffer = Buffer.alloc(0);
const ws = new WebSocket(`${listenUrl}/listen`);

ws.on('open', () => console.log('WebSocket connection established'));

ws.on('message', (data, isBinary) => {
  if (isBinary) {
    // Binary frames carry raw PCM audio; append them to the buffer.
    pcmBuffer = Buffer.concat([pcmBuffer, data]);
    console.log(`Received PCM data, buffer size: ${pcmBuffer.length}`);
  } else {
    // Text frames carry JSON messages.
    console.log('Received message:', JSON.parse(data.toString()));
  }
});

ws.on('close', () => {
  // Save whatever audio was collected once the connection ends.
  if (pcmBuffer.length > 0) {
    fs.writeFileSync('audio.pcm', pcmBuffer);
    console.log('Audio data saved to audio.pcm');
  }
});

ws.on('error', (error) => console.error('WebSocket error:', error));
```
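
The example above saves headerless PCM, which most audio players cannot open directly. The sketch below wraps the buffer in a standard 44-byte WAV header. The sample rate, channel count, and bit depth are assumptions (this page does not state the stream format), so confirm them before relying on the output; `pcmToWav` is a hypothetical helper name.

```javascript
// Wrap raw PCM samples in a minimal 44-byte WAV (RIFF) header.
// The defaults (16 kHz, mono, 16-bit) are assumptions, not documented values.
function pcmToWav(pcm, { sampleRate = 16000, channels = 1, bitsPerSample = 16 } = {}) {
  const blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second
  const header = Buffer.alloc(44);

  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4); // size of everything after this field
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);             // fmt chunk size for PCM
  header.writeUInt16LE(1, 20);              // audio format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);     // size of the audio payload

  return Buffer.concat([header, pcm]);
}
```

In the `close` handler, `fs.writeFileSync('audio.wav', pcmToWav(pcmBuffer))` would then produce a playable file, provided the assumed format matches the actual stream.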

## Automatic Language Detection

This feature lets the assistant switch languages automatically during a call. It is currently supported only with the Deepgram transcriber and supports the following languages:

<ul>
  <li>ar: Arabic</li>
  <li>bn: Bengali</li>
  <li>yue: Cantonese</li>
  <li>zh: Chinese</li>
  <li>en: English</li>
  <li>fr: French</li>
  <li>de: German</li>
  <li>hi: Hindi</li>
  <li>it: Italian</li>
  <li>ja: Japanese</li>
  <li>ko: Korean</li>
  <li>pt: Portuguese</li>
  <li>ru: Russian</li>
  <li>es: Spanish</li>
  <li>th: Thai</li>
  <li>vi: Vietnamese</li>
</ul>
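
If your application needs to check a language code against this list, a small lookup table is enough. The sketch below copies the codes from the list above; `languageName` is a hypothetical helper, and reducing region-tagged codes such as `en-US` to their base code is an assumption about how codes might arrive, not documented behavior.

```javascript
// Language codes and names copied from the list above.
const SUPPORTED_LANGUAGES = {
  ar: 'Arabic', bn: 'Bengali', yue: 'Cantonese', zh: 'Chinese',
  en: 'English', fr: 'French', de: 'German', hi: 'Hindi',
  it: 'Italian', ja: 'Japanese', ko: 'Korean', pt: 'Portuguese',
  ru: 'Russian', es: 'Spanish', th: 'Thai', vi: 'Vietnamese',
};

// Return the language name for a code, or null if the code is not in the list.
// A region suffix ("en-US") is dropped before the lookup (an assumption).
function languageName(code) {
  const base = String(code).toLowerCase().split('-')[0];
  const isListed = Object.prototype.hasOwnProperty.call(SUPPORTED_LANGUAGES, base);
  return isListed ? SUPPORTED_LANGUAGES[base] : null;
}
```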

To enable automatic language detection for multilingual calls, set `transcriber.languageDetectionEnabled` to `true` through the `/assistant` API endpoint or in the `assistantOverride`.
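
As a rough illustration, the setting could be merged into an assistant definition like this. Only `languageDetectionEnabled` is named on this page; the `provider` and `model` field names and both helper functions are assumptions for the sake of the example, so check the `/assistant` API reference for the actual schema.

```javascript
// Build the transcriber fragment that turns on language detection.
// `provider` and `model` are assumed field names; `nova-2` is one of the
// Deepgram models this page lists for multilingual support.
function buildTranscriberConfig(model = 'nova-2') {
  return {
    transcriber: {
      provider: 'deepgram',
      model,
      languageDetectionEnabled: true,
    },
  };
}

// Merge the fragment into an existing assistant definition without mutating it.
function withLanguageDetection(assistant, model) {
  const fragment = buildTranscriberConfig(model);
  return {
    ...assistant,
    transcriber: { ...assistant.transcriber, ...fragment.transcriber },
  };
}
```

The resulting object would be sent as the assistant body or as an override for a single call; the request itself is omitted here because this page does not show it.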

### Requirements for Multilingual Support

Multilingual support requires the following models:

* **Transcriber**:
  * **Deepgram**: `nova-2` or `nova-2-general`

* **Voice Providers**:
  * **11labs**: Multilingual model or Turbo v2.5
  * **Cartesia**: `sonic-multilingual` model

With these models selected and automatic language detection enabled, your application can handle multilingual conversations.

mint.json

Lines changed: 2 additions & 1 deletion
@@ -205,7 +205,8 @@
 "pages": [
 "call-forwarding",
 "calls/call-ended-reason",
-"advanced/calls/sip"
+"advanced/calls/sip",
+"calls/call-features"
 ]
 },
 "GHL",
