[Agents] Adding agent websocket for web calls #172
🌿 Preview your docs: https://cartesia-preview-0f4af38c-9c95-4c4b-ae89-d5ef707bf515.docs.buildwithfern.com
Co-authored-by: Sauhard Jain <sauhardjain03@gmail.com>
Should we also add an API reference for this?
| Header | Value |
|--------|-------|
| `Authorization` | `Bearer {your_api_key}` |
Token, not API key.
Oh, I thought it could be either. I'll update, though.
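For reference, the header from the table above could be assembled as follows. This is a minimal sketch: `buildAuthHeaders` is a hypothetical helper name, not part of any Cartesia SDK, and per the review comment it assumes the value is an access token rather than an API key.

```javascript
// Hypothetical helper (not a Cartesia API): builds the Authorization header
// shown in the table above. Assumes `token` is an access token, per the review.
function buildAuthHeaders(token) {
  if (typeof token !== "string" || token.length === 0) {
    throw new Error("token must be a non-empty string");
  }
  return { Authorization: `Bearer ${token}` };
}
```

In Node.js, the `ws` package accepts these through its `headers` option, for example `new WebSocket(url, { headers: buildAuthHeaders(token) })`. The browser `WebSocket` constructor does not accept custom headers, so a web client would need another way to present the token.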
### Custom Event

Sends custom metadata to the agent.
Probably should explain how this shows up to agent code?
Heh, it doesn't show up yet. We have an outstanding ticket for this.
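To make the event shape concrete, here is a small sketch of how a client might serialize a custom event. The field names (`event`, `stream_id`, `metadata`) are taken from the browser keepalive example elsewhere in this conversation; `buildCustomEvent` itself is a hypothetical helper name.

```javascript
// Hypothetical helper: serializes a custom event for sending over the socket.
// Field names follow the browser example in this PR; the helper name is made up.
function buildCustomEvent(streamId, metadata) {
  if (metadata === null || typeof metadata !== "object") {
    throw new TypeError("metadata must be an object");
  }
  return JSON.stringify({ event: "custom", stream_id: streamId, metadata });
}
```

A client would pass the result straight to `websocket.send(...)`. As noted above, the metadata is not yet surfaced to agent code.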
## Best Practices

1. **Always send start event first** - The connection will be closed if any other event is sent before start
2. **Use appropriate audio formats** - Match your input format to your audio source capabilities. For telephony providers this is often `mulaw_8000`, while for web clients this will often be `pcm_44000`
IMO we should recommend 16k
@noahlt could you elaborate on the 16k recommendation? Why is it better and for which use case?
@noahlt the input gets resampled to 16k by default in our Pipecat pipeline since that's what our STT takes, independent of what the input Transport takes.
But right now the input Transport also informs our output (we have a ticket to fix this).
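To illustrate what resampling to 16 kHz means for a web client capturing at a higher rate, here is a rough sketch using linear interpolation. It is not what the Pipecat pipeline does internally (that is not shown in this PR); `resampleLinear` is a hypothetical helper, and it omits the low-pass filtering a production resampler would apply before downsampling.

```javascript
// Naive linear-interpolation resampler for mono float samples.
// Illustrative only: no anti-aliasing filter is applied.
function resampleLinear(input, fromRate, toRate) {
  if (fromRate <= 0 || toRate <= 0) {
    throw new RangeError("sample rates must be positive");
  }
  const outLength = Math.round((input.length * toRate) / fromRate);
  const output = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples per output sample
  const last = input.length - 1;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const i0 = Math.min(Math.floor(pos), last);
    const i1 = Math.min(i0 + 1, last);
    const frac = pos - i0;
    // Blend the two neighbouring input samples.
    output[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return output;
}
```

For example, one second of 44.1 kHz audio (`44100` samples) passed through `resampleLinear(samples, 44100, 16000)` yields `16000` samples.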
1. **Always send start event first** - The connection will be closed if any other event is sent before start
2. **Use appropriate audio formats** - Match your input format to your audio source capabilities. For telephony providers this is often `mulaw_8000`, while for web clients this will often be `pcm_44000`
3. **Handle connection close gracefully** - Monitor close events and reasons for debugging
4. **Implement keepalive for calls with longer periods of silence** - Send WebSocket ping frames every 20-25 seconds to prevent the 30-second inactivity timeout during periods of silence
5. Send your own stream_id's for the best observability
6. Always handle timeout closures (`1000 / connection idle timeout`) by reconnecting and resending a `start` event.
Suggested change:
1. **Send `start` first** - The connection closes if any other event is sent before `start`.
2. **Choose the right audio format** - Match the format to your source: `mulaw_8000` for telephony, `pcm_44100` for web clients.
3. **Handle closes cleanly** - Always capture close codes and reasons for debugging and recovery.
4. **Keep the connection alive** - Send WebSocket ping frames every 20-25 seconds to avoid the 30-second inactivity timeout.
5. **Manage stream IDs** - Provide your own `stream_id` values to improve observability across systems.
6. **Recover from idle timeouts** - On `1000 / connection idle timeout`, reconnect and resend a `start` event.
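Two of these practices (send `start` first, recover from idle timeouts) can be sketched in client code. The sketch below makes several assumptions: the `start` payload is reduced to `event` and `stream_id` (the real event likely carries more fields, which this PR excerpt does not show), the close reason is matched on the string `connection idle timeout` quoted above, and `createSession` and `isIdleTimeout` are hypothetical helper names.

```javascript
// Wraps a socket-like object so nothing can be sent before `start`.
// ASSUMPTION: the `start` payload shown here is simplified for illustration.
function createSession(socket, streamId) {
  let started = false;
  return {
    start() {
      socket.send(JSON.stringify({ event: "start", stream_id: streamId }));
      started = true;
    },
    send(message) {
      if (!started) {
        throw new Error("send `start` before any other event");
      }
      socket.send(JSON.stringify(message));
    },
  };
}

// Returns true when a close event looks like the idle timeout described above.
// ASSUMPTION: the server reports the reason text "connection idle timeout".
function isIdleTimeout(code, reason) {
  return code === 1000 && String(reason).includes("connection idle timeout");
}
```

A close handler could then check `isIdleTimeout(event.code, event.reason)` and, when it returns true, open a new socket and call `start()` on a fresh session.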
    // JavaScript example
    setInterval(() => {
      if (websocket.readyState === WebSocket.OPEN) {
        websocket.ping();
websocket.ping() isn't supported on browsers, so we should make this example specific to Node.js
      }
    }, 20000); // Send ping every 20 seconds
Add browser-specific example:

    // Browser example: send a custom keepalive event
    setInterval(() => {
      if (websocket.readyState === WebSocket.OPEN) {
        websocket.send(JSON.stringify({
          event: "custom",
          stream_id: "unique_id",
          metadata: { type: "heartbeat" }
        }));
      }
    }, 20000); // every 20s to avoid 30s idle timeout
      if (websocket.readyState === WebSocket.OPEN) {
        websocket.ping();
      }
    }, 20000); // Send ping every 20 seconds
Suggested change:

    }, 20000); // every 20s to avoid 30s idle timeout
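The Node.js and browser variants discussed in this thread could be folded into one helper. This is a sketch rather than a recommended implementation: it assumes a socket that exposes `ping()` when running under the Node.js `ws` package, and it falls back to the custom heartbeat event from the browser example otherwise. `sendKeepalive` and `startKeepalive` are hypothetical names.

```javascript
const WS_OPEN = 1; // numeric value of WebSocket.OPEN

// Sends one keepalive. Uses a ping frame when the socket supports it
// (Node.js `ws`), otherwise a custom heartbeat event (browsers).
function sendKeepalive(socket, streamId) {
  if (socket.readyState !== WS_OPEN) {
    return false; // nothing sent
  }
  if (typeof socket.ping === "function") {
    socket.ping();
  } else {
    socket.send(JSON.stringify({
      event: "custom",
      stream_id: streamId,
      metadata: { type: "heartbeat" },
    }));
  }
  return true;
}

// Starts a periodic keepalive and returns a function that stops it.
function startKeepalive(socket, streamId, intervalMs = 20000) {
  const timer = setInterval(() => sendKeepalive(socket, streamId), intervalMs);
  return () => clearInterval(timer);
}
```

Calling the returned stop function when the call ends (or when the socket closes) keeps the timer from outliving the connection.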
### Inactivity Timeout

The server automatically closes idle WebSocket connections after **30 seconds** of inactivity. Activity is defined as receiving any message from the client, including:
Suggested change:
The server closes idle WebSocket connections after **30 seconds** without client activity. Any client message counts as activity, including:
### Ping/Pong Keepalive

To prevent inactivity timeouts during periods of silence, use standard WebSocket ping frames for periodic keepalive:
Suggested change:
Send periodic WebSocket ping frames to keep the connection alive:
"periods of silence" is misleading
@sauhardjain I think the issue is that we don't want folks to keep this alive indefinitely without reason. Although we'll get paid (lol), they'll probably get pissed.
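One way a client could avoid keeping a connection alive indefinitely is to cap how long it keeps sending keepalives after the last real audio. The sketch below is only an illustration of that idea: `shouldKeepAlive` and the `maxSilenceMs` budget are hypothetical and not part of the API under review.

```javascript
// Returns true while the silence since the last audio is within the budget.
// `maxSilenceMs` is a client-side choice, not a server setting.
function shouldKeepAlive(nowMs, lastAudioMs, maxSilenceMs) {
  if (maxSilenceMs < 0) {
    throw new RangeError("maxSilenceMs must be non-negative");
  }
  return nowMs - lastAudioMs <= maxSilenceMs;
}
```

A keepalive timer could check this before each ping and stop once it returns false, letting the 30-second idle timeout close the connection.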
Overview
Some Cartesia users are interested in integrating their agents with their websites, rather than with telephony. For these folks, we're outlining a more generalized WebSocket so they can handle the events emitted by an agent on their own and pass in their own audio events.
Additions
Adding a new page under `integrations` (open to debate around placement here) for web calls.