Commit cf43f5a

feat: #1760 Add SIP support for realtime agent runner
1 parent a30c32e commit cf43f5a

8 files changed: +421 -1 lines changed
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
# Twilio SIP Realtime Example

This example shows how to handle OpenAI Realtime SIP calls with the Agents SDK. Incoming calls are accepted through the Realtime Calls API, a triage agent answers with a fixed greeting, and handoffs route the caller to specialist agents (FAQ lookup and record updates), similar to the realtime UI demo.

## Prerequisites

- Python 3.9+
- An OpenAI API key with Realtime API access
- A configured webhook secret for your OpenAI project
- A Twilio account with a phone number and Elastic SIP Trunking enabled
- A public HTTPS endpoint for local development (for example, [ngrok](https://ngrok.com/))

## Configure OpenAI

1. In [platform settings](https://platform.openai.com/settings), select your project.
2. Create a webhook pointing to `https://<your-public-host>/openai/webhook` with the `realtime.call.incoming` event type and note the signing secret. The example verifies each webhook with `OPENAI_WEBHOOK_SECRET`.

## Configure Twilio Elastic SIP Trunking

1. Create (or edit) an Elastic SIP trunk.
2. On the **Origination** tab, add an origination SIP URI of `sip:proj_<your_project_id>@sip.api.openai.com;transport=tls` so Twilio sends inbound calls to OpenAI. (The Termination SIP URI always ends with `.pstn.twilio.com`, so leave it unchanged.)
3. Add at least one phone number to the trunk so inbound calls are forwarded to OpenAI.

## Setup

1. Install dependencies:
   ```bash
   uv pip install -r examples/realtime/twilio_sip/requirements.txt
   ```
2. Export required environment variables:
   ```bash
   export OPENAI_API_KEY="sk-..."
   export OPENAI_WEBHOOK_SECRET="whsec_..."
   ```
3. (Optional) Adjust the multi-agent logic in `examples/realtime/twilio_sip/agents.py` if you want
   to change the specialist agents or tools.
4. Run the FastAPI server:
   ```bash
   uv run uvicorn examples.realtime.twilio_sip.server:app --host 0.0.0.0 --port 8000
   ```
5. Expose the server publicly (example with ngrok):
   ```bash
   ngrok http 8000
   ```

## Test a Call

1. Place a call to the Twilio number attached to the SIP trunk.
2. Twilio sends the call to `sip.api.openai.com`; OpenAI fires `realtime.call.incoming`, which this example accepts.
3. The triage agent greets the caller, then either keeps the conversation or hands off to:
   - **FAQ Agent** – answers common questions via `faq_lookup_tool`.
   - **Records Agent** – writes short notes using `update_customer_record`.
4. The background task attaches to the call and logs transcripts plus basic events in the console.

You can edit `server.py` to change instructions, add tools, or integrate with internal systems once the SIP session is active.
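
The origination URI from the Twilio configuration step follows a fixed shape, so it can be assembled from the project ID. The following is a minimal sketch, not part of the commit: the helper name `origination_sip_uri` and the sample ID `proj_abc123` are hypothetical, and accepting IDs with or without the `proj_` prefix is an assumption made here for convenience.

```python
def origination_sip_uri(project_id: str) -> str:
    """Build the origination SIP URI that Twilio forwards inbound calls to.

    Hypothetical helper: only the URI shape is taken from the README.
    """
    project_id = project_id.strip()
    if not project_id:
        raise ValueError("project_id must not be empty")
    # Assumption: accept the ID with or without its "proj_" prefix.
    if not project_id.startswith("proj_"):
        project_id = f"proj_{project_id}"
    return f"sip:{project_id}@sip.api.openai.com;transport=tls"


if __name__ == "__main__":
    print(origination_sip_uri("proj_abc123"))
```

Paste the resulting value into the trunk's Origination tab; the `transport=tls` parameter is part of the URI given in the README and should be kept.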
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
"""OpenAI Realtime SIP example package."""
Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
"""Realtime agent definitions shared by the Twilio SIP example."""

from __future__ import annotations

import asyncio

from agents import function_tool
from agents.extensions.handoff_prompt import RECOMMENDED_PROMPT_PREFIX
from agents.realtime import RealtimeAgent, realtime_handoff

# --- Tools -----------------------------------------------------------------


WELCOME_MESSAGE = "Hello, this is ABC customer service. How can I help you today?"


@function_tool(
    name_override="faq_lookup_tool", description_override="Lookup frequently asked questions."
)
async def faq_lookup_tool(question: str) -> str:
    """Fetch FAQ answers for the caller."""

    await asyncio.sleep(3)

    q = question.lower()
    if "plan" in q or "wifi" in q or "wi-fi" in q:
        return "We provide complimentary Wi-Fi. Join the ABC-Customer network."  # demo data
    if "billing" in q or "invoice" in q:
        return "Your latest invoice is available in the ABC portal under Billing > History."
    if "hours" in q or "support" in q:
        return "Human support agents are available 24/7; transfer to the specialist if needed."
    return "I'm not sure about that. Let me transfer you back to the triage agent."


@function_tool
async def update_customer_record(customer_id: str, note: str) -> str:
    """Record a short note about the caller."""

    await asyncio.sleep(1)
    return f"Recorded note for {customer_id}: {note}"


# --- Agents ----------------------------------------------------------------


faq_agent = RealtimeAgent(
    name="FAQ Agent",
    handoff_description="Handles frequently asked questions and general account inquiries.",
    instructions=f"""{RECOMMENDED_PROMPT_PREFIX}
    You are an FAQ specialist. Always rely on the faq_lookup_tool for answers and keep replies
    concise. If the caller needs hands-on help, transfer back to the triage agent.
    """,
    tools=[faq_lookup_tool],
)

records_agent = RealtimeAgent(
    name="Records Agent",
    handoff_description="Updates customer records with brief notes and confirmation numbers.",
    instructions=f"""{RECOMMENDED_PROMPT_PREFIX}
    You handle structured updates. Confirm the customer's ID, capture their request in a short
    note, and use the update_customer_record tool. For anything outside data updates, return to the
    triage agent.
    """,
    tools=[update_customer_record],
)

triage_agent = RealtimeAgent(
    name="Triage Agent",
    handoff_description="Greets callers and routes them to the most appropriate specialist.",
    instructions=(
        f"{RECOMMENDED_PROMPT_PREFIX} "
        "Always begin the call by saying exactly: '"
        f"{WELCOME_MESSAGE}' "
        "before collecting details. Once the greeting is complete, gather context and hand off to "
        "the FAQ or Records agents when appropriate."
    ),
    handoffs=[faq_agent, realtime_handoff(records_agent)],
)

faq_agent.handoffs.append(triage_agent)
records_agent.handoffs.append(triage_agent)


def get_starting_agent() -> RealtimeAgent:
    """Return the agent used to start each realtime call."""

    return triage_agent
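
The keyword checks in `faq_lookup_tool` can also be written as a rule table, which makes the demo data easier to extend. The following is a minimal sketch under stated assumptions: `FAQ_RULES`, `FALLBACK_ANSWER`, and `lookup_faq` are hypothetical names not present in the commit, the answers are copied from the demo data above, and the simulated latency and the `function_tool` decorator are left out so the snippet runs without the SDK.

```python
from __future__ import annotations

# Ordered rules: the first keyword group that matches wins, mirroring the
# if-chain in the example. All answers are demo data.
FAQ_RULES: list[tuple[tuple[str, ...], str]] = [
    (
        ("plan", "wifi", "wi-fi"),
        "We provide complimentary Wi-Fi. Join the ABC-Customer network.",
    ),
    (
        ("billing", "invoice"),
        "Your latest invoice is available in the ABC portal under Billing > History.",
    ),
    (
        ("hours", "support"),
        "Human support agents are available 24/7; transfer to the specialist if needed.",
    ),
]

FALLBACK_ANSWER = "I'm not sure about that. Let me transfer you back to the triage agent."


def lookup_faq(question: str) -> str:
    """Return the first canned answer whose keywords appear in the question."""
    q = question.lower()
    for keywords, answer in FAQ_RULES:
        if any(keyword in q for keyword in keywords):
            return answer
    return FALLBACK_ANSWER


if __name__ == "__main__":
    print(lookup_faq("Where can I find my invoice?"))
```

Because matching is by substring and rule order, a question that mentions both Wi-Fi and billing returns the Wi-Fi answer, the same precedence as the original if-chain.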
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
fastapi>=0.120.0
openai>=2.2,<3
uvicorn[standard]>=0.38.0
Lines changed: 195 additions & 0 deletions
@@ -0,0 +1,195 @@
"""Minimal FastAPI server for handling OpenAI Realtime SIP calls with Twilio."""

from __future__ import annotations

import asyncio
import logging
import os

import websockets
from fastapi import FastAPI, HTTPException, Request, Response
from openai import APIStatusError, AsyncOpenAI, InvalidWebhookSignatureError

from agents.realtime.items import (
    AssistantAudio,
    AssistantMessageItem,
    AssistantText,
    InputText,
    UserMessageItem,
)
from agents.realtime.model_inputs import RealtimeModelSendRawMessage
from agents.realtime.openai_realtime import OpenAIRealtimeSIPModel
from agents.realtime.runner import RealtimeRunner

from .agents import WELCOME_MESSAGE, get_starting_agent

logging.basicConfig(level=logging.INFO)

logger = logging.getLogger("twilio_sip_example")


def _get_env(name: str) -> str:
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value


OPENAI_API_KEY = _get_env("OPENAI_API_KEY")
OPENAI_WEBHOOK_SECRET = _get_env("OPENAI_WEBHOOK_SECRET")

client = AsyncOpenAI(api_key=OPENAI_API_KEY, webhook_secret=OPENAI_WEBHOOK_SECRET)

# Build the multi-agent graph (triage + specialist agents) from agents.py.
assistant_agent = get_starting_agent()

app = FastAPI()

# Track background tasks so repeated webhooks do not spawn duplicates.
active_call_tasks: dict[str, asyncio.Task[None]] = {}


async def accept_call(call_id: str) -> None:
    """Accept the incoming SIP call and configure the realtime session."""

    # The starting agent uses static instructions, so we can forward them directly to the accept
    # call payload. If someone swaps in a dynamic prompt, fall back to a sensible default.
    instructions_payload = (
        assistant_agent.instructions
        if isinstance(assistant_agent.instructions, str)
        else "You are a helpful triage agent for ABC customer service."
    )

    try:
        # AsyncOpenAI does not yet expose high-level helpers like client.realtime.calls.accept, so
        # we call the REST endpoint directly via client.post(). Keep this until the SDK grows an
        # async helper.
        await client.post(
            f"/realtime/calls/{call_id}/accept",
            body={
                "type": "realtime",
                "model": "gpt-realtime",
                "instructions": instructions_payload,
            },
            cast_to=dict,
        )
    except APIStatusError as exc:
        if exc.status_code == 404:
            # Webhooks are occasionally retried after the caller hangs up; treat as a no-op so
            # the webhook still returns 200.
            logger.warning(
                "Call %s no longer exists when attempting accept (404). Skipping.", call_id
            )
            return

        detail = exc.message
        if exc.response is not None:
            try:
                detail = exc.response.text
            except Exception:  # noqa: BLE001
                detail = str(exc.response)

        logger.error(
            "Failed to accept call %s: %s %s", call_id, exc.status_code, detail
        )
        raise HTTPException(status_code=500, detail="Failed to accept call") from exc

    logger.info("Accepted call %s", call_id)


async def observe_call(call_id: str) -> None:
    """Attach to the realtime session and log conversation events."""

    runner = RealtimeRunner(assistant_agent, model=OpenAIRealtimeSIPModel())

    try:
        initial_settings = {
            "turn_detection": {
                "type": "semantic_vad",
                "interrupt_response": True,
            }
        }

        async with await runner.run(
            model_config={
                "call_id": call_id,
                "initial_model_settings": initial_settings,
            }
        ) as session:
            # Trigger an initial greeting so callers hear the agent right away.
            # Issue a response.create immediately after the WebSocket attaches so the model speaks
            # before the caller says anything. Sending the raw client message keeps latency low
            # and avoids threading the greeting through history.
            await session.model.send_event(
                RealtimeModelSendRawMessage(
                    message={
                        "type": "response.create",
                        "response": {
                            "instructions": (
                                "Say exactly '"
                                f"{WELCOME_MESSAGE}"
                                "' now before continuing the conversation."
                            )
                        },
                    }
                )
            )

            async for event in session:
                if event.type == "history_added":
                    item = event.item
                    if isinstance(item, UserMessageItem):
                        for content in item.content:
                            if isinstance(content, InputText) and content.text:
                                logger.info("Caller: %s", content.text)
                    elif isinstance(item, AssistantMessageItem):
                        for content in item.content:
                            if isinstance(content, AssistantText) and content.text:
                                logger.info("Assistant (text): %s", content.text)
                            elif isinstance(content, AssistantAudio) and content.transcript:
                                logger.info("Assistant (audio transcript): %s", content.transcript)
                elif event.type == "error":
                    logger.error("Realtime session error: %s", event.error)

    except websockets.exceptions.ConnectionClosedError:
        # Callers hanging up causes the WebSocket to close without a frame; log at info level so it
        # does not surface as an error.
        logger.info("Realtime WebSocket closed for call %s", call_id)
    except Exception as exc:  # noqa: BLE001 - demo logging only
        logger.exception("Error while observing call %s", call_id, exc_info=exc)
    finally:
        logger.info("Call %s ended", call_id)
        # Only drop the registry entry if it still points at this task; a replacement task
        # created by _track_call_task may already be registered under the same call ID.
        if active_call_tasks.get(call_id) is asyncio.current_task():
            active_call_tasks.pop(call_id, None)


def _track_call_task(call_id: str) -> None:
    existing = active_call_tasks.get(call_id)
    if existing and not existing.done():
        existing.cancel()

    task = asyncio.create_task(observe_call(call_id))
    active_call_tasks[call_id] = task


@app.post("/openai/webhook")
async def openai_webhook(request: Request) -> Response:
    body = await request.body()

    try:
        event = client.webhooks.unwrap(body, request.headers)
    except InvalidWebhookSignatureError as exc:
        raise HTTPException(status_code=400, detail="Invalid webhook signature") from exc

    if event.type == "realtime.call.incoming":
        call_id = event.data.call_id
        await accept_call(call_id)
        _track_call_task(call_id)
        return Response(status_code=200)

    # Ignore other webhook event types for brevity.
    return Response(status_code=200)


@app.get("/")
async def healthcheck() -> dict[str, str]:
    return {"status": "ok"}
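
The per-call task registry used by `_track_call_task` can be exercised in isolation with plain `asyncio`. The sketch below is illustrative only: `observe`, `track`, `active_tasks`, and the call ID `call_123` are hypothetical stand-ins, and the long sleep replaces the real session loop. It also shows one way to guard the cleanup step so that a cancelled task does not remove the entry of the task that replaced it.

```python
from __future__ import annotations

import asyncio

# Registry of one background task per call ID (hypothetical stand-in).
active_tasks: dict[str, asyncio.Task[None]] = {}


async def observe(call_id: str) -> None:
    """Stand-in for the long-running session loop."""
    try:
        await asyncio.sleep(3600)
    finally:
        # Remove the entry only if it still refers to this task.
        if active_tasks.get(call_id) is asyncio.current_task():
            active_tasks.pop(call_id, None)


def track(call_id: str) -> asyncio.Task[None]:
    """Cancel any running task for this call, then register a new one."""
    existing = active_tasks.get(call_id)
    if existing is not None and not existing.done():
        existing.cancel()
    task = asyncio.create_task(observe(call_id))
    active_tasks[call_id] = task
    return task


async def demo() -> tuple[bool, bool]:
    first = track("call_123")
    await asyncio.sleep(0)  # let the first task start
    second = track("call_123")  # simulate a duplicate webhook
    await asyncio.sleep(0)  # let the cancellation be delivered
    result = (first.cancelled(), active_tasks.get("call_123") is second)
    # Clean up so the event loop can exit promptly.
    second.cancel()
    await asyncio.gather(second, return_exceptions=True)
    return result


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Without the identity check in the `finally` block, the cancelled first task would pop the registry entry that now belongs to its replacement, so a third webhook for the same call would no longer find the running task to cancel.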

src/agents/realtime/model.py

Lines changed: 7 additions & 0 deletions
@@ -139,6 +139,13 @@ class RealtimeModelConfig(TypedDict):
     is played to the user.
     """
 
+    call_id: NotRequired[str]
+    """Attach to an existing realtime call instead of creating a new session.
+
+    When provided, the transport connects using the `call_id` query string parameter rather than a
+    model name. This is used for SIP-originated calls that are accepted via the Realtime Calls API.
+    """
+
 
 class RealtimeModel(abc.ABC):
     """Interface for connecting to a realtime model and sending/receiving events."""

src/agents/realtime/openai_realtime.py

Lines changed: 17 additions & 1 deletion
@@ -216,7 +216,11 @@ async def connect(self, options: RealtimeModelConfig) -> None:
         else:
             self._tracing_config = "auto"
 
-        url = options.get("url", f"wss://api.openai.com/v1/realtime?model={self.model}")
+        call_id = options.get("call_id")
+        if call_id:
+            url = options.get("url", f"wss://api.openai.com/v1/realtime?call_id={call_id}")
+        else:
+            url = options.get("url", f"wss://api.openai.com/v1/realtime?model={self.model}")
 
         headers: dict[str, str] = {}
         if options.get("headers") is not None:
@@ -929,6 +933,18 @@ def _tools_to_session_tools(
         return converted_tools
 
 
+class OpenAIRealtimeSIPModel(OpenAIRealtimeWebSocketModel):
+    """Realtime model that attaches to SIP-originated calls using a call ID."""
+
+    async def connect(self, options: RealtimeModelConfig) -> None:  # type: ignore[override]
+        call_id = options.get("call_id")
+        if not call_id:
+            raise UserError("OpenAIRealtimeSIPModel requires `call_id` in the model configuration.")
+
+        sip_options = options.copy()
+        await super().connect(sip_options)
+
+
 class _ConversionHelper:
     @classmethod
     def conversation_item_to_realtime_message_item(
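
The URL selection added to `connect` can be summarised as a small pure function. This is a sketch for illustration rather than the SDK's code: `resolve_realtime_url` is a hypothetical name, the options are modelled as a plain `dict`, and the call ID `rtc_123` is made up.

```python
from __future__ import annotations

from typing import Any

REALTIME_ENDPOINT = "wss://api.openai.com/v1/realtime"


def resolve_realtime_url(options: dict[str, Any], default_model: str) -> str:
    """Pick the WebSocket URL in the same precedence order as the diff.

    1. An explicit "url" option is used as given.
    2. Otherwise a truthy "call_id" attaches to an existing call.
    3. Otherwise a new session is opened for the default model.
    """
    if "url" in options:
        return options["url"]
    call_id = options.get("call_id")
    if call_id:
        return f"{REALTIME_ENDPOINT}?call_id={call_id}"
    return f"{REALTIME_ENDPOINT}?model={default_model}"


if __name__ == "__main__":
    print(resolve_realtime_url({"call_id": "rtc_123"}, "gpt-realtime"))
```

Note that an explicit `url` takes precedence over `call_id` in both branches of the diff, so a caller who overrides the URL is responsible for including the `call_id` query parameter themselves.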
