Commit 9737ad1

committed
Nebius integration fixed
1 parent ebd4f2a commit 9737ad1

6 files changed: +272 -32 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -54,7 +54,7 @@ You can connect to Pipecat from any platform using our official SDKs:
5454
| Category | Services |
5555
| ------------------- ||
5656
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
57-
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova) [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
57+
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/server/services/llm/nebius), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova) [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
5858
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
5959
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) |
6060
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
@@ -238,4 +238,4 @@ We aim to review all contributions promptly and provide constructive feedback to
238238

239239
➡️ [Read the docs](https://docs.pipecat.ai)
240240

241-
➡️ [Reach us on X](https://x.com/pipecat_ai)
241+
➡️ [Reach us on X](https://x.com/pipecat_ai)
Lines changed: 210 additions & 0 deletions
@@ -0,0 +1,210 @@
1+
---
2+
title: "Nebius"
3+
description: "LLM service implementation using Nebius AI Studio's API with OpenAI-compatible interface"
4+
---
5+
6+
## Overview
7+
8+
`NebiusLLMService` provides access to Nebius AI Studio's language models through an OpenAI-compatible interface. It inherits from `OpenAILLMService` and supports streaming responses, function calling, and context management.
9+
10+
<CardGroup cols={3}>
11+
<Card
12+
title="API Reference"
13+
icon="code"
14+
href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.nebius.llm.html"
15+
>
16+
Complete API documentation and method details
17+
</Card>
18+
<Card
19+
title="Nebius Docs"
20+
icon="book"
21+
href="https://docs.nebius.ai/ai/api/v1/llm/api-reference"
22+
>
23+
Official Nebius AI Studio API documentation and features
24+
</Card>
25+
<Card
26+
title="Example Code"
27+
icon="play"
28+
href="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/14x-function-calling-nebius.py"
29+
>
30+
Working example with function calling
31+
</Card>
32+
</CardGroup>
33+
34+
## Installation
35+
36+
To use Nebius services, install the required dependency:
37+
38+
```bash
39+
pip install "pipecat-ai[nebius]"
40+
```
41+
42+
You'll also need to set up your Nebius API key as an environment variable: `NEBIUS_API_KEY`.
43+
44+
<Tip>
45+
Get your API key from [Nebius AI Studio Console](https://studio.nebius.ai/).
46+
</Tip>
47+
48+
## Frames
49+
50+
### Input
51+
52+
- `OpenAILLMContextFrame` - Conversation context and history
53+
- `LLMMessagesFrame` - Direct message list
54+
- `VisionImageRawFrame` - Images for vision processing (select models)
55+
- `LLMUpdateSettingsFrame` - Runtime parameter updates
56+
57+
### Output
58+
59+
- `LLMFullResponseStartFrame` / `LLMFullResponseEndFrame` - Response boundaries
60+
- `LLMTextFrame` - Streamed completion chunks
61+
- `FunctionCallInProgressFrame` / `FunctionCallResultFrame` - Function call lifecycle
62+
- `ErrorFrame` - API or processing errors
63+
64+
## Function Calling
65+
66+
<Card
67+
title="Function Calling Guide"
68+
icon="function"
69+
href="/learn/function-calling"
70+
>
71+
Learn how to implement function calling with standardized schemas, register
72+
handlers, manage context properly, and control execution flow in your
73+
conversational AI applications.
74+
</Card>
75+
76+
## Context Management
77+
78+
<Card
79+
title="Context Management Guide"
80+
icon="messages"
81+
href="/learn/context-management"
82+
>
83+
Learn how to manage conversation context, handle message history, and
84+
integrate context aggregators for consistent conversational experiences.
85+
</Card>
86+
87+
## Usage Example
88+
89+
```python
90+
import os
91+
from pipecat.services.nebius.llm import NebiusLLMService
92+
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
93+
from pipecat.adapters.schemas.function_schema import FunctionSchema
94+
from pipecat.adapters.schemas.tools_schema import ToolsSchema
95+
96+
# Configure Nebius service with default model
97+
llm = NebiusLLMService(
98+
api_key=os.getenv("NEBIUS_API_KEY"),
99+
model="meta-llama/Meta-Llama-3.1-8B-Instruct-fast", # Default fast model
100+
params=NebiusLLMService.InputParams(
101+
temperature=0.7,
102+
top_p=0.9,
103+
max_tokens=1000
104+
)
105+
)
106+
107+
# Set up conversation context
108+
messages = [
109+
{
110+
"role": "system",
111+
"content": "You are a helpful assistant powered by Nebius AI Studio."
112+
}
113+
]
114+
115+
# Optional: Add function calling capabilities
116+
tools = ToolsSchema([
117+
FunctionSchema(
118+
name="get_current_weather",
119+
description="Get the current weather in a given location",
120+
properties={
121+
"location": {
122+
"type": "string",
123+
"description": "The city and state, e.g. San Francisco, CA"
124+
},
125+
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
126+
},
127+
required=["location"]
128+
)
129+
])
130+
131+
context = OpenAILLMContext(messages, tools)
132+
from pipecat.processors.aggregators.llm_response import LLMUserAggregatorParams
133+
134+
context_aggregator = llm.create_context_aggregator(
135+
context,
136+
user_params=LLMUserAggregatorParams(aggregation_timeout=0.1)
137+
)
138+
139+
# Register function handler
140+
async def fetch_weather(params):
141+
location = params.arguments["location"]
142+
await params.result_callback({"conditions": "sunny", "temperature": "22°C"})
143+
144+
llm.register_function("get_current_weather", fetch_weather)
145+
146+
# Optional: Add function call feedback for better UX
147+
@llm.event_handler("on_function_calls_started")
148+
async def on_function_calls_started(service, function_calls):
149+
await tts.queue_frame(TTSSpeakFrame("Let me check that for you."))
150+
151+
# Use in pipeline
152+
pipeline = Pipeline([
153+
transport.input(),
154+
stt, # Your preferred STT service
155+
context_aggregator.user(),
156+
llm,
157+
tts, # Your preferred TTS service
158+
transport.output(),
159+
context_aggregator.assistant()
160+
])
161+
```
162+
163+
## Available Models
164+
165+
Nebius AI Studio provides access to various state-of-the-art models:
166+
167+
- `meta-llama/Meta-Llama-3.1-8B-Instruct-fast` - Default fast model (recommended)
168+
- `meta-llama/Meta-Llama-3.1-70B-Instruct` - Larger model for complex tasks
169+
- `meta-llama/Meta-Llama-3.1-405B-Instruct` - Most capable model
170+
171+
<Tip>
172+
Check the [Nebius AI Studio Console](https://studio.nebius.ai/) for the latest available models and pricing.
173+
</Tip>
174+
175+
## Configuration
176+
177+
```python
178+
# Custom configuration example
179+
llm = NebiusLLMService(
180+
api_key=os.getenv("NEBIUS_API_KEY"),
181+
base_url="https://api.studio.nebius.ai/v1", # Default base URL
182+
model="meta-llama/Meta-Llama-3.1-70B-Instruct",
183+
params=NebiusLLMService.InputParams(
184+
temperature=0.3, # Lower temperature for more focused responses
185+
top_p=0.8,
186+
max_tokens=2048,
187+
frequency_penalty=0.1
188+
)
189+
)
190+
```
191+
192+
## Metrics
193+
194+
Inherits all OpenAI metrics capabilities:
195+
196+
- **Time to First Byte (TTFB)** - Response latency measurements
197+
- **Processing Duration** - Model processing times
198+
- **Token Usage** - Prompt tokens, completion tokens, and totals
199+
200+
<Info>
201+
[Learn how to enable Metrics](/guides/fundamentals/metrics) in your Pipeline.
202+
</Info>
203+
204+
## Additional Notes
205+
206+
- **OpenAI Compatibility**: Full compatibility with OpenAI API features and parameters
207+
- **High Performance**: Optimized for low-latency conversational AI applications
208+
- **Enterprise Ready**: Built on Nebius cloud infrastructure for reliability and scale
209+
- **Cost Effective**: Competitive pricing for high-quality language models
210+
- **Multi-language Support**: Models support multiple languages and regions
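
The docs page above describes `NebiusLLMService` as a layer over an OpenAI-compatible API. As a rough illustration of what that means on the wire, here is a minimal sketch that only builds the request, assuming Nebius AI Studio accepts the standard OpenAI chat-completions layout at `<base_url>/chat/completions`. The path, the `stream` flag, and the helper name `build_chat_request` are assumptions for illustration, not taken from this commit; the base URL and model name are the ones shown in the docs page.

```python
import json
import os

# Values copied from the docs page added in this commit.
BASE_URL = "https://api.studio.nebius.ai/v1"
DEFAULT_MODEL = "meta-llama/Meta-Llama-3.1-8B-Instruct-fast"


def build_chat_request(messages, model=DEFAULT_MODEL, **params):
    """Hypothetical helper: assemble URL, headers, and JSON body.

    Assumes the standard OpenAI chat-completions shape. No network call is made.
    """
    body = {"model": model, "messages": messages, "stream": True}
    # Forward only the sampling parameters that were explicitly set.
    body.update({key: value for key, value in params.items() if value is not None})
    headers = {
        "Authorization": f"Bearer {os.getenv('NEBIUS_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    return f"{BASE_URL}/chat/completions", headers, json.dumps(body)


url, headers, payload = build_chat_request(
    [{"role": "user", "content": "Hello"}],
    temperature=0.7,
    max_tokens=1000,
)
```

In a Pipecat pipeline the service class does this work itself. The sketch is only meant to show why subclassing `OpenAILLMService` with a different `base_url` can be enough for an OpenAI-compatible provider.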

env.example

Lines changed: 3 additions & 0 deletions
@@ -129,6 +129,9 @@ TWILIO_AUTH_TOKEN=...
129129
MINIMAX_API_KEY=...
130130
MINIMAX_GROUP_ID=...
131131

132+
# Nebius AI Studio
133+
NEBIUS_API_KEY=...
134+
132135
# Sarvam AI
133136
SARVAM_API_KEY=...
134137
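
Since `env.example` ships `NEBIUS_API_KEY=...` as a placeholder, an application may want to fail early when the key was never filled in. A minimal sketch, assuming the literal `...` placeholder should count as unset; `require_env` is a hypothetical helper, not part of Pipecat:

```python
import os


def require_env(name, environ=None):
    """Hypothetical helper: return a non-empty variable or raise early."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    # "..." is the placeholder used in env.example, so treat it as unset.
    if not value or value == "...":
        raise RuntimeError(f"{name} is not set")
    return value


# Demonstration with an explicit mapping instead of the real environment.
api_key = require_env("NEBIUS_API_KEY", {"NEBIUS_API_KEY": "demo-key"})
```

Calling `require_env("NEBIUS_API_KEY")` without the second argument reads from the process environment instead.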
