Commit 1b484d5

Rebase latest main
commit ec32383
Author: Neevash Ramdial (Nash) <mail@neevash.dev>
Date: Mon Oct 27 15:51:53 2025 -0600

    mypy clean up (GetStream#130)

commit c52fe4c
Author: Neevash Ramdial (Nash) <mail@neevash.dev>
Date: Mon Oct 27 15:28:00 2025 -0600

    remove turn keeping from example (GetStream#129)

commit e1072e8
Merge: 5bcffa3 fea101a
Author: Yarik <43354956+yarikdevcom@users.noreply.github.com>
Date: Mon Oct 27 14:28:05 2025 +0100

    Merge pull request GetStream#106 from tjirab/feat/20251017_gh-labeler

    feat: Github pull request labeler

commit 5bcffa3
Merge: 406673c bfe888f
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Sat Oct 25 10:56:27 2025 -0600

    Merge pull request GetStream#119 from GetStream/fix-screensharing

    Fix screensharing

commit bfe888f
Merge: 8019c14 406673c
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Sat Oct 25 10:56:15 2025 -0600

    Merge branch 'main' into fix-screensharing

commit 406673c
Author: Stefan Blos <stefan.blos@gmail.com>
Date: Sat Oct 25 03:03:10 2025 +0200

    Update README (GetStream#118)

    * Changed README to LaRaes version
    * Remove arrows from table
    * Add table with people & projects to follow
    * Update images and links in README.md

commit 3316908
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date: Fri Oct 24 23:48:06 2025 +0200

    Simplify TTS plugin and audio utils (GetStream#123)

    - Simplified TTS plugin
    - AWS Polly TTS plugin
    - OpenAI TTS plugin
    - Improved audio utils

commit 8019c14
Author: Max Kahan <max.kahan@getstream.io>
Date: Fri Oct 24 17:32:26 2025 +0100

    remove video forwarder lazy init

commit ca62d37
Author: Max Kahan <max.kahan@getstream.io>
Date: Thu Oct 23 16:44:03 2025 +0100

    use correct codec

commit 8cf8788
Author: Max Kahan <max.kahan@getstream.io>
Date: Thu Oct 23 14:27:18 2025 +0100

    rename variable to fix convention

commit 33fd70d
Author: Max Kahan <max.kahan@getstream.io>
Date: Thu Oct 23 14:24:42 2025 +0100

    unsubscribe from events

commit 3692131
Author: Max Kahan <max.kahan@getstream.io>
Date: Thu Oct 23 14:19:53 2025 +0100

    remove nonexistent type

commit c5f68fe
Author: Max Kahan <max.kahan@getstream.io>
Date: Thu Oct 23 14:10:07 2025 +0100

    cleanup tests to fit style

commit 8b3c61a
Author: Max Kahan <max.kahan@getstream.io>
Date: Thu Oct 23 13:55:08 2025 +0100

    clean up resources when track cancelled

commit d8e08cb
Author: Max Kahan <max.kahan@getstream.io>
Date: Thu Oct 23 13:24:55 2025 +0100

    fix track republishing in agent

commit 0f8e116
Author: Max Kahan <max.kahan@getstream.io>
Date: Wed Oct 22 15:37:11 2025 +0100

    add tests

commit 08e6133
Author: Max Kahan <max.kahan@getstream.io>
Date: Wed Oct 22 15:25:37 2025 +0100

    ensure video track dimensions are an even number

commit 6a725b0
Merge: 5f001e0 5088709
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 15:23:58 2025 -0600

    Merge pull request GetStream#122 from GetStream/cleanup_stt

    Cleanup STT

commit 5088709
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 15:23:34 2025 -0600

    cleanup of stt

commit f185120
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 15:08:42 2025 -0600

    more cleanup

commit 05ccbfd
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 14:51:48 2025 -0600

    cleanup

commit bb834ca
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 14:28:53 2025 -0600

    more cleanup for stt

commit 7a3f2d2
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 14:11:35 2025 -0600

    more test cleanup

commit ad7f4fe
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 14:10:57 2025 -0600

    cleanup test

commit 9e50cdd
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 14:03:45 2025 -0600

    large cleanup

commit 5f001e0
Merge: 95a03e4 5d204f3
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 12:01:52 2025 -0600

    Merge pull request GetStream#121 from GetStream/fish_stt

    [AI-201] Fish speech to text (partial)

commit 5d204f3
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 11:48:16 2025 -0600

    remove ugly tests

commit ee9a241
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 11:46:19 2025 -0600

    cleanup

commit 6eb8270
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 11:23:00 2025 -0600

    fix 48khz support

commit 3b90548
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 23 10:59:08 2025 -0600

    first attempt at fish stt, doesnt entirely work just yet

commit 95a03e4
Merge: b90c9e3 b4c0da8
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date: Thu Oct 23 10:11:39 2025 +0200

    Merge branch 'main' of github.com:GetStream/Vision-Agents

commit b90c9e3
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date: Wed Oct 22 23:28:28 2025 +0200

    remove print and double event handling

commit b4c0da8
Merge: 3d06446 a426bc2
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Wed Oct 22 15:08:51 2025 -0600

    Merge pull request GetStream#117 from GetStream/openrouter

    [AI-194] Openrouter

commit a426bc2
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Wed Oct 22 15:03:10 2025 -0600

    skip broken test

commit ba6c027
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Wed Oct 22 14:50:23 2025 -0600

    almost working openrouter

commit 0b1c873
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Wed Oct 22 14:47:12 2025 -0600

    almost working, just no instruction following

commit ce63233
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Wed Oct 22 14:35:53 2025 -0600

    working memory for openai

commit 149e886
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Wed Oct 22 13:32:43 2025 -0600

    todo

commit e0df1f6
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Wed Oct 22 13:20:38 2025 -0600

    first pass at adding openrouter

commit 3d06446
Merge: 4eb8ef4 ef55d66
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Wed Oct 22 13:20:11 2025 -0600

    Merge branch 'main' of github.com:GetStream/Vision-Agents

commit 4eb8ef4
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Wed Oct 22 13:20:01 2025 -0600

    cleanup ai plugin instructions

commit ef55d66
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Wed Oct 22 12:54:33 2025 -0600

    Add link to stash_pomichter for spatial memory

commit 9c9737f
Merge: c954409 390c45b
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 19:45:09 2025 -0600

    Merge pull request GetStream#115 from GetStream/fish

    [AI-195] Fish support

commit 390c45b
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 19:44:37 2025 -0600

    cleannup

commit 1cc1cf1
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 19:42:03 2025 -0600

    happy tests

commit 8163d32
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 19:39:21 2025 -0600

    fix gemini rule following

commit ada3ac9
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 19:20:18 2025 -0600

    fish tts

commit 61a26cf
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 16:44:03 2025 -0600

    attempt at fish

commit c954409
Merge: ab27e48 c71da10
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 14:18:15 2025 -0600

    Merge pull request GetStream#104 from GetStream/bedrock

    [AI-192] - Bedrock, AWS & Nova

commit c71da10
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date: Tue Oct 21 22:00:25 2025 +0200

    maybe

commit b5482da
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date: Tue Oct 21 21:46:15 2025 +0200

    debugging

commit 9a36e45
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 13:14:58 2025 -0600

    echo environment name

commit 6893968
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 12:53:58 2025 -0600

    more debugging

commit c35fc47
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 12:45:44 2025 -0600

    add some debug info

commit 0d6d3fd
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 12:03:13 2025 -0600

    run test fix

commit c3a31bd
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 11:52:25 2025 -0600

    log cache hit

commit 04554ae
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 11:48:03 2025 -0600

    fix glob

commit 7da96db
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 11:33:56 2025 -0600

    mypy

commit 186053f
Merge: 4b540c9 ab27e48
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 11:17:17 2025 -0600

    happy tests

commit 4b540c9
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 10:20:04 2025 -0600

    happy tests

commit b05a60a
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 09:17:45 2025 -0600

    add readme

commit 71affcc
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Tue Oct 21 09:13:01 2025 -0600

    rename to aws

commit d2eeba7
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Mon Oct 20 21:32:01 2025 -0600

    ai tts instructions

commit 98a4f9d
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Mon Oct 20 16:49:00 2025 -0600

    small edits

commit ab27e48
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date: Mon Oct 20 21:42:04 2025 +0200

    Ensure user agent is initialized before joining the call (GetStream#113)

    * ensure user agent is initialized before joining the call
    * wip

commit 3cb339b
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date: Mon Oct 20 21:22:57 2025 +0200

    New conversation API (GetStream#102)

    * trying to resurrect
    * test transcription events for openai
    * more tests for openai and gemini llm
    * more tests for openai and gemini llm
    * update py-client
    * wip
    * ruff
    * wip
    * ruff
    * snap
    * another way
    * another way, a better way
    * ruff
    * ruff
    * rev
    * ruffit
    * mypy everything
    * brief
    * tests
    * openai dep bump
    * snap - broken
    * nothingfuckingworks
    * message id
    * fix test
    * ruffit

commit cb6f00a
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Mon Oct 20 13:18:03 2025 -0600

    use qwen

commit f84b2ad
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Mon Oct 20 13:02:24 2025 -0600

    fix tests

commit e61acca
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Mon Oct 20 12:50:40 2025 -0600

    testing and linting

commit 5f4d353
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Mon Oct 20 12:34:14 2025 -0600

    working

commit c2a15a9
Merge: a310771 1025a42
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Mon Oct 20 11:40:00 2025 -0600

    Merge branch 'main' of github.com:GetStream/Vision-Agents into bedrock

commit a310771
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Mon Oct 20 11:39:48 2025 -0600

    wip

commit b4370f4
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Mon Oct 20 11:22:43 2025 -0600

    something isn't quite working

commit 2dac975
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Mon Oct 20 10:30:04 2025 -0600

    add the examples

commit 6885289
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Sun Oct 19 20:19:42 2025 -0600

    ai realtime docs

commit a0fa3cc
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Sun Oct 19 18:48:06 2025 -0600

    wip

commit b914fc3
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Sun Oct 19 18:40:22 2025 -0600

    fix ai llm

commit b5b00a7
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Sun Oct 19 17:11:26 2025 -0600

    work audio input

commit ac72260
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Sun Oct 19 16:47:19 2025 -0600

    fix model id

commit 2b5863c
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Sun Oct 19 16:32:54 2025 -0600

    wip on bedrock

commit 8bb4162
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Fri Oct 17 15:22:03 2025 -0600

    next up the connect method

commit 7a21e4e
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Fri Oct 17 14:12:00 2025 -0600

    nova progress

commit 16e8ba0
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Fri Oct 17 13:16:00 2025 -0600

    docs for bedrock nova

commit 1025a42
Author: Bart Schuijt <schuijt.bart@gmail.com>
Date: Fri Oct 17 21:05:45 2025 +0200

    fix: Update .env.example for Gemini Live (GetStream#108)

commit e12112d
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Fri Oct 17 11:49:07 2025 -0600

    wip

commit fea101a
Author: Bart Schuijt <schuijt.bart@gmail.com>
Date: Fri Oct 17 09:25:55 2025 +0200

    workflow file update

commit bb2d74c
Author: Bart Schuijt <schuijt.bart@gmail.com>
Date: Fri Oct 17 09:22:33 2025 +0200

    initial commit

commit d2853cd
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 16 19:44:59 2025 -0600

    always remember pep 420

commit 30a8eca
Author: Thierry Schellenbach <thierry@getstream.io>
Date: Thu Oct 16 19:36:58 2025 -0600

    start of bedrock branch

commit fc032bf
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date: Thu Oct 16 09:17:42 2025 +0200

    Remove cli handler from examples (GetStream#101)

commit 39a821d
Author: Dan Gusev <dangusev92@gmail.com>
Date: Tue Oct 14 12:20:41 2025 +0200

    Update Deepgram plugin to use SDK v5.0.0 (GetStream#98)

    * Update Deepgram plugin to use SDK v5.0.0
    * Merge test_realtime and test_stt and update the remaining tests
    * Make deepgram.STT.start() idempotent
    * Clean up unused import
    * Use uv as the default package manager > pip

    ---------

    Co-authored-by: Neevash Ramdial (Nash) <mail@neevash.dev>

commit 2013be5
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date: Mon Oct 13 16:57:37 2025 +0200

    ensure chat works with default types (GetStream#99)
1 parent efaa529 commit 1b484d5

File tree

154 files changed

+18485
-11222
lines changed

.cursor/rules/python.mdc

Lines changed: 9 additions & 1 deletion
@@ -14,4 +14,12 @@ docstrings should follow the google style guides for docstrings.
 integration tests use
 @pytest.mark.integration
 
-@pytest.mark.asyncio is not needed (its automatic)
+@pytest.mark.asyncio is not needed (its automatic)
+
+---
+description: when running python or tests
+globs:
+alwaysApply: true
+---
+
+we use uv on this project, no python -m non-sense please. If you get in trouble with deps just stop and ask, better to have the human resolve things (no sudo brew kind of stuff please)

.github/actions/python-uv-setup/action.yml

Lines changed: 10 additions & 2 deletions
@@ -17,12 +17,20 @@ runs:
   using: composite
   steps:
     - name: Install the latest version of uv
-      uses: astral-sh/setup-uv@v6
+      uses: astral-sh/setup-uv@v7
       with:
         python-version: "3.13"
         version: "latest"
         enable-cache: true
-        cache-dependency-glob: "uv.lock"
+        cache-dependency-glob: |
+          **/pyproject.toml
+          **/uv.lock
+    - name: Log cache status
+      shell: bash
+      env:
+        UV_CACHE_HIT: "${{ steps.setup_uv.outputs['cache-hit'] }}"
+      run: |
+        echo "setup-uv cache hit: ${UV_CACHE_HIT}"
 
     - name: Install the project
       shell: bash

.github/labeler.yml

Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
+---
+# Core Framework Components
+agents-core:
+  - agents-core/**
+  - '!agents-core/**/tests/**'
+  - '!agents-core/**/__pycache__/**'
+
+# Plugin System
+plugins:
+  - plugins/**
+  - '!plugins/**/tests/**'
+  - '!plugins/**/__pycache__/**'
+
+# Specific Plugin Labels
+plugin-getstream:
+  - plugins/getstream/**
+  - '!plugins/getstream/**/tests/**'
+
+plugin-openai:
+  - plugins/openai/**
+  - '!plugins/openai/**/tests/**'
+
+plugin-gemini:
+  - plugins/gemini/**
+  - '!plugins/gemini/**/tests/**'
+
+plugin-deepgram:
+  - plugins/deepgram/**
+  - '!plugins/deepgram/**/tests/**'
+
+plugin-ultralytics:
+  - plugins/ultralytics/**
+  - '!plugins/ultralytics/**/tests/**'
+
+plugin-elevenlabs:
+  - plugins/elevenlabs/**
+  - '!plugins/elevenlabs/**/tests/**'
+
+plugin-cartesia:
+  - plugins/cartesia/**
+  - '!plugins/cartesia/**/tests/**'
+
+plugin-kokoro:
+  - plugins/kokoro/**
+  - '!plugins/kokoro/**/tests/**'
+
+plugin-moonshine:
+  - plugins/moonshine/**
+  - '!plugins/moonshine/**/tests/**'
+
+plugin-silero:
+  - plugins/silero/**
+  - '!plugins/silero/**/tests/**'
+
+plugin-smart-turn:
+  - plugins/smart_turn/**
+  - '!plugins/smart_turn/**/tests/**'
+
+plugin-wizper:
+  - plugins/wizper/**
+  - '!plugins/wizper/**/tests/**'
+
+plugin-xai:
+  - plugins/xai/**
+  - '!plugins/xai/**/tests/**'
+
+plugin-krisp:
+  - plugins/krisp/**
+  - '!plugins/krisp/**/tests/**'
+
+plugin-anthropic:
+  - plugins/anthropic/**
+  - '!plugins/anthropic/**/tests/**'
+
+# Examples and Demos
+examples:
+  - examples/**
+  - '!examples/**/tests/**'
+  - '!examples/**/__pycache__/**'
+
+# Testing
+tests:
+  - tests/**
+  - '**/tests/**'
+  - '**/test_*.py'
+  - '**/*_test.py'
+
+# Documentation
+docs:
+  - docs/**
+  - '*.md'
+  - '!README.md'
+
+# Configuration and Build
+config:
+  - '*.toml'
+  - '*.yml'
+  - '*.yaml'
+  - '*.json'
+  - '*.ini'
+  - '*.cfg'
+  - 'pyproject.toml'
+  - 'pytest.ini'
+  - 'conftest.py'
+
+# CI/CD and GitHub
+ci:
+  - '.github/**'
+  - '*.yml'
+  - '*.yaml'
+
+# Core Agent System
+core-agents:
+  - agents-core/vision_agents/core/agents/**
+  - agents-core/vision_agents/core/events/**
+  - agents-core/vision_agents/core/edge/**
+
+# Core Infrastructure
+core-infrastructure:
+  - agents-core/vision_agents/core/llm/**
+  - agents-core/vision_agents/core/stt/**
+  - agents-core/vision_agents/core/tts/**
+  - agents-core/vision_agents/core/vad/**
+  - agents-core/vision_agents/core/turn_detection/**
+  - agents-core/vision_agents/core/processors/**
+  - agents-core/vision_agents/core/mcp/**
+  - agents-core/vision_agents/core/observability/**
+  - agents-core/vision_agents/core/utils/**
+
+# CLI and Development Tools
+cli:
+  - agents-core/vision_agents/core/cli.py
+  - dev.py
+  - DEVELOPMENT.md
+
+# Dependencies
+dependencies:
+  - 'uv.lock'
+  - 'requirements*.txt'
+  - 'poetry.lock'
+  - 'Pipfile.lock'
+
+# Assets and Resources
+assets:
+  - assets/**
+  - '*.png'
+  - '*.jpg'
+  - '*.jpeg'
+  - '*.gif'
+  - '*.mp4'
+  - '*.wav'
+  - '*.mp3'
+
+# License and Legal
+legal:
+  - LICENSE
+  - LICENSE.*
+  - '*.license'
+
+# README and Project Info
+project-info:
+  - README.md
+  - CHANGELOG.md
+  - CONTRIBUTING.md
+  - SECURITY.md
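
The rules in `.github/labeler.yml` pair include globs with `!`-prefixed exclude globs. As a rough mental model of the apparent intent (label a file when it matches an include and no exclude), here is a minimal Python sketch. It is an assumption-laden illustration: `glob_to_regex` and `labels_for` are hypothetical helpers, the rule table is a small subset copied from the diff, and none of this reproduces how `actions/labeler` itself parses or evaluates its config, which depends on the action's major version and should be checked against its documentation.

```python
import re

# Small subset of the rules from the diff; "!" marks an exclude pattern.
RULES = {
    "plugins": ["plugins/**", "!plugins/**/tests/**"],
    "plugin-openai": ["plugins/openai/**", "!plugins/openai/**/tests/**"],
    "tests": ["tests/**", "**/tests/**", "**/test_*.py"],
}


def glob_to_regex(pattern):
    """Translate a simplified glob into a compiled regex.

    Assumed semantics (hypothetical, for illustration only):
    "**/" matches zero or more directories, a bare "**" matches anything,
    and "*" matches within a single path segment.
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def labels_for(path, rules=RULES):
    """Return the labels whose includes match `path` and whose excludes do not."""
    labels = []
    for label, patterns in rules.items():
        includes = [p for p in patterns if not p.startswith("!")]
        excludes = [p[1:] for p in patterns if p.startswith("!")]
        included = any(glob_to_regex(p).match(path) for p in includes)
        excluded = any(glob_to_regex(p).match(path) for p in excludes)
        if included and not excluded:
            labels.append(label)
    return labels


print(labels_for("plugins/openai/llm.py"))
print(labels_for("plugins/openai/tests/test_llm.py"))
```

Under these assumed semantics a plugin source file picks up both the generic and the plugin-specific label, while a file under a plugin's `tests/` directory is labelled only `tests`.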

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ jobs:
     uses: ./.github/workflows/run_tests.yml
     with:
       marker: 'not integration'
-
+    secrets: inherit
 
 # Cancel in-flight runs for the same branch/PR when new commits arrive
 concurrency:

.github/workflows/labeler.yml

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+name: "Pull Request Labeler"
+on:
+  - pull_request_target
+
+jobs:
+  labeler:
+    permissions:
+      contents: read
+      pull-requests: write
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/labeler@v5

.github/workflows/run_tests.yml

Lines changed: 16 additions & 0 deletions
@@ -24,6 +24,9 @@ jobs:
         run: uv run mypy --install-types --non-interactive agents-core/vision_agents
       - name: Mypy plugins
         run: uv run mypy --install-types --non-interactive --exclude 'plugins/.*/tests/.*' plugins
+      - name: Show environment name
+        run: |
+          echo "Environment: ${{ job.environment }}"
 
   test:
     name: Test "${{ inputs.marker }}"
@@ -40,11 +43,24 @@ jobs:
       STREAM_API_KEY: ${{ secrets.STREAM_API_KEY }}
       STREAM_API_SECRET: ${{ secrets.STREAM_API_SECRET }}
       XAI_API_KEY: ${{ secrets.XAI_API_KEY }}
+      AWS_BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
+      _BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
     timeout-minutes: 30
     steps:
       - name: Checkout
         uses: actions/checkout@v5
       - uses: ./.github/actions/python-uv-setup
+      - name: Export AWS_BEARER_TOKEN_BEDROCK (heredoc)
+        shell: bash
+        run: |
+          {
+            echo 'AWS_BEARER_TOKEN_BEDROCK<<EOF'
+            echo "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
+            echo 'EOF'
+          } >> "$GITHUB_ENV"
+
+      - name: Verify presence
+        run: uv run python -c "import os; v=os.getenv('AWS_BEARER_TOKEN_BEDROCK'); print('exists', v is not None, 'len', 0 if v is None else len(v))"
       - name: Run test
         run: uv run pytest -n auto -m "${{ inputs.marker }}"
       - name: Run plugin test
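
The "Export AWS_BEARER_TOKEN_BEDROCK (heredoc)" step relies on the multi-line syntax that GitHub Actions accepts in the file named by `$GITHUB_ENV`: a `NAME<<DELIMITER` line, the value, then the delimiter alone on its own line. The sketch below builds that text in Python to make the format explicit. `format_env_entry` is a hypothetical helper and not part of the workflow; the random default delimiter is an assumption added here because a fixed delimiter such as `EOF` would end the value early if the value happened to contain it.

```python
import secrets


def format_env_entry(name, value, delimiter=None):
    """Return the text to append to the $GITHUB_ENV file for `name`.

    Hypothetical helper for illustration. Uses the heredoc-style
    multi-line form: NAME<<DELIM, the value, then DELIM on its own line.
    """
    if delimiter is None:
        # Random delimiter so it is very unlikely to collide with the value.
        delimiter = "delim_" + secrets.token_hex(8)
    if delimiter in value:
        raise ValueError("delimiter must not appear in the value")
    return f"{name}<<{delimiter}\n{value}\n{delimiter}\n"


# Same shape as the three `echo` lines in the workflow step.
print(format_env_entry("AWS_BEARER_TOKEN_BEDROCK", "line-1\nline-2", delimiter="EOF"), end="")
```

In a real job the returned text would be appended to the path stored in the `GITHUB_ENV` environment variable, and the variable becomes visible to later steps, not to the step that wrote it.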

DEVELOPMENT.md

Lines changed: 67 additions & 0 deletions
@@ -109,6 +109,73 @@ To see how the agent work open up agents.py
 * The LLM uses the VideoForwarder to write the video to a websocket or webrtc connection
 * The STS writes the reply on agent.llm.audio_track and the RealtimeTranscriptEvent / RealtimePartialTranscriptEvent
 
+## Audio management
+
+Some important things about audio inside the library:
+
+1. WebRTC uses Opus 48khz stereo but inside the library audio is always in PCM format
+2. Plugins / AI models work with different PCM formats, usually 16khz mono
+3. PCM data is always passed around using the `PcmData` object which contains information about sample rate, channels and format
+4. Text-to-speech plugins automatically return PCM in the format needed by WebRTC. This is exposed via the `set_output_format` method
+5. Audio resampling can be done using `PcmData.resample` method
+6. When resampling audio in chunks, it is important to re-use the same `av.AudioResampler` resampler (see `PcmData.resample` and `core.tts.TTS`)
+7. Adjusting from stereo to mono and vice-versa can be done using the `PcmData.resample` method
+
+Some ground rules:
+
+1. Do not build code to resample / adjust audio unless it is not covered already by `PcmData`
+2. Do not pass PCM as plain bytes around and write code that assumes specific sample rate or format. Use `PcmData` instead
+
+## Example
+
+```python
+import asyncio
+from vision_agents.core.edge.types import PcmData
+from openai import AsyncOpenAI
+
+async def example():
+    client = AsyncOpenAI(api_key="sk-42")
+
+    resp = await client.audio.speech.create(
+        model="gpt-4o-mini-tts",
+        voice="alloy",
+        input="pcm is cool, give me some of that please",
+        response_format="pcm",
+    )
+
+    # load response into PcmData, note that you need to specify sample_rate, channels and format
+    pcm_data = PcmData.from_bytes(
+        resp.content, sample_rate=24_000, channels=1, format="s16"
+    )
+
+    # check if pcm_data is stereo (it's not in this case ofc)
+    print(pcm_data.stereo)
+
+    # write the pcm to file
+    with open("test.wav", "wb") as f:
+        f.write(pcm_data.to_wav_bytes())
+
+    # resample pcm to be 48khz stereo
+    resampled_pcm = pcm_data.resample(48_000, 2)
+
+    # play-out pcm using ffplay
+    from vision_agents.core.edge.types import play_pcm_with_ffplay
+
+    await play_pcm_with_ffplay(resampled_pcm)
+
+if __name__ == "__main__":
+    asyncio.run(example())
+```
+
+
+### Testing audio manually
+
+Sometimes you need to test audio manually, here's some tips:
+
+1. Do not use earplugs when testing PCM playback ;)
+2. You can use the `PcmData.to_wav_bytes` method to convert PCM into wav bytes (see `manual_tts_to_wav` for an example)
+3. If you have `ffplay` installed, you can playback pcm directly to check if audio is correct
+
 ## Dev / Contributor Guidelines
 
 ### Light wrapping
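
The audio notes added to DEVELOPMENT.md in this commit ask contributors not to pass PCM around as plain bytes. The standard-library sketch below illustrates why: the same byte string describes different audio depending on the sample rate and channel count attached to it. `PcmChunk` is a hypothetical stand-in written for this illustration only; it is not the library's `PcmData`, it assumes signed 16-bit interleaved samples, and its `to_wav_bytes` merely mirrors the name used in the notes.

```python
import io
import wave
from dataclasses import dataclass


@dataclass(frozen=True)
class PcmChunk:
    """Hypothetical container for illustration; NOT vision_agents' PcmData."""

    data: bytes            # interleaved signed 16-bit little-endian samples (assumed)
    sample_rate: int       # frames per second
    channels: int          # 1 = mono, 2 = stereo
    sample_width: int = 2  # bytes per sample (s16)

    @property
    def frames(self) -> int:
        # One frame holds one sample per channel.
        return len(self.data) // (self.sample_width * self.channels)

    @property
    def duration_s(self) -> float:
        return self.frames / self.sample_rate

    def to_wav_bytes(self) -> bytes:
        # Wrap the raw PCM in a WAV container so ordinary players can open it.
        buf = io.BytesIO()
        with wave.open(buf, "wb") as w:
            w.setnchannels(self.channels)
            w.setsampwidth(self.sample_width)
            w.setframerate(self.sample_rate)
            w.writeframes(self.data)
        return buf.getvalue()


raw = b"\x00\x00" * 24_000  # 48,000 bytes of silence

as_24k_mono = PcmChunk(raw, sample_rate=24_000, channels=1)
as_48k_stereo = PcmChunk(raw, sample_rate=48_000, channels=2)

# Identical bytes, different audio: the metadata decides the interpretation.
print(as_24k_mono.duration_s, as_48k_stereo.duration_s)
```

Writing `as_24k_mono.to_wav_bytes()` to a `.wav` file gives something any audio player can open, which is the same manual check the notes describe for `PcmData.to_wav_bytes`.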
