- Bug 1: Duplicate Soniox streams for same base language (en-US vs en-US?hints=ja), 2x cost
- Bug 2: Stream killed during sub update, app not re-routed to surviving stream (captions freeze)
- Bug 3: Reconnect grace window + rapid syncManagers() = stream churn during boot storm
- Bug 4: getTranscriptionSubscriptions() doesn't deduplicate by base language
- Bug 5: getTargetSubscriptions() is dead code (passthrough), relay works by accident

Observed on dev server Feb 14 2026: com.mentra.captions.debug went silent after a subscription update killed its stream; a surviving stream existed but the app wasn't re-routed to it.
- Added complete 7-phase data flow trace: SDK handler registration → cloud subscription processing → Soniox stream creation → audio flow → token production → cloud relay → SDK dispatch
- Refined to 7 confirmed bugs + 1 uncertain:
  - Bug 1 (HIGH): Duplicate Soniox streams for same base language
  - Bug 2 (HIGH): SDK streamType mismatch - silent data loss
  - Bug 3 (HIGH): Stream killed, ensureStreamsExist is stream-centric not app-centric
  - Bug 4 (MED): Boot storm subscription churn
  - Bug 5 (LOW): getTargetSubscriptions dead code
  - Bug 6 (LOW): Cloud relays to apps that can't receive (wasted network)
  - Bug 7 (LOW): onClosed/stopStream race on activeSubscriptions
  - Bug 8 (UNCERTAIN): VAD + subscription update interaction
- Added bug interaction diagram showing Bugs 1+2 MUST be fixed together
- Added TranslationManager comparison (same structural bugs)
- Added VAD interaction analysis
- Documented the commented-out SDK reconstruction code and why it's insufficient
6 changes across cloud and SDK:
1. Normalize stream identity to base language (one Soniox stream per lang)
2. DataStream.streamType uses base language form (automatic via closure)
3. SDK findMatchingStream() for language-aware handler dispatch
4. getTranscriptionSubscriptions() unchanged - dedup in TranscriptionManager
5. SDK logging for unmatched DataStream messages
6. Remove dead getTargetSubscriptions() code

Key decisions:
- Merge hints from all subscribers (union, deduplicated)
- no-language-identification: enabled unless ALL subscribers disable
- Config changes picked up on next VAD cycle (no stream recreation)
- TranslationManager deferred to separate PR
- findMatchingStream O(n) with n≤5 is fine, no secondary index needed

Backward compatible: SDK fix improves behavior regardless of cloud version. Cloud fix alone has same behavior as Bug 2 today (no regression).
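The normalization and merge rules in the key decisions can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the helper names echo those mentioned in the commits, but the signatures, the `StreamOptions` shape, the comma-separated hint syntax, and the reading of "enabled unless ALL subscribers disable" (language identification stays on unless every subscriber turns it off) are all assumptions.

```typescript
// Hypothetical sketch: signatures and option syntax are assumptions.

interface StreamOptions {
  hints: string[];
  disableLanguageIdentification: boolean;
}

// "transcription:en-US?hints=ja" -> "transcription:en-US"
function normalizeToBaseLanguage(subscription: string): string {
  const queryStart = subscription.indexOf("?");
  return queryStart === -1 ? subscription : subscription.slice(0, queryStart);
}

// Parse the query options of a single subscription string.
function parseOptions(subscription: string): StreamOptions {
  const options: StreamOptions = { hints: [], disableLanguageIdentification: false };
  const queryStart = subscription.indexOf("?");
  if (queryStart === -1) return options;
  for (const pair of subscription.slice(queryStart + 1).split("&")) {
    const [key, value = ""] = pair.split("=");
    if (key === "hints") {
      options.hints = value.split(",").filter((hint) => hint.length > 0);
    } else if (key === "no-language-identification") {
      options.disableLanguageIdentification = value === "true";
    }
  }
  return options;
}

// Merge the options of every subscriber that shares one base language.
function getMergedOptionsForLanguage(
  baseLanguage: string,
  rawSubscriptions: string[],
): StreamOptions {
  const hints: string[] = [];
  let subscriberCount = 0;
  let disableCount = 0;
  for (const raw of rawSubscriptions) {
    if (normalizeToBaseLanguage(raw) !== baseLanguage) continue;
    const parsed = parseOptions(raw);
    subscriberCount += 1;
    if (parsed.disableLanguageIdentification) disableCount += 1;
    for (const hint of parsed.hints) {
      if (hints.indexOf(hint) === -1) hints.push(hint); // union, deduplicated
    }
  }
  // Language identification stays on unless ALL subscribers turned it off.
  return {
    hints,
    disableLanguageIdentification: subscriberCount > 0 && disableCount === subscriberCount,
  };
}
```

With this shape, three apps subscribing to `transcription:en-US`, `transcription:en-US?hints=ja` and `transcription:en-US?hints=ja,fr` would share one stream keyed `transcription:en-US` with hints `ja` and `fr`.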
Critical change: Instead of sending base-language streamType (which breaks old SDK apps using hints/disableLanguageIdentification), the cloud now looks up each app's OWN subscription string and uses that as DataStream.streamType.

New Change 2: Per-app streamType routing
- findAppTranscriptionSubscription() looks up app's actual sub for the base language being relayed
- Old SDK exact match succeeds because streamType === app's handler key
- New SDK findMatchingStream() also works (safety net)
- Zero regressions for any SDK version

Updated decision log, backward compat matrix, and behaviors.
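A rough sketch of the per-app lookup described above, assuming each app's subscriptions are available as a list of raw strings. The `RelayTarget` shape, `buildRelayTargets()`, `stripOptions()` and the example package names are hypothetical; only the idea of labelling each relayed message with the app's own subscription string comes from the commit.

```typescript
// Hypothetical sketch: data shapes and helper signatures are assumptions.

interface RelayTarget {
  packageName: string;
  streamType: string; // the app's OWN subscription string
}

function stripOptions(subscription: string): string {
  const queryStart = subscription.indexOf("?");
  return queryStart === -1 ? subscription : subscription.slice(0, queryStart);
}

// Find the subscription string this app registered for the given base-language stream.
function findAppTranscriptionSubscription(
  appSubscriptions: string[],
  baseLanguageStream: string,
): string | undefined {
  for (const subscription of appSubscriptions) {
    if (stripOptions(subscription) === baseLanguageStream) return subscription;
  }
  return undefined;
}

// Build one relay target per subscribed app, labelled with that app's own string.
function buildRelayTargets(
  subscriptionsByApp: Record<string, string[]>,
  baseLanguageStream: string,
): RelayTarget[] {
  const targets: RelayTarget[] = [];
  for (const packageName of Object.keys(subscriptionsByApp)) {
    const own = findAppTranscriptionSubscription(
      subscriptionsByApp[packageName] ?? [],
      baseLanguageStream,
    );
    if (own !== undefined) targets.push({ packageName, streamType: own });
  }
  return targets;
}
```

Because the `streamType` sent to each app equals the key it subscribed with, an exact-match handler lookup on an old SDK would still succeed.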
Complete implementation plan with before/after code for all 6 changes:

Cloud (TranscriptionManager.ts ~120 lines changed):
- Change 1: normalizeToBaseLanguage(), getMergedOptionsForLanguage(), buildSubscriptionWithOptions(), updated updateSubscriptions() and createStreamInstance(). New rawSubscriptions field.
- Change 2: findAppTranscriptionSubscription() + per-app streamType in relayDataToApps(). Backward compat with old SDKs.
- Change 6: Delete getTargetSubscriptions(), inline in relayDataToApps().

SDK (events.ts ~45 lines, index.ts ~20 lines):
- Change 3: findMatchingStream() on EventManager - language-aware lookup with exact-match fast path.
- Change 4+5: handleMessage() uses findMatchingStream(), adds debug log for unmatched DataStream. Deletes commented-out reconstruction code.

Test files (2 new, ~180 lines total):
- TranscriptionManager.dedup.test.ts
- events.findMatchingStream.test.ts

Rollout: cloud to debug first (verify stream dedup + old SDK compat), then SDK to debug, then standard dev→staging→prod.
Cloud (TranscriptionManager.ts):
- Normalize stream identity to base language (strips ?hints=, ?no-language-identification=)
- One Soniox stream per base language instead of per-subscription-string
- Merge hints from all subscribers (union of hint arrays)
- Per-app streamType in DataStream for backward compat with old SDKs
- Delete dead code getTargetSubscriptions(), inline in relayDataToApps()

SDK (events.ts, index.ts):
- Add findMatchingStream() for language-aware handler matching
- 'transcription:en-US' now matches handler for 'transcription:en-US?hints=ja'
- Log unmatched DataStream messages (previously silent black hole)
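The SDK-side matching could look roughly like the sketch below. It is not the SDK's actual `findMatchingStream()`: here it is a free function over a plain list of registered handler keys, it only compares strings with their query parameters stripped, and it leaves out whatever translation-specific handling the real method has.

```typescript
// Hypothetical sketch: a standalone version of language-aware handler lookup.

function stripQuery(streamType: string): string {
  const queryStart = streamType.indexOf("?");
  return queryStart === -1 ? streamType : streamType.slice(0, queryStart);
}

// Return the registered handler key that should receive `incoming`, if any.
function findMatchingStream(incoming: string, registered: string[]): string | undefined {
  // Fast path: an exact match always wins.
  if (registered.indexOf(incoming) !== -1) return incoming;
  // Fallback: compare with options removed on both sides.
  const incomingBase = stripQuery(incoming);
  for (const key of registered) {
    if (stripQuery(key) === incomingBase) return key;
  }
  return undefined;
}
```

A caller would dispatch to the handler stored under the returned key and log the message when the result is `undefined`, so unmatched DataStream messages no longer disappear silently.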
… exports
- Skip emitting interim when hasEndToken is true in SonioxTranscriptionProvider (prevents apps receiving same text twice as both interim + final)
- Add 38 unit tests for TranscriptionManager dedup helpers
- Add 18 unit tests for EventManager.findMatchingStream
- Fix SDK types barrel: use export type for interfaces (fixes Bun test runner)
- Document Mentra Live glasses VAD missing-finals issue (parked)
- Regenerate bun.lock (fixes 190 failed packages)
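The interim/final rule in the first bullet can be illustrated with a small decision function. The `Emission` type and `emissionsForBatch()` are hypothetical; only the `hasEndToken` flag and the "emit the final, skip the interim" behavior come from the commit message.

```typescript
// Hypothetical sketch: decide what to emit for one batch of recognized text.

interface Emission {
  text: string;
  isFinal: boolean;
}

function emissionsForBatch(text: string, hasEndToken: boolean): Emission[] {
  if (text.length === 0) return [];
  // When the batch ends an utterance, emit only the final result.
  // Emitting an interim as well would deliver the same text twice.
  if (hasEndToken) return [{ text, isFinal: true }];
  return [{ text, isFinal: false }];
}
```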
PR Review Helper: Mobile App Build ✅ ready to test; ASG Client Build ⏳ waiting for build. Test locally: gh pr checkout 2069
Deployments (latest commit d6fa526):

| Project | Status | Preview URL | Branch Preview URL |
| --- | --- | --- | --- |
| mentra-store-dev | ✅ Deploy successful! | https://eed9c137.augmentos-appstore-2.pages.dev | https://cloud-transcription-stream-d.augmentos-appstore-2.pages.dev |
| dev-augmentos-console | ✅ Deploy successful! | https://27b3d6df.dev-augmentos-console.pages.dev | https://cloud-transcription-stream-d.dev-augmentos-console.pages.dev |
| prod-augmentos-account | ✅ Deploy successful! | https://a5349629.augmentos-e84.pages.dev | https://cloud-transcription-stream-d.augmentos-e84.pages.dev |
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 316d24e395
    for (const subscription of normalizedDesired) {
      if (!current.has(subscription)) {
        await this.startStream(subscription);
Recreate stream when transcription options change
This loop only starts/stops streams when the normalized language key changes, so updates that keep the same base language but change options (hints, no-language-identification) never trigger a restart. Because merged options are only applied during createStreamInstance(), a live stream can keep stale config after subscription updates (for example switching from transcription:en-US?hints=ja to ...hints=fr), which silently leaves Soniox running with outdated hints/flags until an unrelated reconnect happens.
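One way to detect the case this review describes is to compare a per-language signature of the merged options before and after a subscription update. The sketch below is only an illustration: `optionsSignature()` and `streamsNeedingRestart()` are hypothetical helpers, and whether a change should restart the stream at all or wait for the next VAD cycle is a design decision this sketch does not settle.

```typescript
// Hypothetical sketch: flag base languages whose merged options changed.

// Order-insensitive fingerprint of one stream's merged options.
function optionsSignature(hints: string[], disableLanguageIdentification: boolean): string {
  const unique = hints.filter((hint, index) => hints.indexOf(hint) === index);
  return unique.slice().sort().join(",") + "|" + String(disableLanguageIdentification);
}

// Languages present in both maps whose signature differs need fresh config.
function streamsNeedingRestart(
  current: Record<string, string>,
  desired: Record<string, string>,
): string[] {
  const changed: string[] = [];
  for (const language of Object.keys(desired)) {
    if (language in current && current[language] !== desired[language]) {
      changed.push(language);
    }
  }
  return changed;
}
```

Languages that appear only in `desired` are new streams and languages that appear only in `current` are stops, so the existing start/stop loop already covers them.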
Deployment (latest commit d6fa526):

| Project | Status | Preview URL | Branch Preview URL |
| --- | --- | --- | --- |
| mentra-live-ota-site | ✅ Deploy successful! | https://5ea27b25.mentra-live-ota-site.pages.dev | https://cloud-transcription-stream-d.mentra-live-ota-site.pages.dev |
Transcription Stream Dedup & Subscription Routing Fix
Fixes 3 HIGH bugs: duplicate Soniox streams (2x cost), silent data loss from streamType mismatches, and double interim+final broadcasts.
Problem
- `"en-US"` ≠ `"en-US?hints=ja"` → two Soniox streams

Changes

Cloud - TranscriptionManager.ts
- Normalize stream identity to base language (`"transcription:en-US?hints=ja"` → `"transcription:en-US"`)
- Per-app `streamType` (backward compat with old SDKs)
- Delete dead `getTargetSubscriptions()`, inline

Cloud - SonioxTranscriptionProvider.ts
- Skip emitting interim when `hasEndToken` is true; only emit the FINAL

SDK - events.ts
- `findMatchingStream()`: language-aware handler lookup (ignores query params)

SDK - index.ts
- `handleMessage()` uses `findMatchingStream()` instead of exact `.includes()`

SDK - types/index.ts
- Switch to `export type` from `export` for interface re-exports (fixes Bun test runner)

Tests
- `TranscriptionManager.dedup.test.ts`: normalizeToBaseLanguage, getMergedOptionsForLanguage, buildSubscriptionWithOptions, findAppTranscriptionSubscription, updateSubscriptions dedup logic
- `events.findMatchingStream.test.ts`: exact match, base language match, translation match, priority, edge cases

Backward Compatibility
- `findMatchingStream` matches by base language
- `findMatchingStream` is strictly better than exact match

Either side can be rolled back independently. No data migration.
Spec & Docs
- `issues/037-transcription-stream-dedup/spike.md`: root cause analysis
- `issues/037-transcription-stream-dedup/spec.md`: behavioral spec & decision log
- `issues/037-transcription-stream-dedup/design.md`: line-level implementation plan
- `issues/037-transcription-stream-dedup/glasses-vad-missing-finals.md`: known issue doc (parked, client may forward VAD events)