Make TTS optional for Assist - allow text-only responses #4209

@paymog

Problem

Currently, the iOS app requires TTS to be configured in the pipeline for Assist to work, even if the user only wants text responses. This forces users to:

  1. Configure a TTS provider they don't need
  2. Listen to audio responses they don't want
  3. Deal with TTS playback in quiet/public environments

The error when TTS is not configured:

PipelineRunValidationError: the pipeline does not support text-to-speech

Justification: Reading is Faster Than Listening

Research shows that reading text is significantly more efficient than listening to speech:

| Metric | Value | Source |
| --- | --- | --- |
| Average silent reading | 238-260 wpm | ScienceDirect meta-analysis |
| Average TTS speech | ~150 wpm | Standard TTS output |
| Efficiency gain | ~60-70% | Reading vs listening |
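
For reference, the efficiency gain row follows from the two speed rows. A minimal check of that arithmetic, where `relative_gain` is just an illustrative helper name and 150 wpm is the TTS figure quoted above:

```python
def relative_gain(reading_wpm: float, listening_wpm: float) -> float:
    """Return how much faster reading is than listening, in percent."""
    return (reading_wpm / listening_wpm - 1.0) * 100.0


# Figures taken from the table above.
low = relative_gain(238, 150)   # slow end of the silent-reading range
high = relative_gain(260, 150)  # fast end of the silent-reading range

print(f"{low:.0f}% to {high:.0f}%")  # prints "59% to 73%"
```

So the "~60-70%" in the table is a rounded version of roughly 59% to 73%.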

Additionally:

  • Text can be skimmed/scanned; audio must be consumed linearly
  • TTS is disruptive in quiet environments (office, bedroom, public transit)
  • Some users simply prefer silent interactions

Precedent: ChatGPT App

The ChatGPT iOS app provides flexible voice interaction modes that Home Assistant should emulate:

| Mode | Input | Output | Use Case |
| --- | --- | --- | --- |
| Text only | Keyboard | Text | Default, quiet environments |
| Transcription | Voice (STT) | Text | Hands-free input, silent output |
| Full audio | Voice (STT) | Voice (TTS) | Fully hands-free, driving |

ChatGPT allows users to choose their interaction style without requiring all pipeline components. Home Assistant currently forces "Full audio" mode even when users only want "Text only" or "Transcription" modes.

Proposed Changes

  1. Remove TTS requirement for Assist - Allow pipelines with only STT + Conversation (no TTS) to work on mobile
  2. Add "Mute responses" toggle - For pipelines that have TTS configured, allow users to disable audio playback in app settings
  3. Text in/text out always works - Typing a query should never require TTS to be configured
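
A rough sketch of how these three changes could combine when the app decides which pipeline stages to request. This is not the app's actual code (the app is written in Swift; Python is used here only to keep the logic short). The stage names `stt`, `intent`, and `tts` are assumed to match the Assist pipeline stage names, and `PipelineConfig`, `choose_stages`, and `mute_responses` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PipelineConfig:
    """Hypothetical view of what the selected pipeline has configured."""
    stt_engine: Optional[str] = None
    tts_engine: Optional[str] = None


def choose_stages(
    pipeline: PipelineConfig,
    voice_input: bool,
    mute_responses: bool = False,
) -> Tuple[str, str]:
    """Pick (start_stage, end_stage) without ever requiring TTS.

    Assumption: ending the run at "intent" yields a text-only response,
    while ending at "tts" additionally produces audio.
    """
    if voice_input and pipeline.stt_engine is None:
        # Voice input still needs STT; only TTS becomes optional.
        raise ValueError("voice input requires an STT engine")

    start_stage = "stt" if voice_input else "intent"

    # Request audio only if TTS exists and the user has not muted it.
    wants_audio = pipeline.tts_engine is not None and not mute_responses
    end_stage = "tts" if wants_audio else "intent"
    return start_stage, end_stage


# STT + Conversation pipeline with no TTS: voice in, text out.
print(choose_stages(PipelineConfig(stt_engine="example_stt"), voice_input=True))
# Typed query on a pipeline with nothing but Conversation: text in, text out.
print(choose_stages(PipelineConfig(), voice_input=False))
```

The point of the sketch is that a missing TTS engine changes the requested end stage instead of failing validation, and the mute toggle takes the same path as a pipeline without TTS.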

Use Cases Enabled

  • ✅ Voice in → text out (STT only, no TTS needed)
  • ✅ Text in → text out (no STT, no TTS needed)
  • ✅ Voice in → voice out (current behavior, opt-in)
