Add Display name, voice ID list option by timonvanhasselt · Pull Request #64 · TigreGotico/phoonnx

timonvanhasselt · 2025-10-25T09:52:41Z

Possible update to address bullet 1 and 4 of this feature request: #62

Summary by CodeRabbit

New Features
- New CLI command to fetch and display available upstream voices grouped by source with counts and guidance.
- Download voices on-demand by ID.
- Voices now include user-friendly display names where available and persisted.
Improvements
- Listings prefer display names for clearer output.
- Expanded voice source coverage and more robust remote fetching with error handling.
Bug Fixes
- Minor formatting cleanup in command output and end-of-file newline consistency.

* Added helper to download a voice list without having to download all the model info to cache (available voices) * Added a new optional field 'display_name' to the TTS model manager. * Updated methods to handle and format the display name for piper, mimic3 and phoonnx voice models

Added a new command to list available voice IDs (phoonnx-voices list-available)

coderabbitai · 2025-10-25T09:52:50Z

Walkthrough

Adds display_name support to TTSModelInfo and propagates it through TTSModelManager; implements upstream voice discovery getters, on-demand download by voice ID, and a new list-available CLI command that summarizes available voices by source.

Changes

Cohort / File(s)	Summary
CLI: voice listing & discovery `phoonnx/cli.py`	Prefer `display_name` when printing voice info; remove extra blank line in `list-langs`; add `list-available` command that calls the manager to fetch upstream voice IDs, handles network/unexpected errors, and prints a structured summary grouped by source with counts and guidance.
Model manager: display names & discovery `phoonnx/model_manager.py`	Add `display_name: Optional[str]` to `TTSModelInfo` and format it in `__post_init__`; add `_fetch_voice_ids_from_url`, `_generate_display_name`, `_get_model_info_from_source`; add per-source getters (`get_available_*_voice_ids`) and `get_all_available_voice_ids`; add `download_voice_by_id`; persist `display_name` in `save`/`add_voice`; populate display_name for Piper, Mimic3, OVOS, Proxectonos, Phonikud with error handling and logging.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI as CLI (list-available)
    participant Manager as TTSModelManager
    participant Sources as External Sources

    User->>CLI: run list-available
    CLI->>Manager: get_all_available_voice_ids()
    rect rgb(230,245,255)
    Manager->>Sources: fetch per-source voice lists (Piper, Mimic3, OVOS, Proxectonos, Phonikud)
    Sources-->>Manager: JSON / hardcoded IDs or errors
    end
    Manager->>Manager: aggregate & generate display_name where needed
    alt some sources fail
        Manager->>CLI: partial results + logged warnings
    else all succeed
        Manager->>CLI: dict(source → [voice ids/display names])
    end
    CLI->>User: print structured summary with counts and guidance

sequenceDiagram
    participant User
    participant CLI as CLI (download_voice_by_id)
    participant Manager as TTSModelManager
    participant Sources as External Sources
    participant Storage as Local Storage

    User->>CLI: download_voice_by_id(voice_id)
    CLI->>Manager: download_voice_by_id(voice_id)
    Manager->>Manager: parse voice_id prefix
    alt prefix recognized
        Manager->>Sources: _get_model_info_from_source(voice_id)
        Sources-->>Manager: metadata or error
        Manager->>Manager: _generate_display_name(metadata)
        Manager->>Storage: download_model(metadata)
        Storage-->>Manager: success
        Manager-->>CLI: TTSModelInfo created & persisted (includes display_name)
        CLI-->>User: success message
    else prefix not recognized or fetch fails
        Manager-->>CLI: error
        CLI-->>User: error message
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Review display_name formatting and placeholder substitution in TTSModelInfo.__post_init__.
Verify network/error handling and JSON parsing in _fetch_voice_ids_from_url.
Inspect per-source parsing and display_name generation (_generate_display_name, _get_model_info_from_source).
Confirm persistence changes (save/add_voice) include display_name and maintain compatibility.

Possibly related PRs

feat: opm and cli interface #51 — overlaps with CLI voice listing and discovery changes (includes list-available/CLI modifications).
feat: model manager #49 — overlaps with model manager helpers and display_name additions.

Suggested labels

feature

Poem

🐰 I hopped through lists of voices bright,
Gave names to whispers in morning light,
From Piper, Mimic, OVOS, too,
I stitched display names tidy and new—
Now voices sing with names in sight! 🎶

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately reflects the main changes: adding display names and a voice ID list command option.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (6)

phoonnx/cli.py (1)
149-176: Optional: allow direct upstream downloads from CLI.
If you want the hint in list-available to be true, let download fetch upstream when not cached.
 @cli.command(name="download")
 @click.argument("voice_id", type=str)
 def download_voice(voice_id):
@@
-    if voice_id not in manager.voices:
-        click.echo(f"Error: Voice ID '{voice_id}' not found in cache.", err=True)
-        click.echo("Hint: Run 'phoonnx_cli.py update-cache' first to fetch the list.", err=True)
-        return
+    if voice_id not in manager.voices:
+        click.echo(f"Voice ID '{voice_id}' not found in cache. Trying upstream...")
+        ok = manager.download_voice_by_id(voice_id)
+        if not ok:
+            click.echo(f"Error: Could not find or download '{voice_id}' from upstream.", err=True)
+            click.echo("Hint: Run 'update-cache' to refresh known IDs.", err=True)
+            return
+        # Upstream fetch succeeded; reload cache so subsequent lookups work
+        manager.load()
@@
-    voice_info = manager.voices[voice_id]
+    voice_info = manager.voices[voice_id]
Note: depends on the proposed boolean return from download_voice_by_id.
phoonnx/model_manager.py (5)
193-206: Good: centralized upstream JSON fetch with timeout.
Consider adding response.json() error handling (JSONDecodeError) if you want to distinguish bad payloads; optional.

256-281: Display name generation OK; minor nits.

Comment typo: “Phonnikud” -> “Phonikud”.

Consider handling unexpected Piper key formats defensively.

385-395: Add timeout and raise_for_status for Piper list fetch.
Align with Mimic3 fetch and other requests calls.
-        piper_voices = requests.get(voice_list).json()
+        r = requests.get(voice_list, timeout=30)
+        r.raise_for_status()
+        piper_voices = r.json()
@@
-            except Exception:
-                print(f"Failed to get voice info for {v['key']}")
+            except Exception as e:
+                LOG.error(f"Failed to get voice info for {v.get('key', '<unknown>')}: {e}")
414-424: Mimic3 display_name wiring looks good.
Consistent with generator; minor nit: keep comments in English for consistency.

219-245: Log/message language consistency.
Several log strings are in Dutch. Consider English for project consistency.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dfc5f87 and ea67295.

📒 Files selected for processing (2)

phoonnx/cli.py (3 hunks)
phoonnx/model_manager.py (11 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

phoonnx/model_manager.py (1)

phoonnx/config.py (2)

Engine (15-21)

PhonemeType (44-78)

phoonnx/cli.py (1)

phoonnx/model_manager.py (2)

TTSModelManager (128-491)

get_all_available_voice_ids (246-254)

🪛 Ruff (0.14.1)

phoonnx/model_manager.py

201-201: Consider moving this statement to an else block

(TRY300)

316-316: Prefer next(...) over single element slice

Replace with next(...)

(RUF015)

317-317: Prefer next(...) over single element slice

Replace with next(...)

(RUF015)

393-393: Prefer next(a for a in v["files"] if a.endswith(".onnx")) over single element slice

Replace with next(a for a in v["files"] if a.endswith(".onnx"))

(RUF015)

394-394: Prefer next(a for a in v["files"] if a.endswith(".json")) over single element slice

Replace with next(a for a in v["files"] if a.endswith(".json"))

(RUF015)

phoonnx/cli.py

132-132: f-string without any placeholders

Remove extraneous f prefix

(F541)

135-135: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (6)

phoonnx/cli.py (3)

10-11: LGTM: display_name output is conditional and harmless.
No behavior change when absent; good default.

114-117: Prefer display_name with fallback: good UX.
Clearer listings while preserving ID fallback.

191-191: Entry point OK.
No change in behavior; fine.

phoonnx/model_manager.py (3)

162-164: Persistence includes display_name: good.
Field is saved/loaded consistently.

Also applies to: 174-176

207-255: Available-IDs helpers look fine.
Clear separation of network vs hardcoded sources; aligns with CLI list-available.

453-456: OVOS and Phonikud display_name usage: OK.
Placeholders will resolve in post_init; strings/readability are fine.

Also applies to: 466-475, 488-490

coderabbitai · 2025-10-25T09:58:15Z

+@cli.command(name="list-available")
+def list_available():
+    """
+    Lists all voice IDs available from upstream sources (Piper, Mimic3, etc.),
+    even if they are not yet in the local cache. (Requires network connection)
+    """
+    manager = TTSModelManager()
+
+    click.echo("Fetching all available voice IDs from upstream sources (Piper, Mimic3, OVOS, etc.)...")
+
+    try:
+        available_voices = manager.get_all_available_voice_ids()
+    except requests.exceptions.RequestException as e:
+        click.echo(f"\nError: Could not fetch available voice lists due to a network or connection error.", err=True)
+        click.echo(f"Details: {e}", err=True)
+        return
+    except Exception as e:
+        click.echo(f"An unexpected error occurred while fetching available voice lists: {e}", err=True)
+        return

+    total_voices = sum(len(ids) for ids in available_voices.values())
+    click.echo(f"\nTotal available voices found (from all sources): {total_voices}\n")
+
+    for source, voice_ids in available_voices.items():
+        click.echo(f"--- {source.upper()} Voices ({len(voice_ids)}) ---")
+        for voice_id in voice_ids:
+            click.echo(f"  {voice_id}")
+        click.echo("-" * 40)
+    click.echo("Hint: Use 'update-cache' to download the configuration for these voices, or use 'download <VOICE_ID>' to download a specific voice directly.")


⚠️ Potential issue | 🟠 Major

Fix error handling and messaging; align with actual download behavior.

Redundant RequestException catch: manager.get_all_available_voice_ids() already swallows network errors; remove or narrow.

Broad except Exception: drop or narrow to specific decode errors. [BLE001]

Line 132 uses an f-string without placeholders. [F541]

The final hint suggests direct download by ID, but download currently requires the ID to be in cache. Either (A) change the hint or (B) extend download to fetch upstream.

Apply within this hunk:

@@ - try: - available_voices = manager.get_all_available_voice_ids() - except requests.exceptions.RequestException as e: - click.echo(f"\nError: Could not fetch available voice lists due to a network or connection error.", err=True) - click.echo(f"Details: {e}", err=True) - return - except Exception as e: - click.echo(f"An unexpected error occurred while fetching available voice lists: {e}", err=True) - return + # Network errors are handled inside the manager; this should not raise. + available_voices = manager.get_all_available_voice_ids() @@ - click.echo("Hint: Use 'update-cache' to download the configuration for these voices, or use 'download <VOICE_ID>' to download a specific voice directly.") + click.echo("Hint: Run 'update-cache' to add these to your cache, then use 'download <VOICE_ID>'.")

Additionally, fix the extraneous f-string (no placeholders) on the error message if you keep the try/except:

- click.echo(f"\nError: Could not fetch available voice lists due to a network or connection error.", err=True) + click.echo("\nError: Could not fetch available voice lists due to a network or connection error.", err=True)

Option B (enhance command): extend download to fall back to upstream when not cached (see separate comment with diff).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@cli.command(name="list-available")

def list_available():

"""

Lists all voice IDs available from upstream sources (Piper, Mimic3, etc.),

even if they are not yet in the local cache. (Requires network connection)

"""

manager = TTSModelManager()

click.echo("Fetching all available voice IDs from upstream sources (Piper, Mimic3, OVOS, etc.)...")

try:

available_voices = manager.get_all_available_voice_ids()

except requests.exceptions.RequestException as e:

click.echo(f"\nError: Could not fetch available voice lists due to a network or connection error.", err=True)

click.echo(f"Details: {e}", err=True)

return

except Exception as e:

click.echo(f"An unexpected error occurred while fetching available voice lists: {e}", err=True)

return

total_voices = sum(len(ids) for ids in available_voices.values())

click.echo(f"\nTotal available voices found (from all sources): {total_voices}\n")

for source, voice_ids in available_voices.items():

click.echo(f"--- {source.upper()} Voices ({len(voice_ids)}) ---")

for voice_id in voice_ids:

click.echo(f" {voice_id}")

click.echo("-" * 40)

click.echo("Hint: Use 'update-cache' to download the configuration for these voices, or use 'download <VOICE_ID>' to download a specific voice directly.")

@cli.command(name="list-available")

def list_available():

"""

Lists all voice IDs available from upstream sources (Piper, Mimic3, etc.),

even if they are not yet in the local cache. (Requires network connection)

"""

manager = TTSModelManager()

click.echo("Fetching all available voice IDs from upstream sources (Piper, Mimic3, OVOS, etc.)...")

# Network errors are handled inside the manager; this should not raise.

available_voices = manager.get_all_available_voice_ids()

total_voices = sum(len(ids) for ids in available_voices.values())

click.echo(f"\nTotal available voices found (from all sources): {total_voices}\n")

for source, voice_ids in available_voices.items():

click.echo(f"--- {source.upper()} Voices ({len(voice_ids)}) ---")

for voice_id in voice_ids:

click.echo(f" {voice_id}")

click.echo("-" * 40)

click.echo("Hint: Run 'update-cache' to add these to your cache, then use 'download <VOICE_ID>'.")

🧰 Tools

🪛 Ruff (0.14.1)

132-132: f-string without any placeholders

Remove extraneous f prefix

(F541)

135-135: Do not catch blind exception: Exception

(BLE001)

coderabbitai · 2025-10-25T09:58:16Z

    phoneme_map_url: Optional[str] = None  # json lookup table for phoneme replacement
    config: Optional[VoiceConfig] = None
    phoneme_type: Optional[PhonemeType] = None
+    display_name: Optional[str] = None # NIEUW VELD


⚠️ Potential issue | 🟠 Major

Harden display_name formatting; coerce phoneme_type; broaden exceptions.
.format(...) can raise KeyError/ValueError; and phoneme_type may be a str from cache. Coerce and catch specific exceptions.

@@ - display_name: Optional[str] = None # NIEUW VELD + display_name: Optional[str] = None # New field @@ - if self.display_name and "{" in self.display_name: - try: - # Vul de placeholders in met de nu bekende engine en phoneme_type - self.display_name = self.display_name.format( - engine=self.engine.value, - phoneme_type=self.phoneme_type.value - ) - except AttributeError: - # Dit kan gebeuren als engine of phoneme_type None is (wat onwaarschijnlijk is na het laden van config) - LOG.warning(f"Kan display_name niet formatteren voor {self.voice_id}. Ontbrekende engine/phoneme_type.") - pass + # Ensure enum type when loaded from cache + if isinstance(self.phoneme_type, str): + try: + self.phoneme_type = PhonemeType(self.phoneme_type) + except ValueError: + LOG.warning(f"Unknown phoneme_type '{self.phoneme_type}' for {self.voice_id}; keeping as string.") + if self.display_name and "{" in self.display_name: + try: + self.display_name = self.display_name.format( + engine=self.engine.value, + phoneme_type=getattr(self.phoneme_type, "value", self.phoneme_type), + ) + except (AttributeError, KeyError, ValueError) as e: + LOG.warning(f"Could not format display_name for {self.voice_id}: {e}")

Also applies to: 56-67

🤖 Prompt for AI Agents

In phoonnx/model_manager.py lines ~27 and ~56-67, harden display_name formatting and ensure phoneme_type is coerced and exceptions broadened: before using display_name.format(...) validate/coerce inputs (e.g., ensure display_name is str), wrap the formatting call in a try/except that catches KeyError, ValueError and TypeError (not just a generic Exception), and fall back to a safe default format on failure; for phoneme_type, if it may be a string from cache, coerce it to the expected enum/type (e.g., map or call the enum with the string, or use a lookup) and handle invalid values gracefully (catch KeyError/ValueError/TypeError when coercing) so the code uses a safe fallback phoneme_type instead of failing.

coderabbitai · 2025-10-25T09:58:16Z

+    def _get_model_info_from_source(self, voice_id: str) -> Optional[TTSModelInfo]:
+        """
+        Genereert één TTSModelInfo object door de externe lijst te downloaden en te zoeken.
+        Dit is nodig als de cache nog niet geladen is.
+        """
+        if voice_id.startswith("piper_"):
+            url = "https://huggingface.co/rhasspy/piper-voices/resolve/main/voices.json"
+            engine = Engine.PIPER
+            prefix = "piper_"
+        elif voice_id.startswith("mimic3_"):
+            url = "https://raw.githubusercontent.com/MycroftAI/mimic3/refs/heads/master/mimic3_tts/voices.json"
+            engine = Engine.MIMIC3
+            prefix = "mimic3_"
+        else:
+            LOG.warning(f"Kan modelinfo niet genereren voor onbekende bron: {voice_id}")
+            return None
+
+        # Download de centrale stemlijst data
+        voice_list_data = self._fetch_voice_ids_from_url(url, "tijdelijke bron")
+
+        # Zoek naar de specifieke ID in de gedownloade lijst
+        base_id = voice_id.replace(prefix, "")
+        if base_id in voice_list_data:
+            model_data = voice_list_data[base_id]
+            display_name = self._generate_display_name(base_id, model_data, engine)
+
+            if engine == Engine.PIPER:
+                # voice_id bevat "piper_" prefix in dit geval
+                lang_code = model_data.get("language") or voice_id.split('-')[0] 
+            else: # Mimic3 of Fallback
+                lang_code = model_data.get("language", "und")
+
+            if engine == Engine.PIPER:
+                base = "https://huggingface.co/rhasspy/piper-voices/resolve/main/"
+                model_url = base + [a for a in model_data["files"] if a.endswith(".onnx")][0]
+                config_url = base + [a for a in model_data["files"] if a.endswith(".json")][0]
+                tokens_url = None
+                phoneme_map_url = None
+            elif engine == Engine.MIMIC3:
+                mimic_base = f"https://huggingface.co/mukowaty/mimic3-voices/resolve/main/voices/{base_id}"
+                model_url = f"{mimic_base}/generator.onnx"
+                config_url = f"{mimic_base}/config.json"
+                tokens_url = f"{mimic_base}/phonemes.txt"
+                phoneme_map_url = f"{mimic_base}/phoneme_map.txt"
+            else:
+                return None # Onbekende engine
+
+            return TTSModelInfo(
+                voice_id=voice_id,
+                lang=lang_code,
+                model_url=model_url,
+                config_url=config_url,
+                tokens_url=tokens_url,
+                phoneme_map_url=phoneme_map_url,
+                display_name=display_name
+            )
+        return None
+


⚠️ Potential issue | 🔴 Critical

Bug: wrong lang fallback for Piper; prefer next() for file picks.

Line 310 uses voice_id.split('-')[0] which yields piper_pt_PT for IDs like piper_pt_PT-.... Use base_id instead.

Prefer next(...) over indexing [0]. [RUF015]

@@ - if engine == Engine.PIPER: - # voice_id bevat "piper_" prefix in dit geval - lang_code = model_data.get("language") or voice_id.split('-')[0] + if engine == Engine.PIPER: + # base_id is "<lang>-<name>-<size>" + lang_code = model_data.get("language") or base_id.split('-')[0] @@ - model_url = base + [a for a in model_data["files"] if a.endswith(".onnx")][0] - config_url = base + [a for a in model_data["files"] if a.endswith(".json")][0] + model_url = base + next(a for a in model_data["files"] if a.endswith(".onnx")) + config_url = base + next(a for a in model_data["files"] if a.endswith(".json"))

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def _get_model_info_from_source(self, voice_id: str) -> Optional[TTSModelInfo]:

"""

Genereert één TTSModelInfo object door de externe lijst te downloaden en te zoeken.

Dit is nodig als de cache nog niet geladen is.

"""

if voice_id.startswith("piper_"):

url = "https://huggingface.co/rhasspy/piper-voices/resolve/main/voices.json"

engine = Engine.PIPER

prefix = "piper_"

elif voice_id.startswith("mimic3_"):

url = "https://raw.githubusercontent.com/MycroftAI/mimic3/refs/heads/master/mimic3_tts/voices.json"

engine = Engine.MIMIC3

prefix = "mimic3_"

else:

LOG.warning(f"Kan modelinfo niet genereren voor onbekende bron: {voice_id}")

return None

# Download de centrale stemlijst data

voice_list_data = self._fetch_voice_ids_from_url(url, "tijdelijke bron")

# Zoek naar de specifieke ID in de gedownloade lijst

base_id = voice_id.replace(prefix, "")

if base_id in voice_list_data:

model_data = voice_list_data[base_id]

display_name = self._generate_display_name(base_id, model_data, engine)

if engine == Engine.PIPER:

# voice_id bevat "piper_" prefix in dit geval

lang_code = model_data.get("language") or voice_id.split('-')[0]

else: # Mimic3 of Fallback

lang_code = model_data.get("language", "und")

if engine == Engine.PIPER:

base = "https://huggingface.co/rhasspy/piper-voices/resolve/main/"

model_url = base + [a for a in model_data["files"] if a.endswith(".onnx")][0]

config_url = base + [a for a in model_data["files"] if a.endswith(".json")][0]

tokens_url = None

phoneme_map_url = None

elif engine == Engine.MIMIC3:

mimic_base = f"https://huggingface.co/mukowaty/mimic3-voices/resolve/main/voices/{base_id}"

model_url = f"{mimic_base}/generator.onnx"

config_url = f"{mimic_base}/config.json"

tokens_url = f"{mimic_base}/phonemes.txt"

phoneme_map_url = f"{mimic_base}/phoneme_map.txt"

else:

return None # Onbekende engine

return TTSModelInfo(

voice_id=voice_id,

lang=lang_code,

model_url=model_url,

config_url=config_url,

tokens_url=tokens_url,

phoneme_map_url=phoneme_map_url,

display_name=display_name

)

return None

def _get_model_info_from_source(self, voice_id: str) -> Optional[TTSModelInfo]:

"""

Genereert één TTSModelInfo object door de externe lijst te downloaden en te zoeken.

Dit is nodig als de cache nog niet geladen is.

"""

if voice_id.startswith("piper_"):

url = "https://huggingface.co/rhasspy/piper-voices/resolve/main/voices.json"

engine = Engine.PIPER

prefix = "piper_"

elif voice_id.startswith("mimic3_"):

url = "https://raw.githubusercontent.com/MycroftAI/mimic3/refs/heads/master/mimic3_tts/voices.json"

engine = Engine.MIMIC3

prefix = "mimic3_"

else:

LOG.warning(f"Kan modelinfo niet genereren voor onbekende bron: {voice_id}")

return None

# Download de centrale stemlijst data

voice_list_data = self._fetch_voice_ids_from_url(url, "tijdelijke bron")

# Zoek naar de specifieke ID in de gedownloade lijst

base_id = voice_id.replace(prefix, "")

if base_id in voice_list_data:

model_data = voice_list_data[base_id]

display_name = self._generate_display_name(base_id, model_data, engine)

if engine == Engine.PIPER:

# base_id is "<lang>-<name>-<size>"

lang_code = model_data.get("language") or base_id.split('-')[0]

else: # Mimic3 of Fallback

lang_code = model_data.get("language", "und")

if engine == Engine.PIPER:

base = "https://huggingface.co/rhasspy/piper-voices/resolve/main/"

model_url = base + next(a for a in model_data["files"] if a.endswith(".onnx"))

config_url = base + next(a for a in model_data["files"] if a.endswith(".json"))

tokens_url = None

phoneme_map_url = None

elif engine == Engine.MIMIC3:

mimic_base = f"https://huggingface.co/mukowaty/mimic3-voices/resolve/main/voices/{base_id}"

model_url = f"{mimic_base}/generator.onnx"

config_url = f"{mimic_base}/config.json"

tokens_url = f"{mimic_base}/phonemes.txt"

phoneme_map_url = f"{mimic_base}/phoneme_map.txt"

else:

return None # Onbekende engine

return TTSModelInfo(

voice_id=voice_id,

lang=lang_code,

model_url=model_url,

config_url=config_url,

tokens_url=tokens_url,

phoneme_map_url=phoneme_map_url,

display_name=display_name

)

return None

🧰 Tools

🪛 Ruff (0.14.1)

316-316: Prefer next(...) over single element slice

Replace with next(...)

(RUF015)

317-317: Prefer next(...) over single element slice

Replace with next(...)

(RUF015)

🤖 Prompt for AI Agents

In phoonnx/model_manager.py around lines 282 to 339, fix two issues: replace voice_id.split('-')[0] with base_id when deriving the lang fallback for Piper so piper_pt_PT IDs yield "pt_PT" instead of the full prefixed string; and replace list comprehensions with indexing when picking ONNX/JSON files by using next((f for f in model_data["files"] if f.endswith(".onnx")), None) and similarly for ".json", then handle the None case (raise/return None or log and skip) so you avoid IndexError and follow the next(...) pattern.

coderabbitai · 2025-10-25T09:58:16Z

+    def download_voice_by_id(self, voice_id: str):
+        """
+        Downloadt de stem met de gegeven ID. Zoekt eerst in de cache, 
+        genereert anders het TTSModelInfo object via externe bronnen.
+        """
+        model_info = self.voices.get(voice_id)
+
+        if not model_info:
+            # Als de cache leeg is of de ID niet bevat, genereer deze dan van de bron
+            model_info = self._get_model_info_from_source(voice_id)
+
+            if not model_info:
+                LOG.error(f"Kan modelinformatie niet vinden of genereren voor ID: {voice_id}")
+                return False
+
+        # Roep de bestaande download_model methode aan, die een TTSModelInfo object verwacht
+        return model_info.download_model() # Gebruik de methode van TTSModelInfo
+


⚠️ Potential issue | 🟠 Major

download_voice_by_id returns None; return a boolean and persist when upstream.
Current method returns the result of download_model() which is None, making callers think it failed.

@@ - def download_voice_by_id(self, voice_id: str): + def download_voice_by_id(self, voice_id: str) -> bool: @@ - if not model_info: + if not model_info: # Als de cache leeg is of de ID niet bevat, genereer deze dan van de bron model_info = self._get_model_info_from_source(voice_id) @@ - if not model_info: + if not model_info: LOG.error(f"Kan modelinformatie niet vinden of genereren voor ID: {voice_id}") - return False + return False @@ - # Roep de bestaande download_model methode aan, die een TTSModelInfo object verwacht - return model_info.download_model() # Gebruik de methode van TTSModelInfo + try: + # Ensure config/tokens exist before model + model_info.download_config() + if model_info.tokens_url: + model_info.download_phoneme_map() + model_info.download_model() + # If this came from upstream, add to cache for future use + if voice_id not in self.voices: + self.add_voice(model_info) + self.cache.store() + return True + except requests.exceptions.RequestException as e: + LOG.error(f"Download failed for {voice_id}: {e}") + return False

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In phoonnx/model_manager.py around lines 340 to 357, download_voice_by_id currently returns the direct result of model_info.download_model() which is None; change it to return an explicit boolean and persist the model in the manager when creation/download succeeds. Specifically: if model_info is created from source, assign it into self.voices[voice_id]; call model_info.download_model() and treat None as success (or detect exceptions/falsey failures), return True on success and False on failure; log and return False on exceptions or when model_info could not be obtained. Ensure callers receive a boolean and the voice is persisted into self.voices when upstream creation succeeded.

coderabbitai · 2025-10-25T09:58:16Z

+            display_name="Proxectonos - Sabela)"
        ))


⚠️ Potential issue | 🟡 Minor

Typo in display_name: extra closing parenthesis.
Remove the trailing “)”.

- display_name="Proxectonos - Sabela)" + display_name="Proxectonos - Sabela"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

display_name="Proxectonos - Sabela)"

))

display_name="Proxectonos - Sabela"

))

🤖 Prompt for AI Agents

In phoonnx/model_manager.py around lines 369 to 370, the display_name string contains an extra closing parenthesis "Proxectonos - Sabela)"; remove the trailing ")" so the value becomes "Proxectonos - Sabela" to correct the typo.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (5)

phoonnx/model_manager.py (5)
56-67: Harden display_name formatting and exception handling.

The formatting logic only catches AttributeError but .format() can raise KeyError or ValueError. Additionally, phoneme_type may be a string when loaded from cache and needs coercion to the enum type.

As per past review comments, this issue was previously flagged.

310-310: Bug: Wrong variable for Piper language fallback.

Line 310 uses voice_id.split('-')[0] which yields "piper_pt_PT" for IDs like "piper_pt_PT-...". Use base_id.split('-')[0] instead to get just the language code.

As per past review comments, this issue was previously flagged.

316-317: Prefer next() over list indexing to avoid IndexError.

Using list comprehension with [0] indexing can raise IndexError if no matching files are found.

Apply this diff:
-                model_url = base + [a for a in model_data["files"] if a.endswith(".onnx")][0]
-                config_url = base + [a for a in model_data["files"] if a.endswith(".json")][0]
+                model_url = base + next(a for a in model_data["files"] if a.endswith(".onnx"))
+                config_url = base + next(a for a in model_data["files"] if a.endswith(".json"))
As per past review comments, this issue was previously flagged.

340-357: download_voice_by_id returns None; should return boolean and persist.

The method returns the result of download_model() which is None, making it appear to callers that the download failed. Additionally, when the model is created from upstream, it should be persisted to the cache.

As per past review comments, this issue was previously flagged.

369-369: Typo: Extra closing parenthesis in display_name.

Line 369 has an extra closing parenthesis: "Proxectonos - Sabela)" should be "Proxectonos - Sabela".

As per past review comments, this issue was previously flagged.

🧹 Nitpick comments (3)

phoonnx/model_manager.py (3)
27-27: Optional: Use cleaner inline documentation.

The comment # NEW FIELD is informal. Consider using a more descriptive docstring-style comment or removing it entirely since the field is self-explanatory.

193-206: Optional: Move return statement to else block for clarity.

Returning inside the try block (line 201) is less explicit than using an else block.

Apply this diff:
         try:
             response = requests.get(url, timeout=10)
             response.raise_for_status()
             voice_list_data = response.json()
-            return voice_list_data
-            
-        except requests.exceptions.RequestException as e:
+        except requests.exceptions.RequestException as e:
             LOG.error(f"Error fetching the {source_name} voice list: {e}")
             return {}
+        else:
+            return voice_list_data
256-280: Optional: Clarify parameter naming.

The first parameter is named voice_id but callers pass base_id (line 306) or voice_key (line 388). Consider renaming to voice_key or base_id for clarity, or update the docstring to clarify that it accepts the unprefixed identifier.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ea67295 and 2267717.

📒 Files selected for processing (1)

phoonnx/model_manager.py (11 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

phoonnx/model_manager.py (1)

phoonnx/config.py (2)

Engine (15-21)

PhonemeType (44-78)

🪛 Ruff (0.14.1)

phoonnx/model_manager.py

201-201: Consider moving this statement to an else block

(TRY300)

316-316: Prefer next(...) over single element slice

Replace with next(...)

(RUF015)

317-317: Prefer next(...) over single element slice

Replace with next(...)

(RUF015)

393-393: Prefer next(a for a in v["files"] if a.endswith(".onnx")) over single element slice

Replace with next(a for a in v["files"] if a.endswith(".onnx"))

(RUF015)

394-394: Prefer next(a for a in v["files"] if a.endswith(".json")) over single element slice

Replace with next(a for a in v["files"] if a.endswith(".json"))

(RUF015)

🔇 Additional comments (5)

phoonnx/model_manager.py (5)

162-163: LGTM!

The display_name field is correctly persisted in both save() and add_voice() methods.

Also applies to: 174-175

207-254: LGTM!

The voice ID listing methods are well-structured. Piper and Mimic3 fetch from external sources, while OVOS, Proxectonos, and Phonikud use hardcoded lists. The aggregator method provides a clean interface.

414-423: LGTM!

The display_name is correctly generated and assigned for Mimic3 voices.

447-456: LGTM!

The display_name generation for OVOS voices is correctly implemented, using voice names with placeholders for engine and phoneme_type.

Also applies to: 466-474

488-489: LGTM!

The display_name for the Phonikud voice is correctly assigned.

coderabbitai · 2025-10-25T21:27:17Z

                    model_url=base + [a for a in v["files"] if a.endswith(".onnx")][0],
                    config_url=base + [a for a in v["files"] if a.endswith(".json")][0],


⚠️ Potential issue | 🟠 Major

Prefer next() over list indexing to avoid IndexError.

Using list comprehension with [0] indexing can raise IndexError if no matching files are found.

Apply this diff:

voice = TTSModelInfo( voice_id="piper_" + v["key"], lang=standardize_tag(v["key"].split("-")[0]), - model_url=base + [a for a in v["files"] if a.endswith(".onnx")][0], - config_url=base + [a for a in v["files"] if a.endswith(".json")][0], + model_url=base + next(a for a in v["files"] if a.endswith(".onnx")), + config_url=base + next(a for a in v["files"] if a.endswith(".json")), display_name=display_name )

🧰 Tools

🪛 Ruff (0.14.1)

393-393: Prefer next(a for a in v["files"] if a.endswith(".onnx")) over single element slice

Replace with next(a for a in v["files"] if a.endswith(".onnx"))

(RUF015)

394-394: Prefer next(a for a in v["files"] if a.endswith(".json")) over single element slice

Replace with next(a for a in v["files"] if a.endswith(".json"))

(RUF015)

🤖 Prompt for AI Agents

In phoonnx/model_manager.py around lines 393 to 394, the code uses list indexing [0] on comprehensions which can raise IndexError; replace each with next((a for a in v["files"] if a.endswith(".onnx")), None) and next((a for a in v["files"] if a.endswith(".json")), None) respectively, then check for None and raise a clear exception or log an error before constructing model_url/config_url as base + filename so missing files are handled gracefully.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (5)

phoonnx/model_manager.py (5)

27-27: Display name formatting needs hardening; coerce phoneme_type from cache.

The .format() call can raise KeyError or ValueError if placeholders are missing or malformed. Additionally, phoneme_type loaded from cache may be a string and needs coercion to the PhonemeType enum before accessing .value.

This issue was previously flagged. The current implementation only catches AttributeError but should also handle KeyError and ValueError. Before formatting, check if phoneme_type is a string and coerce it to the enum type.

Also applies to: 56-67

380-400: Prefer next() over list indexing to avoid IndexError.

Lines 393-394 use list indexing [0] which can raise IndexError if no matching files are found.

Apply this diff:

                 voice = TTSModelInfo(
                     voice_id="piper_" + v["key"],
                     lang=standardize_tag(v["key"].split("-")[0]),
-                    model_url=base + [a for a in v["files"] if a.endswith(".onnx")][0],
-                    config_url=base + [a for a in v["files"] if a.endswith(".json")][0],
+                    model_url=base + next((a for a in v["files"] if a.endswith(".onnx")), ""),
+                    config_url=base + next((a for a in v["files"] if a.endswith(".json")), ""),
                     display_name=display_name
                 )
+                if not voice.model_url or not voice.config_url:
+                    LOG.warning(f"Missing files for {v['key']}, skipping")
+                    continue
                 self.add_voice(voice)

282-339: Bug: wrong lang fallback for Piper; prefer next() for file selection.

Two issues remain from previous review:

Line 310 uses voice_id.split('-')[0] which yields piper_pt_PT for IDs like piper_pt_PT-.... Use base_id.split('-')[0] instead.
Lines 316-317 use list indexing [0] which raises IndexError if no matching files exist. Prefer next(...) pattern.

Apply this diff:

             if engine == Engine.PIPER:
-                lang_code = model_data.get("language") or voice_id.split('-')[0] 
+                lang_code = model_data.get("language") or base_id.split('-')[0]
             else: # Mimic3 or Fallback
                 lang_code = model_data.get("language", "und")
 
             if engine == Engine.PIPER:
                 base = "https://huggingface.co/rhasspy/piper-voices/resolve/main/"
-                model_url = base + [a for a in model_data["files"] if a.endswith(".onnx")][0]
-                config_url = base + [a for a in model_data["files"] if a.endswith(".json")][0]
+                model_url = base + next((a for a in model_data["files"] if a.endswith(".onnx")), "")
+                config_url = base + next((a for a in model_data["files"] if a.endswith(".json")), "")
+                if not model_url or not config_url:
+                    LOG.error(f"Missing required files for {voice_id}")
+                    return None
                 tokens_url = None
                 phoneme_map_url = None

340-357: download_voice_by_id returns None; should return boolean and persist upstream voices.

The method returns the result of download_model() which is None, making it unclear if the download succeeded. Additionally, voices created from upstream sources are not added to self.voices or persisted to cache.

Apply this diff:

-    def download_voice_by_id(self, voice_id: str):
+    def download_voice_by_id(self, voice_id: str) -> bool:
         """
         Downloads the voice with the given ID. Searches in the cache first,
         otherwise generates the TTSModelInfo object via external sources.
         """
         model_info = self.voices.get(voice_id)
         
         if not model_info:
             # If the cache is empty or doesn't contain the ID, generate it from the source
             model_info = self._get_model_info_from_source(voice_id)
             
             if not model_info:
                 LOG.error(f"Cannot find or generate model information for ID: {voice_id}")
                 return False
+            # Add newly created voice to cache
+            self.add_voice(model_info)
 
-        # Call the existing download_model method, which expects a TTSModelInfo object
-        return model_info.download_model() # Use the method from TTSModelInfo
+        try:
+            # Download all components
+            model_info.download_config()
+            if model_info.tokens_url:
+                model_info.download_phoneme_map()
+            model_info.download_model()
+            return True
+        except requests.exceptions.RequestException as e:
+            LOG.error(f"Download failed for {voice_id}: {e}")
+            return False

369-369: Typo: extra closing parenthesis in display_name.

Remove the trailing ")" from the display_name string.

Apply this diff:

-            display_name="Proxectonos - Sabela)"
+            display_name="Proxectonos - Sabela"

🧹 Nitpick comments (1)

phoonnx/model_manager.py (1)
194-206: Consider moving return to else block for clarity.

The return voice_list_data statement at line 201 could be moved to an else block to make the control flow more explicit and align with static analysis recommendations.

Apply this diff:
         except requests.exceptions.RequestException as e:
             LOG.error(f"Error fetching the {source_name} voice list: {e}")
             return {}
+        else:
+            return voice_list_data
-        
-        return voice_list_data

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2267717 and 1f04111.

📒 Files selected for processing (1)

phoonnx/model_manager.py (11 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

phoonnx/model_manager.py (1)

phoonnx/config.py (2)

Engine (15-21)

PhonemeType (44-78)

🪛 Ruff (0.14.3)

phoonnx/model_manager.py

201-201: Consider moving this statement to an else block

(TRY300)

316-316: Prefer next(...) over single element slice

Replace with next(...)

(RUF015)

317-317: Prefer next(...) over single element slice

Replace with next(...)

(RUF015)

393-393: Prefer next(a for a in v["files"] if a.endswith(".onnx")) over single element slice

Replace with next(a for a in v["files"] if a.endswith(".onnx"))

(RUF015)

394-394: Prefer next(a for a in v["files"] if a.endswith(".json")) over single element slice

Replace with next(a for a in v["files"] if a.endswith(".json"))

(RUF015)

🔇 Additional comments (3)

phoonnx/model_manager.py (3)

256-280: LGTM!

The display name generation logic is well-structured and handles different engines appropriately with sensible fallbacks.

401-430: LGTM!

The integration of display_name generation for Mimic3 voices is well-implemented.

431-491: LGTM!

The display_name integration for OVOS and Phonikud voices is consistent and well-formatted.

timonvanhasselt added 2 commits October 25, 2025 11:47

Add command to list available voice IDs

ea67295

Added a new command to list available voice IDs (phoonnx-voices list-available)

coderabbitai Bot reviewed Oct 25, 2025

View reviewed changes

JarbasAl reviewed Oct 25, 2025

View reviewed changes

Comment thread phoonnx/model_manager.py Outdated

Translated comments

2267717

coderabbitai Bot reviewed Oct 25, 2025

View reviewed changes

Add UTF-8 encoding to model config file read

1f04111

coderabbitai Bot reviewed Nov 5, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Display name, voice ID list option#64

Add Display name, voice ID list option#64
timonvanhasselt wants to merge 4 commits into
TigreGotico:devfrom
timonvanhasselt:dev

timonvanhasselt commented Oct 25, 2025 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Oct 25, 2025 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Oct 25, 2025

Uh oh!

coderabbitai Bot Oct 25, 2025

Uh oh!

coderabbitai Bot Oct 25, 2025

Uh oh!

coderabbitai Bot Oct 25, 2025

Uh oh!

coderabbitai Bot Oct 25, 2025

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Oct 25, 2025

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		model_url=base + [a for a in v["files"] if a.endswith(".onnx")][0],
		config_url=base + [a for a in v["files"] if a.endswith(".json")][0],

Conversation

timonvanhasselt commented Oct 25, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Oct 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 25, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

timonvanhasselt commented Oct 25, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Oct 25, 2025 •

edited

Loading