fix: auto-download dicta ONNX model for Hebrew diacritization #512

Open
voidborne-d wants to merge 1 commit into resemble-ai:master from voidborne-d:fix/hebrew-diacritization-model-path

Conversation

@voidborne-d

Summary

Fixes #467 — Hebrew diacritization silently fails because add_hebrew_diacritics() calls Dicta() with no arguments, but dicta_onnx.Dicta.__init__ requires a model_path argument.

Problem

dicta_onnx.Dicta requires a path to a separately-downloaded ONNX model file (~300 MB). The current code:

_dicta = Dicta()  # TypeError: missing required argument 'model_path'

The TypeError is caught by the except Exception handler and logged as a warning. Hebrew TTS receives bare consonants with no niqqud, producing gibberish speech output.
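
To make the failure mode concrete, here is a minimal sketch that reproduces it outside the project. `StubDicta` and `add_diacritics_buggy` are hypothetical stand-ins, not the real `dicta_onnx.Dicta` or the project's `add_hebrew_diacritics()`; the only property they share with the real code is a constructor that requires `model_path` and a broad `except Exception` around the call.

```python
import logging

logger = logging.getLogger("diacritics_demo")


class StubDicta:
    """Hypothetical stand-in for dicta_onnx.Dicta; only the required argument matters."""

    def __init__(self, model_path):
        self.model_path = model_path


def add_diacritics_buggy(text):
    """Mimic the bug: the constructor error is caught and only logged."""
    try:
        StubDicta()  # raises TypeError because model_path is missing
    except Exception as exc:
        # The broad handler turns a programming error into a warning.
        logger.warning("Hebrew diacritization failed: %s", exc)
    return text  # the caller gets the input back unchanged, with no niqqud
```

Because the function still returns a string, nothing downstream can tell that diacritization never ran, which is why the problem only shows up as bad audio.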

Fix

  1. Auto-download model: New _get_dicta_model_path() function auto-downloads the int8 ONNX model from the official dicta-onnx GitHub release on first use, caching it under $XDG_CACHE_HOME/chatterbox/dicta/ (or ~/.cache/chatterbox/dicta/)
  2. Env-var override: DICTA_MODEL_PATH lets users point to a local .onnx file (useful for airgapped/Docker environments)
  3. Atomic write: Uses tmpfile + os.replace so a partial download never poisons the cache
  4. Better warnings: Tell users exactly how to install dicta-onnx or set the env-var, instead of a generic 'failed' message
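
The first three points can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `MODEL_URL` and `MODEL_FILENAME` are placeholders (the actual release URL and file name are not given in this description), and the function names and the injectable `fetch` parameter are hypothetical.

```python
import os
import tempfile
import urllib.request
from pathlib import Path

# Placeholders: the real release URL and file name are not stated here.
MODEL_URL = "https://example.invalid/dicta-int8.onnx"
MODEL_FILENAME = "dicta-int8.onnx"


def cache_dir() -> Path:
    """$XDG_CACHE_HOME/chatterbox/dicta, falling back to ~/.cache/chatterbox/dicta."""
    base = os.environ.get("XDG_CACHE_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache"
    )
    return Path(base) / "chatterbox" / "dicta"


def download_atomically(url: str, dest: Path, fetch=urllib.request.urlopen) -> None:
    """Write to a temp file in the same directory, then rename into place."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp, fetch(url) as response:
            while chunk := response.read(1 << 20):
                tmp.write(chunk)
        # os.replace is atomic on the same filesystem, so dest is either
        # absent or complete, never half-written.
        os.replace(tmp_name, dest)
    except BaseException:
        # Remove the partial file so it cannot be mistaken for a cached model.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def get_model_path(fetch=urllib.request.urlopen) -> str:
    """Env-var override first, then the cache, then a first-use download."""
    override = os.environ.get("DICTA_MODEL_PATH")
    if override:
        return override
    dest = cache_dir() / MODEL_FILENAME
    if not dest.exists():
        download_atomically(MODEL_URL, dest, fetch=fetch)
    return str(dest)
```

Creating the temp file inside the cache directory (rather than the system temp directory) keeps the rename on one filesystem, which is what makes `os.replace` atomic here.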

Tests

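17 regression tests in tests/test_hebrew_diacritization.py covering env-var override, cache creation, download, cache hit, partial-download cleanup, passing model_path to Dicta(), missing-dependency warnings, a source-code audit (no bare Dicta() calls), and a full round-trip integration.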

All tests are lightweight (no GPU, no real model download, no torch/torchaudio dependency).

python3 -m pytest tests/test_hebrew_diacritization.py -v
17 passed in 0.07s
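
The commit message lists a source-code audit (no bare `Dicta()` calls) among the tests. One way such a check could work, assuming it parses the module source with `ast`, is sketched below; `find_bare_dicta_calls` is a hypothetical helper and may not match how the actual test is written.

```python
import ast


def find_bare_dicta_calls(source: str) -> list[int]:
    """Return the line numbers where Dicta is called with no arguments."""
    bare = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both `Dicta()` and `dicta_onnx.Dicta()`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name == "Dicta" and not node.args and not node.keywords:
            bare.append(node.lineno)
    return sorted(bare)
```

A regression test could then read the module's source from disk and assert that the returned list is empty, which guards against the original bug being reintroduced without needing the model or the `dicta_onnx` package installed.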

…le-ai#467)

`add_hebrew_diacritics()` called `Dicta()` with no arguments, but
`dicta_onnx.Dicta.__init__` requires a `model_path` argument pointing
to an ONNX model file.  The resulting `TypeError` was swallowed by the
`except Exception` handler, so Hebrew TTS silently received un-voweled
text and produced gibberish speech output.

Changes:
- Add `_get_dicta_model_path()` that auto-downloads the int8 ONNX model
  (~300 MB) from the official dicta-onnx GitHub release on first use,
  caching it under `$XDG_CACHE_HOME/chatterbox/dicta/` (or
  `~/.cache/chatterbox/dicta/`)
- Support `DICTA_MODEL_PATH` env-var to override auto-download with a
  local .onnx file (for airgapped/Docker environments)
- Atomic write (tmpfile + os.replace) prevents partial downloads from
  poisoning the cache
- Improved warning messages: tell users exactly how to install dicta-onnx
  or set the env-var, instead of a generic 'failed' message
- 17 regression tests covering env-var override, cache creation, download,
  cache hit, partial cleanup, model_path passing to Dicta(), missing
  dependency warnings, source-code audit (no bare Dicta() calls), and
  full round-trip integration

Development

Successfully merging this pull request may close these issues.

Multilingual Hebrew diacritization silently fails — dicta_onnx called without required model_path
