Speech Note 4.5.0

@mkiol mkiol released this 18 May 16:49

Linux Desktop

Changes:

  • User Interface
    • Import subtitles embedded in a video file. If your video file contains one or more subtitle streams, you can import the selected subtitles into the notepad.
    • Support for more subtitle formats. You can import and export subtitles in SRT, WebVTT and ASS formats.
    • Unified file importing and exporting. Text, subtitles, audio and video files can be imported or exported using a unified menu bar option.
    • Settings option to enable/disable remembering the last note. If the option is disabled, the last note will not be available after restarting the app.
    • Settings option for the default action when importing a note from a file. You can set Ask whether to add or replace, Add to an existing note or Replace an existing note.
    • Enhanced text editor font settings. You can set the font family, style and size of the font used in the text editor.
    • Text to Text repair options. With these options you can directly fix diacritical marks and punctuation in the text.
    • Text context menu with additional options: Read selection and Translate selection. To open the context menu, right-click the text.
    • New text appending style: After empty line
    • System tray menu for changing active STT/TTS model
    • User-friendly names of audio input devices
    • Simplified model filtering. It is now less flexible, but much easier to understand and use.
    • Speech Note has been translated into Ukrainian and Russian languages.
    • Fix: Cancellation was blocking the user interface.
  • Speech to Text
    • Updated Distil model for English: Distil Large-v3. The new model is enabled for the Whisper and Faster Whisper engines.
    • New Fine-Tuned Whisper models for Slovenian and Polish
    • Fix: Punctuation model could not be downloaded.
  • Text to Speech
    • WhisperSpeech engine that generates voice with exceptional naturalness. The new engine comes with models for English and Polish languages. All models support voice cloning.
    • New voice cloning model for Vietnamese: viXTTS. The model is a fine-tuned version of the phenomenal Coqui XTTS.
    • New Piper voices for English, Persian, Slovenian, Turkish, French and Spanish
    • New RHVoice voice for Czech
    • Settings option to enable/disable speech synchronization with subtitle timestamps. This may be useful for creating voice overs.
    • Mixing speech with audio from an existing file. When exporting to a file, you can overlay speech with audio from an existing media file. This can be useful when creating voice overs from subtitles.
    • Context menu option to read from the cursor position or read only the selected text. To open the context menu, right-click the text.
    • Speech audio is always normalized after TTS processing.
    • Fix: Mimic3 models could not be downloaded.
  • Translator
    • New models: Greek to English, Maltese to English, Slovenian to English, Turkish to English, English to Catalan
    • Updated models: Czech and Lithuanian
    • Handy buttons to quickly add translated text to the note or to replace it and switch languages
    • Context menu option to translate from the cursor position or translate only the selected text. To open the context menu, right-click the text.
  • Accessibility
    • New actions for switching STT/TTS models: switch-to-next-stt-model, switch-to-prev-stt-model, switch-to-next-tts-model, switch-to-prev-tts-model, set-stt-model, set-tts-model
    • New global keyboard shortcuts for switching STT/TTS models (X11 only): Switch to next STT model, Switch to prev STT model, Switch to next TTS model, Switch to prev TTS model
    • Toggle option for keyboard shortcuts (X11 only). When Toggle behavior is enabled, Start listening/reading shortcuts will also stop listening/reading if they are triggered while listening/reading is active.
    • Fix: Accented characters (e.g. ã, ê) were not transferred correctly to the active window.
  • Flatpak
    • Flatpak runtime update to version 5.15-23.08
    • AMD ROCm update to version 5.7.3
    • PyTorch update to version 2.2.1
    • CTranslate2 update to version 4.2.1
    • Faster-Whisper update to version 1.0.2

A video demonstration of all the changes in 4.5.0: https://www.youtube.com/watch?v=S9MJ7y8-bcw
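
The speech synchronization option listed under Text to Speech can be pictured with a small sketch. This is not Speech Note's implementation; it only illustrates the general idea under the assumption that each subtitle cue is synthesized into its own audio clip and that silence is inserted so every clip starts at its cue's timestamp. The names `Cue`, `parse_srt` and `plan_silence` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List

# SRT timing line, e.g. "00:00:01,000 --> 00:00:02,500"
TIME_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str


def to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert hour/minute/second/millisecond fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(srt: str) -> List[Cue]:
    """Parse SRT text into cues. Cues are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIME_RE.search(line)
            if match:
                g = match.groups()
                text = " ".join(lines[i + 1:]).strip()
                cues.append(Cue(to_ms(*g[:4]), to_ms(*g[4:]), text))
                break
    return cues


def plan_silence(cues: List[Cue], clip_ms: List[int]) -> List[int]:
    """Silence (ms) to insert before each clip so it starts at its cue time.

    If an earlier clip runs past the next cue's start, the gap is 0 and
    the next clip simply starts late (an assumption of this sketch).
    """
    gaps = []
    cursor = 0  # current position in the output audio, in ms
    for cue, duration in zip(cues, clip_ms):
        gap = max(0, cue.start_ms - cursor)
        gaps.append(gap)
        cursor += gap + duration
    return gaps
```

For example, with cues starting at 1.0 s and 4.0 s and synthesized clips lasting 1.2 s and 0.9 s, `plan_silence` yields gaps of 1000 ms and 1800 ms, so the second clip begins exactly at 4.0 s.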

Sailfish OS

Changes:

  • User Interface
    • Import subtitles in many formats and subtitles embedded in a video file. You can import and export subtitles in SRT, WebVTT and ASS formats. If your video file contains one or more subtitle streams, you can import the selected subtitles into the notepad.
    • Unified file importing and exporting. Text, subtitles, audio and video files can be imported or exported using a unified pull-down menu option.
    • Settings option to enable/disable remembering the last note. If the option is disabled, the last note will not be available after restarting the app.
    • Settings option for the default action when importing a note from a file. You can set Ask whether to add or replace, Add to an existing note or Replace an existing note.
    • New text appending style: After empty line
    • Speech Note has been translated into Ukrainian and Russian languages.
    • Fix: Cancellation was blocking the user interface.
  • Speech to Text
    • Subtitles support in STT. To generate timestamped text in SRT format, change the text format to SRT Subtitles using the button at the bottom of the text area. Check the settings to find more subtitle options.
  • Text to Speech
    • Speech synchronized with subtitle timestamps in TTS. When the text format is set to SRT Subtitles, the generated speech will be synchronized with the subtitle timestamps. This can be useful if you want to make a voice-over.
    • New Piper voices for English, Persian, Slovenian, Turkish, French and Spanish
    • New RHVoice voice for Czech
    • Settings option to enable/disable speech synchronization with subtitle timestamps.
    • Speech audio is always normalized after TTS processing.
  • Translator
    • New models: Greek to English, Maltese to English, Slovenian to English, Turkish to English, English to Catalan
    • Updated models: Czech and Lithuanian
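
For readers unfamiliar with the SRT format mentioned under Speech to Text, the sketch below shows what "timestamped text in SRT format" looks like. It is only an illustration of the file format, not the app's code; it assumes the recognizer yields `(start_ms, end_ms, text)` segments, and the function names are hypothetical.

```python
from typing import Iterable, Tuple


def format_timestamp(ms: int) -> str:
    """Render milliseconds as an SRT timestamp: HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Tuple[int, int, str]]) -> str:
    """Turn (start_ms, end_ms, text) segments into SRT text.

    Each cue is a 1-based index, a timing line and the text,
    with cues separated by a blank line.
    """
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}")
    return "\n\n".join(blocks) + "\n"
```

Calling `segments_to_srt([(0, 1500, "Hello"), (2000, 3250, "world")])` produces two numbered cues whose timing lines read `00:00:00,000 --> 00:00:01,500` and `00:00:02,000 --> 00:00:03,250`.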