Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync/tts cache #15

Merged
merged 1 commit into from
Nov 18, 2021
Merged

Sync/tts cache #15

merged 1 commit into from
Nov 18, 2021

Conversation

JarbasAl
Copy link
Member

@JarbasAl JarbasAl commented Nov 17, 2021

port the new handling of TTS cache from mycroft-core

I am not exactly a fan of the new approach (1 class was enough.....) but for backwards compat everything was ported, I slightly augmented the api (and hopefully usefulness) of the new classes and removed everything mimic2 specific

Cleaned up the TTS class as much as possible, _execute was getting huge so i split it into logical smaller functions. Previously existing cache methods were marked for deprecation in mycroft-core but here are integrated with the new code and won't be deprecated

Also ported the RemoteTTS class from ovos-core, it is not very useful but allows for full deprecation of tts module in ovos-core

Creating the persistent cache files is out of scope and depends on the engine

companion PR OpenVoiceOS/ovos-core#16

port and improve the new TTS cache from mycroft-core

refactor the playback thread to make it more readable

solve TODO for pause/resume functionality
@JarbasAl JarbasAl merged commit 4592888 into master Nov 18, 2021
JarbasAl added a commit to OpenVoiceOS/ovos-core that referenced this pull request Nov 18, 2021
bumps ovos plugin manager to 0.0.3a1 and removes duplicted code

latest mycroft-core tts cache implementation ported in OpenVoiceOS/ovos-plugin-manager#15

psutil is no longer a mandatory requirement

authored-by: jarbasai <jarbasai@mailfence.com>
Copy link
Member

@NeonDaniel NeonDaniel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left comments, but they are probably more applicable to the Mycroft implementation than this PR. Will likely implement PlaybackThread differently in Neon

try:
if len(self._now_playing) == 5:
# new mycroft style
snd_type, data, visemes, ident, listen = self._now_playing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this is a tuple and not a dict or another more explicit structure?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its what mycroft uses and we need backwards compat, i can improve the code but the user facing api can't change only be augmented

self.lang = lang or config.get("lang") or 'en-us'
self.config = config or {}
self.validator = validator or TTSValidator(self)
self.phonetic_spelling = phonetic_spelling
self.audio_ext = audio_ext
self.ssml_tags = ssml_tags or []
self.log_timestamps = self.config.get("log_timestamps", False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What timestamps? Could this be annotated or renamed for clarity?

Copy link
Member Author

@JarbasAl JarbasAl Nov 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the default metrics handler, all TTS plugins have this option

see these debug logs for example (debugging TTS not happening, note the timestamps in playback step)

2021-11-17 20:42:08.042 - OVOS - mycroft.audio.speech:mute_and_speak:130 - INFO - Speak: Please wait a moment as I finish booting up.
2021-11-17 20:42:08.043 - OVOS - ovos_plugin_manager.templates.tts:handle_metric:249 - DEBUG - time delta: 0 metric: {'metric_type': 'tts.ssml.validated'}
2021-11-17 20:42:08.049 - OVOS - ovos_plugin_manager.templates.tts:handle_metric:249 - DEBUG - time delta: 0.00035881996154785156 metric: {'metric_type': 'tts.preprocessed', 'n_chunks': 1}
2021-11-17 20:42:08.050 - OVOS - ovos_plugin_manager.templates.tts:handle_metric:249 - DEBUG - time delta: 0.0017445087432861328 metric: {'metric_type': 'tts.synth.start'}
2021-11-17 20:42:08.242 - OVOS - ovos_plugin_manager.templates.tts:handle_metric:249 - DEBUG - time delta: 0.194044828414917 metric: {'metric_type': 'tts.synth.finished'}
2021-11-17 20:42:08.243 - OVOS - ovos_plugin_manager.templates.tts:handle_metric:249 - DEBUG - time delta: 0.19539237022399902 metric: {'metric_type': 'tts.synth.cached'}
2021-11-17 20:42:08.246 - OVOS - ovos_plugin_manager.templates.tts:handle_metric:249 - DEBUG - time delta: 0.19644379615783691 metric: {'metric_type': 'tts.queued'}
2021-11-17 20:42:08.249 - OVOS - ovos_plugin_manager.templates.tts:handle_metric:249 - DEBUG - time delta: 0.20165657997131348 metric: {'metric_type': 'tts.start'}
2021-11-17 20:42:08.254 - OVOS - ovos_plugin_manager.templates.tts:handle_metric:249 - DEBUG - time delta: 0.20674896240234375 metric: {'metric_type': 'tts.end'}

self.handle_metric({"metric_type": "tts.queued"})

def _determine_ext(self, audio_file):
# determine audio_ext on the fly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A potentially better approach here is to read it from file headers in case the plugin incorrectly specifies (i.e. a remote TTS implementation changes)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont think plugins should specify it at all to be honest, but that needs a larger refactor. and mycroft is also talking about the same so lets wait and see.... mycroft intends to deprecate allowing TTS to return file path, so there will be breakage either way....

the aim here is to allow plugins to return different extensions per request, i had this happening when i added a permanent cache to polly that was in .wav format, but the plugin returns mp3.

except:
return self.audio_ext

def _synth(self, sentence, sentence_hash=None, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method annotation?

@JarbasAl JarbasAl deleted the sync/tts_cache branch February 3, 2022 00:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants