-
So can I export a non-quantized tflite model at the moment?
-
Hello all,
PR #1921 is making some significant changes to which model types are supported by Coqui STT packages. The gist of the PR is that only TFLite model inputs will be supported. This means model files ending in `.pb` or `.pbmm` will not be loaded by 🐸STT after this PR and follow-up PRs are merged and the next version is released. If you're using a currently released version of 🐸STT, up until 0.10.0-alpha.10, nothing will change for you until you upgrade.

We're making this change because the protobuf (and memory-mapped protobuf) inputs are hacky at best (with TensorFlow v1) and deprecated/unsupported (with TensorFlow v2). They increase our maintenance burden, slow down builds (which slows down CI, which slows down development), force us to have a hacky build system integration into TensorFlow, and so on. On top of this, 🐸STT packages have always been focused on on-device, low latency speech recognition, with excellent support for low power devices with constrained resources. This means that the TFLite backend has also always been the better choice for how to deploy 🐸STT.
Here is a summary of main changes:
- Only TFLite model inputs will be supported (`.tflite` models).
- There will no longer be separate packages per model type (`stt`, `stt-tflite`, `stt-gpu`), but just a single `stt` package.
- Loading a protobuf (`.pb` or `.pbmm`) model will result in an error on the API level, which you can detect and handle (see the sketch below).

This change will bring lots of benefits to maintainability and speed of development for 🐸STT, and will also make it a bit easier for us to upgrade to TensorFlow 2, a long-standing issue which has been made harder to tackle due to supporting protobuf models, and finally it should also make it easier to contribute to the code base.
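To make that last point concrete, here is a minimal sketch of what detecting and handling the load error could look like from Python. It assumes the package is named `stt` and that a failed native load surfaces as a `RuntimeError`, as in the DeepSpeech-era bindings; check the exact behaviour of the release you're on.

```python
# Hedged sketch: handling the load failure for a no-longer-supported protobuf model.
# Assumes `from stt import Model` and that Model() raises RuntimeError when the
# underlying model creation fails -- verify against your installed version.
import sys

from stt import Model


def load_model(model_path: str) -> Model:
    try:
        return Model(model_path)
    except RuntimeError as err:
        # After the TFLite-only change, .pb/.pbmm files will end up here.
        if model_path.endswith((".pb", ".pbmm")):
            sys.exit(
                f"{model_path} is a protobuf model, which is no longer supported; "
                f"re-export it as a .tflite model. Original error: {err}"
            )
        raise


if __name__ == "__main__":
    load_model(sys.argv[1])
    print("Model loaded successfully.")
```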
If you have any questions or concerns, or if you think you have a use-case that can only be covered by the protobuf model formats, please post here so we can find a solution.
Best,
-- reuben
Q/A section:
Q: What about GPU-accelerated inference?
A: You can use the `coqui_stt_ctcdecoder` package to implement GPU-accelerated inference directly from model checkpoints. We have two places in our training code doing this, which you can use for inspiration: here and here. Improvements to make checkpoints simpler to load from an API/CLI/configuration perspective are more than welcome.
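As a rough illustration (not the project's actual code), the sketch below restores a training checkpoint with TensorFlow's v1 compatibility API and runs the acoustic model on whatever device TensorFlow places it on, typically the GPU. The checkpoint path, tensor names, and dummy feature shape are placeholders; in practice you would build the graph with the training code (as `evaluate.py` does) and decode the output with `coqui_stt_ctcdecoder`.

```python
# Hedged sketch: running the acoustic model directly from a checkpoint, with no
# .pb/.pbmm export step. Checkpoint path and tensor names below are assumptions
# for illustration; the real graph is built by the 🐸STT training code.
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

CHECKPOINT = "checkpoints/best_dev-12345"  # hypothetical checkpoint prefix

with tf.Session() as session:
    # Rebuild the graph from the checkpoint's meta file and restore the weights.
    saver = tf.train.import_meta_graph(CHECKPOINT + ".meta")
    saver.restore(session, CHECKPOINT)

    graph = tf.get_default_graph()
    features = graph.get_tensor_by_name("input_node:0")  # assumed tensor name
    logits = graph.get_tensor_by_name("logits:0")        # assumed tensor name

    # Dummy batch of MFCC-style features: (batch, time, n_features).
    dummy_batch = np.zeros((1, 100, 26), dtype=np.float32)
    acoustic_output = session.run(logits, feed_dict={features: dummy_batch})

    # Hand `acoustic_output` (per-timestep character probabilities) to the beam
    # search in coqui_stt_ctcdecoder, or a simple greedy CTC decode, to get text.
    print("Acoustic output shape:", acoustic_output.shape)
```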
Q: I use the Python code (e.g. `train.py`) for training and also use the Python code for inference (e.g. `evaluate.py`), without ever creating a protobuf (`.pb` or `.pbmm`) model. Does this affect me?
A: If you don't use protobuf files, then you probably only work with checkpoints, and these changes shouldn't affect you.