Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream API issues with threads #17

Open
parasyte opened this issue Jan 25, 2023 · 0 comments
Open

Stream API issues with threads #17

parasyte opened this issue Jan 25, 2023 · 0 comments

Comments

@parasyte
Copy link

parasyte commented Jan 25, 2023

There are a few problems. First, the threads example has been broken since 8de3237:

error[E0308]: mismatched types
  --> examples\threads.rs:54:37
   |
54 |     let stream = Stream::from_model(Arc::clone(&model)).expect("failed to create stream");
   |                  ------------------ ^^^^^^^^^^^^^^^^^^
   |                  |                  |
   |                  |                  expected `&mut Model`, found struct `Arc`
   |                  |                  help: consider mutably borrowing here: `&mut Arc::clone(&model)`
   |                  arguments to this function are incorrect
   |
   = note: expected mutable reference `&mut Model`
                         found struct `Arc<Model>`
note: associated function defined here
  --> C:\Users\jay\other-projects\coqui-stt\src\stream.rs:38:12
   |
38 |     pub fn from_model(model: &'a mut Model) -> crate::Result<Stream<'a>> {
   |            ^^^^^^^^^^

The commit fixes a UB problem, which is great, but it also creates a new one: Stream takes an exclusive reference to Model, but Model cannot outlive the stack frame without heap allocating and leaking the box.

So, i.e., this will work, but it is not ideal:

diff --git a/examples/threads.rs b/examples/threads.rs
index f592442..63edc79 100644
--- a/examples/threads.rs
+++ b/examples/threads.rs
@@ -9,7 +9,6 @@ use std::env::args;
 use std::fs::File;
 use std::path::Path;
 use std::sync::mpsc::channel;
-use std::sync::Arc;

 fn main() {
     // this is copied and pasted from the basic_usage example
@@ -40,6 +39,7 @@ fn main() {
     }

     let mut m = Model::new(model_name.to_str().expect("invalid utf-8 found in path")).unwrap();
+    let sample_rate = m.get_sample_rate();
     // enable external scorer if found in the model folder
     if let Some(scorer) = scorer_name {
         let scorer = scorer.to_str().expect("invalid utf-8 found in path");
@@ -48,10 +48,7 @@ fn main() {
     }

     // create a Stream
-    // wrap the Model in an Arc: note this makes the model immutable forever, so do any changes to its options before doing this!
-    let model = Arc::new(m);
-    // create the Stream
-    let stream = Stream::from_model(Arc::clone(&model)).expect("failed to create stream");
+    let stream = Stream::from_model(Box::leak(Box::new(m))).expect("failed to create stream");
     // you can do this construction anywhere
     // here, we'll do it in the main thread and send the stream to a background thread

@@ -76,7 +73,7 @@ fn main() {

     let src_sample_rate = desc.sample_rate();
     // keep in mind this is in an Arc, so this is immutable now
-    let dest_sample_rate = model.get_sample_rate() as u32;
+    let dest_sample_rate = sample_rate as u32;
     // Obtain the buffer of samples
     let mut audio_buf: Vec<_> = if src_sample_rate == dest_sample_rate {
         reader.samples().map(|s| s.unwrap()).collect()

Another way to do this without leaking is moving the Model into the thread first, and then creating the stream. But this does not work with the cpal API, which uses callbacks (much to my chagrin):

    // TODO: We have to leak the model to extend its lifetime to 'static.
    // It's the only way to satisfy the `coqui-stt` API with cpal callbacks.
    let mut stream = Stream::from_model(Box::leak(model)).unwrap();

    device
        .build_input_stream(
            &config,
            move |data: &[i16], _: &InputCallbackInfo| {
                stream.feed_audio(data);

                if let Ok(text) = stream.intermediate_decode() {
                    tx.try_send(text).unwrap();
                }
            },
            move |err| {
                panic!("cpal Error: {err}");
            },
        )
        .unwrap();

I should also mention yet another way to address this without leaking is with more channels. One Sender/Receiver pair for piping the cpal input stream to a thread that owns the Model and Stream. And a second channel pair to pipe the transcribed text to its final destination.

If possible, Stream could own the Model instead of borrowing.


Unfortunately, the transcription quality is not great with coqui, so I don't think we can use it. But since I noticed this issue while evaluating, I figured I would report my findings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant