
Running inference over a large batch of audio files #22

Closed
@vmedappa

Description

Hi! Firstly, thank you so much for this incredible work!

I have been running the tiny.en model on a large number of wav files stored in a folder. I am currently parallelizing the work across a multi-core machine using GNU parallel, running the following command:

find input_data/eng_wav_data -name "*.wav" | parallel 'time ./main -m models/ggml-tiny.en.bin -nt -f {} -t 1 > {.}.txt'

I found that the model is currently reloaded for every wav file that gets transcribed. Is there a way I can circumvent this and load the model only once? Any help would be appreciated. Thank you, and apologies if this issue has already been resolved.
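
For reference, one way to avoid the repeated load is to drive the whisper.cpp C API directly: initialize the model once, then reuse the same context in a loop over files. The sketch below is a minimal illustration, not code from this issue; it assumes the whisper.cpp C API (whisper_init_from_file, whisper_full, etc., whose names can differ between versions) and a hypothetical load_wav_f32 helper that decodes a wav file into 16 kHz mono float PCM, as the repo's examples do with dr_wav.

```c
// Sketch: load the model once, transcribe every file given on the command line.
#include <stdio.h>
#include <stdlib.h>
#include "whisper.h"

// Hypothetical helper (not part of whisper.cpp): decode a WAV file into
// 16 kHz mono float samples, returning the buffer and sample count.
extern float * load_wav_f32(const char * path, int * n_samples);

int main(int argc, char ** argv) {
    // Load the model a single time, up front.
    struct whisper_context * ctx = whisper_init_from_file("models/ggml-tiny.en.bin");
    if (!ctx) {
        return 1;
    }

    struct whisper_full_params params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY);
    params.n_threads = 1; // matches the -t 1 from the shell command above

    // Reuse the same context for every input file.
    for (int i = 1; i < argc; ++i) {
        int n_samples = 0;
        float * pcm = load_wav_f32(argv[i], &n_samples);
        if (!pcm) {
            continue;
        }

        if (whisper_full(ctx, params, pcm, n_samples) == 0) {
            for (int s = 0; s < whisper_full_n_segments(ctx); ++s) {
                printf("%s", whisper_full_get_segment_text(ctx, s));
            }
            printf("\n");
        }
        free(pcm);
    }

    whisper_free(ctx);
    return 0;
}
```

With this approach the model load is amortized over all files a process handles; combined with GNU parallel, the file list could be split into chunks so each worker pays the load cost once per chunk rather than once per file.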
