media-to-whisper.cpp-subtitles

Transcribe user-specified media files into subtitles using the Whisper.cpp utility.

https://gitlab.com/brlin/media-to-whisper.cpp-subtitles

Prerequisites

You need to have the following software installed, with its commands available in the command search PATHs:

GNU core utilities
For determining the absolute path of the utility and the number of threads available for subtitle inference.
whisper.cpp
For inferring the subtitles from the media's audio tracks.

By default it uses the unofficial snap distribution of Whisper.cpp.
FFmpeg
For converting the input media into formats that can be consumed by whisper.cpp.
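A quick way to confirm the prerequisites before running the utility is to look up each command in the command search PATHs. The following is a minimal sketch, not part of the application: the function name `find_missing_commands` is hypothetical, and the command names `realpath`, `nproc`, and `ffmpeg` are assumptions about which commands the utility calls.

```shell
# Print each given command that cannot be found in the command search PATHs,
# one per line. Returns non-zero if at least one command is missing.
find_missing_commands() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "${cmd}" >/dev/null 2>&1; then
            printf '%s\n' "${cmd}"
            missing=1
        fi
    done
    return "${missing}"
}

# Assumed command names: realpath and nproc (GNU core utilities), ffmpeg,
# and whisper-cpp-main (the documented default of WHISPERCPP_MAIN).
if ! missing_commands="$(find_missing_commands realpath nproc ffmpeg "${WHISPERCPP_MAIN:-whisper-cpp-main}")"; then
    printf 'Missing prerequisite commands:\n%s\n' "${missing_commands}" >&2
fi
```

If nothing is printed, all of the checked commands were found.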

Usage

Refer to the following instructions to use this application:

Download the application's release package from the Releases page.
Extract the downloaded application release package using your preferred archive manipulation utility.
Launch your preferred text terminal emulator application.
Run the utility with the following command, optionally prefixing it with any of the environment variables described in the Environment variables that can change the utility's behaviors section:
```
_ENV_VAR_NAME1_=_env_var_value1_ _ENV_VAR_NAME2_=_env_var_value2_... \
    /path/to/media-to-whisper.cpp-subtitles/transcribe-media-to-subtitles.sh \
    _input_media_1_ _input_media_2_...
```
The generated subtitle files will be saved in the same directory as the input files.

Environment variables that can change the utility's behaviors

The following environment variables change the utility's behavior. Use them to tune the utility for your workload:

GGML_MODEL

The whisper.cpp model file used to infer the subtitles.

Supported values:

(Absolute path of the model file you want to use)
(Relative path of the model file you want to use)

Default value: ggml-medium.bin

TRANSCRIBE_THREADS

The number of threads used for the subtitle transcription task.

Supported values:

auto: Automatically determine the optimal transcribe thread count, taking the TRANSCRIBE_THREADS_NEGATIVE_OFFSET environment variable into consideration
(A natural number specifying the thread count)

Default value: auto

TRANSCRIBE_THREADS_NEGATIVE_OFFSET

When the TRANSCRIBE_THREADS environment variable is set to auto, this environment variable determines the negative offset applied to the transcribe thread count, allowing users to adapt it to their system's optimal settings.

Supported values:

(A non-negative number that will be deducted from the detected thread count available to the process)

Default value:

1 (Deduct one from the detected available thread count; on a system with 8 total CPU threads, the optimal transcribe thread count will be determined to be 7)
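The interaction between TRANSCRIBE_THREADS and TRANSCRIBE_THREADS_NEGATIVE_OFFSET can be sketched as a small shell function. This is an illustration under stated assumptions, not the utility's actual code: the function name `resolve_transcribe_threads` is hypothetical, and clamping the result to a minimum of one thread is an assumption that the documentation above does not confirm.

```shell
# Resolve the effective transcribe thread count.
# $1: TRANSCRIBE_THREADS value ("auto" or a natural number)
# $2: TRANSCRIBE_THREADS_NEGATIVE_OFFSET value (a non-negative integer)
# $3: detected available thread count
resolve_transcribe_threads() {
    if [ "$1" != auto ]; then
        # A user-specified count is used as-is; the offset is ignored.
        printf '%s\n' "$1"
    else
        resolved=$(($3 - $2))
        # Assumption: never drop below one thread.
        if [ "${resolved}" -lt 1 ]; then
            resolved=1
        fi
        printf '%s\n' "${resolved}"
    fi
}

# The documented example: 8 available threads with the default offset of 1.
resolve_transcribe_threads auto 1 8    # prints 7
```

On a real system the third argument could come from a command such as `nproc` from the GNU core utilities, which is likewise an assumption about how the utility detects the available threads.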

WHISPERCPP_MAIN

Specifies the base command of the Whisper.cpp main program. This environment variable allows users to use a Whisper.cpp distribution other than the snap.

Supported values:

The path of a valid Whisper.cpp main program.
The base command of a valid Whisper.cpp main program, if it is in the command search PATHs.

Default value: whisper-cpp-main

Uses the unofficial snap distribution's main app command.
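To see which settings would take effect for a given environment, the documented defaults can be expressed with shell parameter expansion. This is a sketch only: the function name `print_effective_settings` is hypothetical, and treating an empty variable the same as an unset one is an assumption about the utility's behavior.

```shell
# Print the effective value of each documented environment variable,
# falling back to the documented default when it is unset or empty.
print_effective_settings() {
    printf 'GGML_MODEL=%s\n' "${GGML_MODEL:-ggml-medium.bin}"
    printf 'TRANSCRIBE_THREADS=%s\n' "${TRANSCRIBE_THREADS:-auto}"
    printf 'TRANSCRIBE_THREADS_NEGATIVE_OFFSET=%s\n' "${TRANSCRIBE_THREADS_NEGATIVE_OFFSET:-1}"
    printf 'WHISPERCPP_MAIN=%s\n' "${WHISPERCPP_MAIN:-whisper-cpp-main}"
}

print_effective_settings
```

Prefixing the call with assignments, in the same way as the utility's own command line, shows the overridden values instead of the defaults.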

Licensing

Unless otherwise noted (in an individual file's header or in REUSE.toml), this product is licensed under version 3.0 of the GNU Affero General Public License, or any more recent version of your preference.

This work complies with the REUSE Specification. Refer to the REUSE - Make licensing easy for everyone website for information regarding the licensing of this product.
