media-to-whisper.cpp-subtitles

Converting user-specified media files to subtitles using the Whisper.cpp utility.

https://gitlab.com/brlin/media-to-whisper.cpp-subtitles

Prerequisites

You need to have the following software installed and its commands available in the command search PATHs:

  • GNU core utilities
    For determining the absolute path of the utility and the number of threads available for the subtitle inference.

  • whisper.cpp
    For inferring the subtitles from the media's audio tracks.

    By default, the utility uses the unofficial snap distribution of Whisper.cpp.

  • FFmpeg
    For converting the input media into formats that can be consumed by whisper.cpp.
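A quick way to confirm these prerequisites before running the utility is sketched below. It is not part of the project: `check_commands` is a hypothetical helper, and the command names `realpath` and `nproc` (both from the GNU core utilities) and `ffmpeg` are assumptions about which executables the utility looks up. `whisper-cpp-main` is the documented default of the WHISPERCPP_MAIN environment variable.

```shell
# Hypothetical helper (not shipped with the utility): report which of the
# given commands can be found in the command search PATHs.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf 'found:   %s\n' "$cmd"
        else
            printf 'missing: %s\n' "$cmd" >&2
            missing=1
        fi
    done
    # Return non-zero when at least one command is missing.
    return "$missing"
}

# Assumed command names; adjust them to match your installation.
check_commands realpath nproc ffmpeg "${WHISPERCPP_MAIN:-whisper-cpp-main}" \
    || echo "Install the missing commands before running the utility." >&2
```

The helper only checks that each command can be resolved. It does not verify versions or that a whisper.cpp model file is present.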

Usage

Refer to the following instructions to use this application:

  1. Download the application's release package from the Releases page.

  2. Extract the downloaded application release package using your preferred archive manipulation utility.

  3. Launch your preferred text terminal emulator application.

  4. Review the "Environment variables that can change the utility's behaviors" section for settings you may want to adjust, then run the utility with the following command:

    _ENV_VAR_NAME1_=_env_var_value1_ _ENV_VAR_NAME2_=_env_var_value2_... \
        /path/to/media-to-whisper.cpp-subtitles/transcribe-media-to-subtitles.sh \
        _input_media_1_ _input_media_2_...

    The generated subtitle files will be saved in the same directory as the input files.

Environment variables that can change the utility's behaviors

The following environment variables change the utility's behaviors. Use them to tune the utility for your workload:

GGML_MODEL

The whisper.cpp model file used to infer the subtitles.

Supported values:

  • (Absolute path of the model file you want to use)
  • (Relative path of the model file you want to use)

Default value: ggml-medium.bin

TRANSCRIBE_THREADS

The number of threads used for the subtitle transcription task.

Supported values:

Default value: auto

TRANSCRIBE_THREADS_NEGATIVE_OFFSET

When the TRANSCRIBE_THREADS environment variable is set to auto, this environment variable determines the negative offset applied to the detected thread count, so that users can adapt the transcribe thread count to their system's optimal settings.

Supported values:

(A non-negative number that will be deducted from the detected thread count available to the process)

Default value:

1 (Deduct one from the detected available thread count; on a system with 8 CPU threads in total, the transcribe thread count will be 7)
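The arithmetic behind the auto setting can be sketched as follows. This is an illustration under stated assumptions, not the utility's actual code: `auto_transcribe_threads` is a hypothetical function name, and clamping the result to a minimum of one thread is an assumption that the source does not confirm.

```shell
# Hypothetical sketch of the "auto" thread count calculation.
# $1: detected thread count available to the process
# $2: negative offset (defaults to 1, the documented default)
auto_transcribe_threads() {
    available="$1"
    offset="${2:-1}"

    threads=$((available - offset))

    # Assumption: never go below one thread, even with a large offset.
    if [ "$threads" -lt 1 ]; then
        threads=1
    fi

    printf '%s\n' "$threads"
}

# The documented example: 8 CPU threads with the default offset of 1.
auto_transcribe_threads 8 1   # prints 7
```

On a real system the first argument could come from `nproc` (GNU core utilities), which prints the number of processing units available to the current process, for example `auto_transcribe_threads "$(nproc)" "${TRANSCRIBE_THREADS_NEGATIVE_OFFSET:-1}"`.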

WHISPERCPP_MAIN

Specify the base command of the Whisper.cpp main program. This environment variable allows users to use a Whisper.cpp distribution other than the snap.

Supported values:

  • The path of a valid Whisper.cpp main program.
  • The base command of a valid Whisper.cpp main program, if it is in the command search PATHs.

Default value: whisper-cpp-main

This is the main app command of the unofficial snap distribution.
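Both kinds of supported value, a path and a bare command name, can be resolved the same way with the POSIX `command -v` builtin. The sketch below shows one way to do that. `resolve_whisper_main` is a hypothetical helper, and whether the utility itself validates the value like this is an assumption.

```shell
# Hypothetical helper: resolve the Whisper.cpp main program from either a
# path or a command name, falling back to the documented default.
resolve_whisper_main() {
    candidate="${1:-whisper-cpp-main}"

    # command -v accepts both a path to an executable and a command name
    # that is looked up in the command search PATHs.
    if resolved="$(command -v "$candidate")"; then
        printf '%s\n' "$resolved"
    else
        printf 'Whisper.cpp main program not found: %s\n' "$candidate" >&2
        return 1
    fi
}

# Use WHISPERCPP_MAIN when it is set, otherwise the default command.
resolve_whisper_main "${WHISPERCPP_MAIN:-}" || true
```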

Licensing

Unless otherwise noted (in an individual file's header or in REUSE.toml), this product is licensed under version 3.0 of the GNU Affero General Public License, or any more recent version of your preference.

This work complies with the REUSE Specification. Refer to the REUSE - Make licensing easy for everyone website for information regarding the licensing of this product.