whisper-api: Asynchronous Speech-to-Text for Emacs

whisper-api is a lightweight asynchronous client for OpenAI’s Whisper API.

Installation

Make sure you have ffmpeg installed on your system. For example, on Ubuntu:

$ sudo apt update && sudo apt install ffmpeg

Using straight.el and general.el (if you use them), add this to your configuration:

(use-package whisper-api
  :straight (:host github :repo "ileixe/whisper-api")
  :config
  (setq whisper-api-openai-token "YOUR-API-KEY") ;; or you can uses .authinfo file
  :general
  ;; Global bindings for normal, visual and insert states
  (:states '(normal visual insert)
           "C-c w" 'whisper-api-record-dwim
           "C-c c" 'whisper-api-cancel)
  ;; Bindings for the minibuffer-keymaps so that they also work inside the minibuffer.
  ;; (:keymaps '(minibuffer-local-map minibuffer-local-ns-map)
  ;;          "C-c w" 'whisper-api-record-dwim
  ;;          "C-c c" 'whisper-api-cancel)
  )

Features

Non-blocking recording and transcription.
Graceful cancellation of recording or pending API requests.
Supports reading the API key from the user variable or from authinfo.
Customizable ffmpeg command (default uses PulseAudio, but can be adjusted to ALSA or other sources).

Usage

Set your API key, for example in your init file:
(setq whisper-api-openai-token "sk-...")

If not set, the package will fallback on .authinfo file:

machine api.openai.com login apikey password YOUR_API_KEY
To start recording, run:
M-x whisper-api-record-dwim

The temporary file path is displayed in the minibuffer.
To stop recording (which sends the audio file to the API asynchronously), run:
M-x whisper-api-record-dwim (again)

The transcription is then inserted into your current buffer.
To cancel an active recording or pending API call, run:
M-x whisper-api-cancel

Customization

whisper-api-ffmpeg-command
The format string used to invoke ffmpeg. By default it is:

ffmpeg -y -t %d -f pulse -i default -ar 16000 %s

You can modify this if you use another input device; for instance, if you prefer ALSA use:

(setq whisper-api-ffmpeg-command "ffmpeg -y -t %d -f alsa -i default -ar 16000 %s")
whisper-api-base-url
The base URL for the Whisper API. By default it is set to OpenAI’s endpoint:

https://api.openai.com/v1/audio/transcriptions

To use a local inference endpoint or alternative service, set this value accordingly. For example:

(setq whisper-api-base-url "http://localhost:8000/my-endpoint")
The package prints debug messages in the Messages buffer, which can help diagnose issues with the recording or API call.

Enjoy!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.org

README.org

whisper-api: Asynchronous Speech-to-Text for Emacs

Installation

Features

Usage

Customization

Files

README.org

Latest commit

History

README.org

File metadata and controls

whisper-api: Asynchronous Speech-to-Text for Emacs

Installation

Features

Usage

Customization