Transcribe Offline is an open‑source desktop application for on‑device transcription with speaker diarisation (assigning who said what) and word‑level alignment for crisp subtitles. It also includes local LLM helpers for summarisation, translation, and light editing — all without sending audio to remote servers by default.
Transcribe Offline is designed to be local‑first. However, no software can guarantee absolute privacy or security, so please consider your threat model and institutional policies before processing sensitive material.
- Offline by default. Process recordings on your own machine — no accounts or routine uploads.
- Predictable costs. Open source and free to use — no per‑minute API bills; compute runs on your CPU/GPU.
- Research‑friendly. Reproducible settings and versioned models help you document methods for scholarly work.
- Flexible exports. Clean transcripts, subtitles, and data‑friendly formats that drop into existing workflows.
Built on OpenAI Whisper, producing robust, punctuated transcripts across diverse accents and domains.
WhisperX and open‑source audio‑labelling pipelines refine timestamps down to each word, enabling:
- Frame‑accurate SRT/VTT with natural line breaks
- Click‑to‑play navigation (double‑click a line to jump audio)
- Playback‑synced highlighting that follows words in real time
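As a rough illustration of how word‑level timings can become subtitle cues, here is a minimal Python sketch. It is not the application's own exporter: the word dictionary keys (`text`, `start`, `end`), the function names, and the `max_chars` line limit are assumptions made for this example, and the line breaking is a simple character count rather than anything linguistically aware.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_chars=42):
    """Group timed words into cues of at most max_chars and render SRT text.

    `words` is a list of dicts with hypothetical keys: text, start, end.
    """
    cues, current = [], []
    for word in words:
        candidate = " ".join(w["text"] for w in current + [word])
        # Start a new cue when adding this word would exceed the limit.
        if current and len(candidate) > max_chars:
            cues.append(current)
            current = []
        current.append(word)
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        start = srt_timestamp(cue[0]["start"])
        end = srt_timestamp(cue[-1]["end"])
        text = " ".join(w["text"] for w in cue)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)
```

Each cue starts at its first word's start time and ends at its last word's end time, which is why word‑level alignment matters for tight subtitles.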
pyannote separates speakers and assigns consistent labels, producing:
- Speaker‑attributed paragraphs (Speaker 01/02/…)
- When alignment is available, speaker labels at the word level (excellent for subtitles)
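One common way to combine diarisation with aligned words is to give each word the speaker whose turn overlaps it most. The sketch below illustrates that idea only; it is not taken from the application or from pyannote, and the dictionary keys (`start`, `end`, `speaker`, `text`) and the function name are assumptions for this example.

```python
def assign_speakers(words, turns):
    """Label each word with the speaker turn that overlaps it the most.

    `words`: dicts with hypothetical keys text, start, end.
    `turns`: dicts with hypothetical keys start, end, speaker.
    Words that overlap no turn get speaker None.
    """
    labelled = []
    for word in words:
        best_label, best_overlap = None, 0.0
        for turn in turns:
            # Length of the time interval shared by the word and the turn.
            overlap = min(word["end"], turn["end"]) - max(word["start"], turn["start"])
            if overlap > best_overlap:
                best_label, best_overlap = turn["speaker"], overlap
        labelled.append({**word, "speaker": best_label})
    return labelled
```

Words that fall in overlapping speech or between turns are where this kind of rule is least reliable, so they are worth checking by hand.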
Run a small local LLM (Qwen 3) for:
- Summaries of long sessions
- Draft translations (quality varies by language)
- Grammar and punctuation clean‑ups
- Custom prompts to “talk to your transcript” for outlines, show notes, or action lists
- Readable transcripts: TXT with timestamps and Speaker 01/02 labels
- Subtitles: SRT / VTT
- Data‑friendly formats: CSV / JSON for analysis and search
- Editor quality‑of‑life: playback‑synced highlighting; double‑click any line to jump; edits autosave alongside your audio
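To show what "data‑friendly" can mean in practice, here is a small Python sketch that serialises speaker‑labelled segments to CSV and JSON using only the standard library. The column names (`start`, `end`, `speaker`, `text`) are assumptions for illustration and may not match the files the application actually writes.

```python
import csv
import io
import json

FIELDS = ["start", "end", "speaker", "text"]  # hypothetical column layout


def segments_to_csv(segments):
    """Return the segments as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(segments)
    return buffer.getvalue()


def segments_to_json(segments):
    """Return the segments as pretty-printed JSON, keeping non-ASCII text."""
    return json.dumps(segments, ensure_ascii=False, indent=2)
```

CSV suits spreadsheets and qualitative coding tools, while JSON keeps the structure intact for scripts and search indexes.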
Researchers, lecturers, and professional practitioners who need dependable, local transcription:
- Qualitative & ethnographic research: interviews, focus groups, field recordings
- Lecture capture & seminars: searchable notes and teaching materials
- Media & podcasts: subtitle passes and episode notes
- Linguistics & HCI: fine‑grained timing for annotation workflows
- Transcription: Whisper models
- Alignment: WhisperX + open‑source audio labelling for word‑level timings (English)
- Diarisation: pyannote pipelines for speaker turns (labels such as Speaker 01/02/…)
- Local LLM: Qwen 3 for summarise/translate/correct/custom prompts
- Designed for modern desktops and laptops.
- Memory: ~16 GB RAM is a sensible baseline for comfortable use on macOS and should be treated as a minimum on Windows machines.
- macOS project page: https://github.com/openresearchtools/transcribeoffline/tree/main/macOS
- Windows project page: https://github.com/openresearchtools/transcribeoffline/tree/main/Windows
- Ubuntu Linux project page: https://github.com/openresearchtools/transcribeoffline/tree/main/Linux
If you’re evaluating for institutional use, test with non‑sensitive audio first, then review logs/exports with your IT or data guardian.
- Add audio/video. Drag files in and queue for processing.
- Choose tasks. Transcribe; optionally enable diarisation and word‑alignment.
- (Optional) Apply local LLM helpers. Summarise, translate, or tidy the prose.
- Export. Save TXT/JSON/CSV or SRT/VTT and drop them straight into your analysis or editing workflow.
- Navigate. Double‑click any line to jump playback; highlighting follows the audio.
- Alignment: word‑level alignment is currently only available in English.
- Diarisation accuracy: depends on audio quality and speaker overlap; manual review is recommended for high‑stakes use.
- LLM helpers: outputs may contain mistakes; treat them as drafting aids rather than ground truth.
- Accents & domains: Whisper is strong overall, but niche jargon or heavy code‑switching may need light edits.
We welcome issues and pull requests. For sizeable changes, please open an issue first to discuss what you’d like to add or modify. Bug reports that include logs, hardware specs, OS details, and a small sample help us reproduce problems quickly.
Released under the MIT Licence. Third‑party models and tools are subject to their own licences and terms — please review the linked model pages.
Suggested citation
Rutkauskas, L. (2025). Transcribe Offline (Version 1.2) [Computer software]. openresearchtools.com. https://github.com/openresearchtools/transcribeoffline. MIT Licence. Released 25 September 2025.
BibTeX
@software{Rutkauskas_TranscribeOffline_2025,
author = {Rutkauskas, L.},
title = {Transcribe Offline},
version = {1.2},
date = {2025-09-25},
url = {https://github.com/openresearchtools/transcribeoffline},
publisher = {openresearchtools.com},
license = {MIT}
}
Built as an open tool for research and teaching. Stars, issues, and community examples are very welcome.
