A macOS app that simply transcribes what you hear in system audio and displays it in the window. No more, no less.

kkudumu/simpletranscribe

simpletranscribe

A minimal macOS menu bar app that transcribes system audio in real time using Apple’s Speech framework (SFSpeechRecognizer) and ScreenCaptureKit for audio capture. The app provides a simple window to view the live transcript, a Start/Stop control, an “Always on Top” toggle, and menu bar access.

Features

  • Real-time transcription of system audio (not just microphone)
  • Menu bar control to Start/Stop and bring window to front
  • “Always on Top” toggle to keep the transcript window floating
  • Partial results appear continuously; finalized segments are indicated in the status

Requirements

  • macOS 14.0+ (Sonoma)
  • Xcode 15+ (recommended)
  • The app requires the following permissions:
    • Screen Recording (to capture system audio via ScreenCaptureKit)
    • Speech Recognition (for SFSpeechRecognizer)

Note: Some locales require a network connection for recognition unless on-device models are available. This app defaults to en-US.

Getting Started

Open and run in Xcode

  1. Open simpletranscribe.xcodeproj in Xcode.
  2. Select the simpletranscribe scheme and a “My Mac” destination.
  3. In Signing & Capabilities, select your Development Team if prompted.
  4. Build and run (⌘R).
  5. On first run, macOS will prompt for permissions:
    • Allow Screen Recording for the app (System Settings → Privacy & Security → Screen Recording).
    • Allow Speech Recognition (System Settings → Privacy & Security → Speech Recognition).
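
For readers curious how these two prompts are typically triggered, here is a minimal sketch, not the project's actual code. `PermissionState` and `requestPermissions` are hypothetical names introduced for illustration. It uses `CGPreflightScreenCaptureAccess`, `CGRequestScreenCaptureAccess`, and `SFSpeechRecognizer.requestAuthorization`, and assumes the app's Info.plist already contains an `NSSpeechRecognitionUsageDescription` entry, which the Speech framework requires before it will show its prompt.

```swift
import CoreGraphics
import Foundation
import Speech

/// Hypothetical summary type for this sketch; not part of the project.
struct PermissionState {
    var screenRecordingGranted: Bool
    var speechStatus: SFSpeechRecognizerAuthorizationStatus

    /// True only when both permissions listed above are in place.
    var isReady: Bool { screenRecordingGranted && speechStatus == .authorized }
}

/// Asks macOS for both permissions and reports the outcome on the main queue.
func requestPermissions(completion: @escaping (PermissionState) -> Void) {
    // Preflight checks silently; the request call asks macOS to prompt if needed.
    let screenGranted = CGPreflightScreenCaptureAccess() || CGRequestScreenCaptureAccess()

    SFSpeechRecognizer.requestAuthorization { status in
        // The handler may run on a background queue, so hop to main for UI work.
        DispatchQueue.main.async {
            completion(PermissionState(screenRecordingGranted: screenGranted,
                                       speechStatus: status))
        }
    }
}
```

A caller might start capture only when `isReady` is true and otherwise show a status message pointing to System Settings.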

Using the app

  • Click the menu bar item (waveform icon) to Start/Stop transcription, toggle “Always on Top,” or show the main window.
  • Press Space to Start/Stop from the main window.
  • The transcript appears in the scrollable text area.
  • Status messages (e.g., “Listening…”, errors, or “Finalized segment”) appear in the UI and the menu bar menu.
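
The controls described above could be modeled roughly as follows. This is a sketch under assumptions: `TranscriptControls` and `TranscriptMenu` are hypothetical names, and the project's ContentViewModel and ContentView may be organized differently. It shows a Start/Stop button bound to Space and an "Always on Top" toggle mapped to an `NSWindow.Level`.

```swift
import AppKit
import SwiftUI

/// Hypothetical state object for this sketch; the project's ContentViewModel may differ.
final class TranscriptControls: ObservableObject {
    @Published var isRunning = false
    @Published var alwaysOnTop = false

    /// Label for the single Start/Stop control.
    var startStopTitle: String { isRunning ? "Stop" : "Start" }

    /// Window level matching the toggle: .floating stays above normal windows.
    var windowLevel: NSWindow.Level { alwaysOnTop ? .floating : .normal }

    func toggleRunning() { isRunning.toggle() }
}

/// Control content that could be reused in the main window or a menu bar menu.
struct TranscriptMenu: View {
    @ObservedObject var controls: TranscriptControls

    var body: some View {
        Button(controls.startStopTitle) { controls.toggleRunning() }
            .keyboardShortcut(.space, modifiers: []) // plain Space, no modifier keys
        Toggle("Always on Top", isOn: $controls.alwaysOnTop)
    }
}

// Possible use inside an App's body (macOS 13+):
// MenuBarExtra("simpletranscribe", systemImage: "waveform") {
//     TranscriptMenu(controls: controls)
// }
```

Applying `windowLevel` to the transcript window (for example `window.level = controls.windowLevel`) is left to whichever code owns the `NSWindow`.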

Configuration

  • Locale: The recognizer defaults to en-US. You can change the locale by editing the TranscriptionService initialization in ContentViewModel.
  • On-device recognition: The code currently sets requiresOnDeviceRecognition = false. Change this if you want to force on-device only where available.
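
As an illustration of the two settings above, the sketch below builds a recognizer and a request from a small settings value. `RecognitionSettings`, `makeRecognizer`, and `makeRequest` are hypothetical names, not the project's API; only `SFSpeechRecognizer`, `SFSpeechAudioBufferRecognitionRequest`, `requiresOnDeviceRecognition`, and `supportsOnDeviceRecognition` come from the Speech framework.

```swift
import Foundation
import Speech

/// Hypothetical settings type mirroring the README defaults.
struct RecognitionSettings {
    var localeIdentifier = "en-US"
    var forceOnDevice = false
}

/// Returns nil when the Speech framework does not support the locale.
func makeRecognizer(_ settings: RecognitionSettings) -> SFSpeechRecognizer? {
    SFSpeechRecognizer(locale: Locale(identifier: settings.localeIdentifier))
}

/// Builds a streaming request that honors the on-device preference where possible.
func makeRequest(_ settings: RecognitionSettings,
                 recognizer: SFSpeechRecognizer) -> SFSpeechAudioBufferRecognitionRequest {
    let request = SFSpeechAudioBufferRecognitionRequest()
    request.shouldReportPartialResults = true
    // Force on-device only if the recognizer reports support for this locale.
    request.requiresOnDeviceRecognition =
        settings.forceOnDevice && recognizer.supportsOnDeviceRecognition
    return request
}
```

Falling back to server recognition when on-device support is missing is one possible design choice; an app that must never send audio off the device would instead refuse to start.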

How it works (high level)

  • AudioCaptureManager uses ScreenCaptureKit to start a stream that captures system audio (excluding the app’s own audio) and forwards CMSampleBuffers.
  • TranscriptionService wraps SFSpeechRecognizer, feeding it audio buffers and publishing partial/final text updates.
  • ContentViewModel wires capture → transcription, exposes state to SwiftUI, and handles Start/Stop.
  • ContentView and MenuBarExtra provide the minimal UI.
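
The capture → transcription flow above can be sketched in a single class. This is a condensed illustration, not the project's AudioCaptureManager or TranscriptionService: `SystemAudioTranscriber` and `onText` are hypothetical names, error handling is minimal, and it assumes the recognizer accepts the stream's audio sample buffers directly (a real implementation may need to convert the audio format first).

```swift
import CoreMedia
import Foundation
import ScreenCaptureKit
import Speech

/// Hypothetical all-in-one sketch of system audio capture feeding speech recognition.
final class SystemAudioTranscriber: NSObject, SCStreamOutput {
    private let recognizer: SFSpeechRecognizer
    private let request = SFSpeechAudioBufferRecognitionRequest()
    private var task: SFSpeechRecognitionTask?
    private var stream: SCStream?
    private let audioQueue = DispatchQueue(label: "sketch.audio.capture")

    /// Called with the latest transcript text and whether the segment is final.
    var onText: ((String, Bool) -> Void)?

    init?(localeIdentifier: String = "en-US") {
        guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: localeIdentifier))
        else { return nil } // locale not supported
        self.recognizer = recognizer
        super.init()
    }

    func start() async throws {
        // Capturing a display is what makes system audio available to the stream.
        let content = try await SCShareableContent.current
        guard let display = content.displays.first else { return }
        let filter = SCContentFilter(display: display, excludingWindows: [])

        let config = SCStreamConfiguration()
        config.capturesAudio = true
        config.excludesCurrentProcessAudio = true // skip the app's own audio

        let stream = SCStream(filter: filter, configuration: config, delegate: nil)
        try stream.addStreamOutput(self, type: .audio, sampleHandlerQueue: audioQueue)

        request.shouldReportPartialResults = true
        task = recognizer.recognitionTask(with: request) { [weak self] result, _ in
            guard let result else { return }
            self?.onText?(result.bestTranscription.formattedString, result.isFinal)
        }

        try await stream.startCapture()
        self.stream = stream
    }

    func stop() async throws {
        try await stream?.stopCapture()
        request.endAudio() // lets the recognizer finalize the last segment
        stream = nil
    }

    // SCStreamOutput: forward each audio buffer to the recognition request.
    func stream(_ stream: SCStream,
                didOutputSampleBuffer sampleBuffer: CMSampleBuffer,
                of type: SCStreamOutputType) {
        guard type == .audio, sampleBuffer.isValid else { return }
        request.appendAudioSampleBuffer(sampleBuffer)
    }
}
```

Splitting capture and recognition into separate types, as the project does, keeps each piece easier to test and swap out.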

Troubleshooting

  • No audio or text is appearing:
    • Ensure you granted Screen Recording and Speech Recognition permissions.
    • Quit and relaunch the app after changing permissions in System Settings.
  • “Speech recognition not authorized”:
    • System Settings → Privacy & Security → Speech Recognition → enable for this app.
  • Build/Signing errors:
    • Open the project in Xcode, go to the simpletranscribe target → Signing & Capabilities, and set your Development Team.
  • Locale issues or low accuracy:
    • Try a different locale identifier (e.g., en-GB, fr-FR) in the TranscriptionService initializer.
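
When diagnosing locale problems, it can help to ask the Speech framework what it supports on the current machine. The helpers below are a hypothetical debugging aid, not part of the project; they rely on `SFSpeechRecognizer.supportedLocales()`, `isAvailable`, and `supportsOnDeviceRecognition`.

```swift
import Foundation
import Speech

/// Sorted identifiers of every locale the Speech framework reports as supported.
func supportedLocaleIdentifiers() -> [String] {
    SFSpeechRecognizer.supportedLocales().map(\.identifier).sorted()
}

/// One-line summary of what a recognizer for the given locale can do right now.
func describeRecognizer(localeIdentifier: String) -> String {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: localeIdentifier)) else {
        return "\(localeIdentifier): not supported"
    }
    let availability = recognizer.isAvailable ? "available" : "currently unavailable"
    let onDevice = recognizer.supportsOnDeviceRecognition ? "on-device capable" : "server only"
    return "\(localeIdentifier): \(availability), \(onDevice)"
}
```

Printing `describeRecognizer(localeIdentifier:)` for the locale in use shows whether the recognizer is unsupported, temporarily unavailable, or limited to server recognition, which narrows down whether a network connection is needed.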

Development Notes

  • This project uses SwiftUI and Apple frameworks only (no external dependencies).
  • Item.swift includes a placeholder SwiftData model that isn’t currently used by the app logic.
  • ScreenCaptureKit requires macOS 12.3 or later, and its audio capture requires macOS 13 or later; this project targets macOS 14+.

Roadmap (ideas)

  • Export/save transcripts
  • Microphone-only mode option
  • Choose capture source (specific app/window)
  • Configurable locale and on-device-only toggle in UI

License

Add a license of your choice (e.g., MIT, Apache 2.0) before publishing.
