Merged
47 changes: 44 additions & 3 deletions AGENTS.md
@@ -8,7 +8,13 @@ Current architecture:

- Global shortcut handling goes through `KGlobalAccel`
- Audio capture uses Qt Multimedia
- Transcription is in-process through vendored `whisper.cpp`
- Transcription goes through an app-owned runtime seam with explicit runtime
selection
- A product-owned native CPU reference runtime scaffold now exists alongside
the legacy whisper adapter
- `whisper.cpp` is still the only real end-user speech decoder today, but the
vendored runtime is now optional at build time through
`MUTTERKEY_ENABLE_LEGACY_WHISPER=OFF`
- Native Mutterkey model packages are now the canonical model artifact; raw
whisper.cpp-compatible `.bin` files remain only as a migration/import path
- The public runtime seam is streaming-first through app-owned chunks, events, and compatibility helpers
@@ -22,7 +28,9 @@ Current architecture:
This repository is intentionally kept minimal:

- CMake is the only supported build system
- `whisper.cpp` is the only supported transcription backend
- `whisper.cpp` remains a vendored legacy backend, but new runtime ownership
work should prefer the product-owned native CPU path and selector/model-loader
seams first
- Keep the repo free of generated build output
- Keep publication-facing files free of machine-specific paths and broken local links
- Do not reintroduce legacy qmake or external-command transcription paths unless explicitly requested
@@ -37,11 +45,14 @@ This repository is intentionally kept minimal:
- `src/clipboardwriter.*`: clipboard integration, preferring KDE system clipboard support
- `src/audio/recordingnormalizer.*`: conversion to runtime-ready mono `float32` at `16 kHz`
- `src/transcription/audiochunker.*`: deterministic chunking of normalized audio for the streaming runtime path
- `src/transcription/cpureferencemodel.*`: product-owned native CPU reference model header/parser and immutable model-handle loading
- `src/transcription/cpureferencetranscriber.*`: native CPU reference runtime scaffold behind the app-owned engine/session seam
- `src/transcription/modelpackage.*`: product-owned manifest and validated package value types
- `src/transcription/modelvalidator.*`: package integrity, compatibility, and bounds validation
- `src/transcription/modelcatalog.*`: model artifact inspection and resolution
- `src/transcription/rawwhisperprobe.*`: lightweight raw whisper.cpp header inspection used for migration compatibility
- `src/transcription/rawwhisperimporter.*`: import path from raw Whisper `.bin` files into native Mutterkey packages
- `src/transcription/runtimeselector.*`: app-owned runtime-selection policy and diagnostic reasoning
- `src/transcription/transcriptassembler.*`: final transcript assembly from streaming transcript events
- `src/transcription/transcriptioncompat.*`: compatibility wrapper that routes one-shot recordings through the streaming runtime seam
- `src/transcription/whispercpptranscriber.*`: in-process Whisper integration and whisper-specific engine construction
@@ -86,6 +97,13 @@ cmake -S . -B "$BUILD_DIR" -G Ninja
cmake --build "$BUILD_DIR" -j"$(nproc)"
```

To validate the native-runtime-only path without vendored `whisper.cpp` /
`ggml`, configure with:

```bash
cmake -S . -B "$BUILD_DIR" -G Ninja -DMUTTERKEY_ENABLE_LEGACY_WHISPER=OFF
```

If a sandboxed build fails with `ccache: error: Read-only file system`, treat
that as an environment limitation rather than a repo regression and rerun the
build with `CCACHE_DISABLE=1`.
@@ -136,6 +154,9 @@ Notes:
- Use `bash scripts/check-release-hygiene.sh` when touching publication-facing files such as `README.md`, licenses, `contrib/`, CI, or helper scripts
- Use `cmake --build "$BUILD_DIR" --target docs` when touching repo-owned public headers, Doxygen config, the Doxygen main page, or CI/docs wiring
- If install rules or licensing files change, confirm the temporary install contains the expected files under `share/licenses/mutterkey`
- If a task changes runtime selection, native model loading, or legacy-whisper
build toggles, validate at least one `MUTTERKEY_ENABLE_LEGACY_WHISPER=OFF`
build in addition to the normal default build
- If you add or change public methods in repo-owned headers, expect `cmake --build "$BUILD_DIR" --target docs` to fail until the new API is documented; treat that as part of the normal implementation loop, not follow-up polish
- Newly added repo-owned public structs and free functions in public headers also
need Doxygen comments immediately; the `docs` target treats undocumented new
@@ -158,6 +179,12 @@ Notes:
- When validating inside a restricted sandbox, be ready to disable `ccache` with `CCACHE_DISABLE=1` if the cache location is read-only; that is an execution-environment issue, not a Mutterkey build failure
- Prefer fixing the code over weakening `.clang-tidy` or the Clazy check set; only relax tool config when the warning is clearly low-value for this repo
- If `clang-tidy` flags a new small enum for `performance-enum-size`, prefer an explicit narrow underlying type such as `std::uint8_t` instead of suppressing the warning
- If `clang-tidy` flags a small fixed binary header type, prefer
`std::array<std::byte, N>` or `std::array<char, N>` plus value
initialization over C-style arrays
- When helper functions take two adjacent same-shaped parameters such as two
`QString` values, prefer a small request struct when that keeps tests and
runtime code from tripping `bugprone-easily-swappable-parameters`
- In this Qt-heavy repo, treat `misc-include-cleaner` and `readability-redundant-access-specifiers` as low-value `clang-tidy` noise unless the underlying tool behavior improves; they conflict with Qt header-provider reality and `signals` / `slots` / `Q_SLOTS` sectioning more than they improve safety
- Prefer anonymous-namespace `Q_LOGGING_CATEGORY` for file-local logging categories; `Q_STATIC_LOGGING_CATEGORY` is not portable enough across the Qt versions this repo may build against
- Do not add broad Valgrind suppressions by default; only add narrow suppressions after reproducing stable third-party noise and keep them clearly scoped
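The `clang-tidy` notes in this hunk (a narrow enum type, `std::array` for fixed binary headers, a request struct for adjacent same-typed parameters) could look roughly like the following. This is a minimal sketch, not code from the repository: `RuntimeKind`, `ExampleModelHeader`, `ImportRequest`, and `describeImport` are hypothetical names, and `std::string` stands in for `QString` so the snippet compiles without Qt.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical enum: an explicit narrow underlying type addresses
// performance-enum-size without suppressing the check.
enum class RuntimeKind : std::uint8_t {
    NativeCpuReference,
    LegacyWhisper,
};

// Hypothetical fixed-size binary header: std::array members with value
// initialization replace C-style arrays, so every element starts at zero.
struct ExampleModelHeader {
    std::array<char, 4> magic{};
    std::array<std::byte, 12> reserved{};
};

// Hypothetical request struct: named fields replace two adjacent
// string parameters that bugprone-easily-swappable-parameters would flag.
struct ImportRequest {
    std::string sourcePath;
    std::string destinationPath;
};

// Takes one struct instead of (source, destination), so call sites cannot
// silently swap the two paths.
inline std::string describeImport(const ImportRequest &request)
{
    return request.sourcePath + " -> " + request.destinationPath;
}
```

The request struct costs a few extra lines at the call site, but it makes the role of each path visible in tests as well as in runtime code.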
@@ -183,7 +210,15 @@ Notes:
- Keep JSON and other transport details at subsystem boundaries; prefer typed C++ snapshots/results once data crosses into app-owned control, tray, or service code
- Prefer dependency injection for tray-shell and control-surface code from the first implementation so headless Qt tests stay simple
- When preparing the transcription path for future runtime work, prefer app-owned engine/session seams and injected sessions over leaking concrete backend types into CLI, service, or worker orchestration. Keep immutable capability reporting on the engine side, keep runtime inspection data in `RuntimeDiagnostics`, and keep the session side focused on mutable decode state, warmup, chunk ingestion, finish, and cancellation
- Prefer product-owned runtime interfaces, model/session separation, and deterministic backend selection before adding new inference backends or widening cross-platform support
- Prefer product-owned runtime interfaces, model/session separation, explicit
runtime-selection policy, and deterministic backend selection before adding
new inference backends or widening cross-platform support
- Keep runtime-selection policy in `src/transcription/runtimeselector.*`
instead of burying compatibility/fallback rules inside
`createTranscriptionEngine()`
- Keep native model-format parsing and immutable model loading in
`src/transcription/cpureferencemodel.*` or similar app-owned loader code
rather than mixing artifact parsing into the mutable session implementation
- Keep model validation, metadata extraction, and compatibility checks app-owned.
`whisper.cpp` should not be the first component that tells Mutterkey whether a
model artifact is obviously malformed, incompatible, or oversized
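To illustrate what "app-owned validation before `whisper.cpp`" might mean in practice, here is a minimal sketch of a header and size check that runs before any backend sees the artifact. Everything in it is an assumption made for illustration: the magic value `MKEX`, the size limit, and the names `ArtifactCheck` and `checkArtifact` are hypothetical and do not describe the real Mutterkey package format or the API in `modelvalidator.*`.

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical result type: a verdict plus a human-readable reason.
struct ArtifactCheck {
    bool ok;
    std::string reason;
};

// Hypothetical constants; the real format and limits are not shown in this diff.
inline constexpr std::string_view kExampleMagic = "MKEX";
inline constexpr std::uint64_t kExampleMaxBytes = 4ULL * 1024 * 1024 * 1024;

// Rejects obviously malformed or oversized artifacts using only the leading
// bytes and the file size, without loading the model into any runtime.
inline ArtifactCheck checkArtifact(std::string_view leadingBytes, std::uint64_t fileSize)
{
    if (leadingBytes.size() < kExampleMagic.size()) {
        return {false, "truncated header"};
    }
    if (leadingBytes.substr(0, kExampleMagic.size()) != kExampleMagic) {
        return {false, "unexpected magic"};
    }
    if (fileSize > kExampleMaxBytes) {
        return {false, "artifact exceeds example size limit"};
    }
    return {true, "header accepted"};
}
```

Returning a reason string alongside the verdict keeps the diagnostic in app-owned code, which is the point of the guideline above.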
@@ -218,6 +253,9 @@ Apply the C++ Core Guidelines selectively and pragmatically. For this repo, the
- Prefer resolving model-package, metadata, and import work entirely in app-owned
code. Raw whisper.cpp `.bin` support is now a compatibility/import concern, not
the canonical product contract
- Prefer treating `whisper.cpp` as a legacy migration/parity dependency from
here forward. If new work can land in app-owned selector, model-loader,
native-runtime, or package code instead, do that first
- Prefer keeping fake runtime tests and app-owned helpers free of vendored whisper linkage unless the test is specifically about the whisper adapter or engine factory
- Prefer fixing vendored target metadata from the top-level CMake when the issue is Mutterkey packaging or warning noise, instead of patching upstream vendored files directly
- If you must modify vendored code, document why in the final response and record the deviation in `third_party/whisper.cpp.UPSTREAM.md`
@@ -233,6 +271,9 @@ Apply the C++ Core Guidelines selectively and pragmatically. For this repo, the
separate release asset outside Git
- Do not introduce machine-specific home-directory paths, absolute local Markdown links, or generated build artifacts into tracked files
- If a task changes install layout or shipped assets, keep the CMake install rules and license installs aligned with the new behavior
- If a task changes whether legacy whisper support is installed, keep
`README.md`, `RELEASE_CHECKLIST.md`, `docs/mainpage.md`, install rules, and
license installs aligned with that choice
- The installed shared-library payload is runtime-focused; do not start installing vendored upstream public headers unless the package contract intentionally changes

## Config Expectations
94 changes: 58 additions & 36 deletions CMakeLists.txt
@@ -15,6 +15,7 @@ set(CMAKE_CXX_EXTENSIONS OFF)

option(MUTTERKEY_ENABLE_ASAN "Enable AddressSanitizer for repo-owned code and vendored whisper.cpp" OFF)
option(MUTTERKEY_ENABLE_UBSAN "Enable UndefinedBehaviorSanitizer for repo-owned code and vendored whisper.cpp" OFF)
option(MUTTERKEY_ENABLE_LEGACY_WHISPER "Build the legacy whisper.cpp runtime for migration and parity validation" ON)
option(MUTTERKEY_ENABLE_WHISPER_CUDA "Enable whisper.cpp CUDA backend support (NVIDIA)" OFF)
option(MUTTERKEY_ENABLE_WHISPER_VULKAN "Enable whisper.cpp Vulkan backend support" OFF)
option(MUTTERKEY_ENABLE_WHISPER_BLAS "Enable whisper.cpp BLAS CPU acceleration" OFF)
@@ -47,6 +48,10 @@ set(MUTTERKEY_CORE_SOURCES
src/transcription/transcriptionengine.h
src/transcription/audiochunker.cpp
src/transcription/audiochunker.h
src/transcription/cpureferencemodel.cpp
src/transcription/cpureferencemodel.h
src/transcription/cpureferencetranscriber.cpp
src/transcription/cpureferencetranscriber.h
src/transcription/modelcatalog.cpp
src/transcription/modelcatalog.h
src/transcription/modelpackage.cpp
@@ -57,16 +62,23 @@ set(MUTTERKEY_CORE_SOURCES
src/transcription/rawwhisperimporter.h
src/transcription/rawwhisperprobe.cpp
src/transcription/rawwhisperprobe.h
src/transcription/runtimeselector.cpp
src/transcription/runtimeselector.h
src/transcription/transcriptassembler.cpp
src/transcription/transcriptassembler.h
src/transcription/transcriptioncompat.cpp
src/transcription/transcriptioncompat.h
src/transcription/transcriptionworker.cpp
src/transcription/transcriptionworker.h
src/transcription/whispercpptranscriber.cpp
src/transcription/whispercpptranscriber.h
)

if(MUTTERKEY_ENABLE_LEGACY_WHISPER)
list(APPEND MUTTERKEY_CORE_SOURCES
src/transcription/whispercpptranscriber.cpp
src/transcription/whispercpptranscriber.h
)
endif()

set(MUTTERKEY_CONTROL_SOURCES
src/control/daemoncontrolclient.cpp
src/control/daemoncontrolclient.h
@@ -114,6 +126,10 @@ set_target_properties(mutterkey-tray PROPERTIES
INSTALL_RPATH "$ORIGIN/../lib"
)

if(MUTTERKEY_ENABLE_LEGACY_WHISPER)
target_compile_definitions(mutterkey_core PRIVATE MUTTERKEY_WITH_LEGACY_WHISPER)
endif()

function(mutterkey_enable_sanitizers target_name)
if(NOT CMAKE_CXX_COMPILER_ID MATCHES "GNU|Clang|AppleClang")
message(WARNING "Sanitizers were requested, but ${CMAKE_CXX_COMPILER_ID} is not configured for repo-owned sanitizer flags")
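The `MUTTERKEY_WITH_LEGACY_WHISPER` definition added in this hunk is `PRIVATE` to `mutterkey_core`, so only sources compiled as part of that target see it. A minimal sketch of how C++ code might branch on it follows. The names `RuntimeChoice`, `RuntimeSelection`, `legacyWhisperCompiledIn`, and `selectRuntime` are hypothetical, and the fallback policy is an assumption made for illustration; the actual rules live in `src/transcription/runtimeselector.*`, which this diff does not show.

```cpp
#include <string>

// Hypothetical runtime identifiers for this sketch only.
enum class RuntimeChoice : unsigned char {
    NativeCpuReference,
    LegacyWhisper,
};

// Hypothetical selection result: the chosen runtime plus a diagnostic reason.
struct RuntimeSelection {
    RuntimeChoice choice;
    std::string reason;
};

// True only when the build defined MUTTERKEY_WITH_LEGACY_WHISPER, which the
// CMake hunk above does when MUTTERKEY_ENABLE_LEGACY_WHISPER is ON.
constexpr bool legacyWhisperCompiledIn()
{
#ifdef MUTTERKEY_WITH_LEGACY_WHISPER
    return true;
#else
    return false;
#endif
}

// Assumed policy: honor a legacy request when the runtime is compiled in,
// otherwise fall back to the native CPU reference runtime and say why.
inline RuntimeSelection selectRuntime(bool legacyRequested)
{
    if (legacyRequested && legacyWhisperCompiledIn()) {
        return {RuntimeChoice::LegacyWhisper, "legacy whisper runtime requested and compiled in"};
    }
    if (legacyRequested) {
        return {RuntimeChoice::NativeCpuReference,
                "legacy whisper runtime requested but not compiled into this build"};
    }
    return {RuntimeChoice::NativeCpuReference, "native CPU reference runtime selected"};
}
```

Confining the preprocessor check to one small function means the rest of the selection logic compiles identically in both build configurations, which makes an `OFF` build easier to test.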
@@ -203,47 +219,53 @@ else()
message(STATUS "Doxygen not found; the docs target will be unavailable")
endif()

if(NOT EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/third_party/whisper.cpp/CMakeLists.txt")
message(FATAL_ERROR "Vendored whisper.cpp dependency is missing from third_party/whisper.cpp")
endif()

set(WHISPER_BUILD_TESTS OFF CACHE BOOL "" FORCE)
set(WHISPER_BUILD_EXAMPLES OFF CACHE BOOL "" FORCE)
set(WHISPER_BUILD_SERVER OFF CACHE BOOL "" FORCE)
set(WHISPER_SANITIZE_ADDRESS ${MUTTERKEY_ENABLE_ASAN} CACHE BOOL "" FORCE)
set(WHISPER_SANITIZE_UNDEFINED ${MUTTERKEY_ENABLE_UBSAN} CACHE BOOL "" FORCE)
set(GGML_CUDA ${MUTTERKEY_ENABLE_WHISPER_CUDA} CACHE BOOL "" FORCE)
set(GGML_VULKAN ${MUTTERKEY_ENABLE_WHISPER_VULKAN} CACHE BOOL "" FORCE)
set(GGML_BLAS ${MUTTERKEY_ENABLE_WHISPER_BLAS} CACHE BOOL "" FORCE)
set(GGML_BLAS_VENDOR ${MUTTERKEY_WHISPER_BLAS_VENDOR} CACHE STRING "" FORCE)
add_subdirectory(third_party/whisper.cpp EXCLUDE_FROM_ALL)

# Mutterkey ships the vendored shared libraries, but it does not install their
# upstream public headers as part of its own package layout.
set_target_properties(whisper ggml PROPERTIES PUBLIC_HEADER "")
if(MUTTERKEY_ENABLE_LEGACY_WHISPER)
if(NOT EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/third_party/whisper.cpp/CMakeLists.txt")
message(FATAL_ERROR "Vendored whisper.cpp dependency is missing from third_party/whisper.cpp")
endif()

target_link_libraries(mutterkey_core PRIVATE whisper)
set(WHISPER_BUILD_TESTS OFF CACHE BOOL "" FORCE)
set(WHISPER_BUILD_EXAMPLES OFF CACHE BOOL "" FORCE)
set(WHISPER_BUILD_SERVER OFF CACHE BOOL "" FORCE)
set(WHISPER_SANITIZE_ADDRESS ${MUTTERKEY_ENABLE_ASAN} CACHE BOOL "" FORCE)
set(WHISPER_SANITIZE_UNDEFINED ${MUTTERKEY_ENABLE_UBSAN} CACHE BOOL "" FORCE)
set(GGML_CUDA ${MUTTERKEY_ENABLE_WHISPER_CUDA} CACHE BOOL "" FORCE)
set(GGML_VULKAN ${MUTTERKEY_ENABLE_WHISPER_VULKAN} CACHE BOOL "" FORCE)
set(GGML_BLAS ${MUTTERKEY_ENABLE_WHISPER_BLAS} CACHE BOOL "" FORCE)
set(GGML_BLAS_VENDOR ${MUTTERKEY_WHISPER_BLAS_VENDOR} CACHE STRING "" FORCE)
add_subdirectory(third_party/whisper.cpp EXCLUDE_FROM_ALL)

# Mutterkey ships the vendored shared libraries, but it does not install their
# upstream public headers as part of its own package layout.
set_target_properties(whisper ggml PROPERTIES PUBLIC_HEADER "")

target_link_libraries(mutterkey_core PRIVATE whisper)
endif()

install(TARGETS mutterkey RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR})
install(TARGETS mutterkey-tray RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR})
install(TARGETS whisper ggml ggml-base
LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
)
if(TARGET ggml-cpu)
install(TARGETS ggml-cpu LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
endif()
if(TARGET ggml-cuda)
install(TARGETS ggml-cuda LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
endif()
if(TARGET ggml-vulkan)
install(TARGETS ggml-vulkan LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
endif()
if(TARGET ggml-blas)
install(TARGETS ggml-blas LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
if(MUTTERKEY_ENABLE_LEGACY_WHISPER)
install(TARGETS whisper ggml ggml-base
LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
)
if(TARGET ggml-cpu)
install(TARGETS ggml-cpu LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
endif()
if(TARGET ggml-cuda)
install(TARGETS ggml-cuda LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
endif()
if(TARGET ggml-vulkan)
install(TARGETS ggml-vulkan LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
endif()
if(TARGET ggml-blas)
install(TARGETS ggml-blas LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
endif()
endif()
install(FILES contrib/org.mutterkey.mutterkey.desktop DESTINATION ${CMAKE_INSTALL_DATADIR}/applications)
install(FILES LICENSE THIRD_PARTY_NOTICES.md DESTINATION ${MUTTERKEY_LICENSE_INSTALL_DIR})
install(FILES third_party/whisper.cpp/LICENSE DESTINATION ${MUTTERKEY_LICENSE_INSTALL_DIR}/third_party/whisper.cpp)
if(MUTTERKEY_ENABLE_LEGACY_WHISPER)
install(FILES third_party/whisper.cpp/LICENSE DESTINATION ${MUTTERKEY_LICENSE_INSTALL_DIR}/third_party/whisper.cpp)
endif()

if(BUILD_TESTING)
find_package(Qt6 REQUIRED COMPONENTS Test)
5 changes: 5 additions & 0 deletions README.md
@@ -131,6 +131,10 @@ This installs:
- `~/.local/lib/libwhisper.so*` and the required `ggml` libraries
- `~/.local/share/applications/org.mutterkey.mutterkey.desktop`

If you configure with `-DMUTTERKEY_ENABLE_LEGACY_WHISPER=OFF`, Mutterkey builds
without the vendored `whisper.cpp` runtime and does not install the legacy
`libwhisper` / `ggml` shared libraries.

Optional acceleration flags:

```bash
@@ -164,6 +168,7 @@ Notes:
- `MUTTERKEY_ENABLE_WHISPER_VULKAN=ON` is for Vulkan-capable GPUs and requires Vulkan development headers and loader libraries
- `MUTTERKEY_ENABLE_WHISPER_BLAS=ON` improves CPU inference speed rather than enabling GPU execution
- these options are forwarded to the vendored `whisper.cpp` / `ggml` build and install any resulting backend libraries alongside Mutterkey
- `-DMUTTERKEY_ENABLE_LEGACY_WHISPER=OFF` disables the vendored runtime entirely and skips all `whisper.cpp` / `ggml` install targets

### 2. Put a model on disk
