Parakeet TDT v2 implementation with chunking #1
Merged
33 commits
- `889675f` got basic eddy form running (Alex-Wengg)
- `0c01b95` integrated the chunking algo (Alex-Wengg)
- `fb9342e` download in cpp of the dataset (Alex-Wengg)
- `7218d01` streaming algo service (Alex-Wengg)
- `7eb3e4b` the inner tdt skip blank loop (Alex-Wengg)
- `c3ce93b` add normalizer (Alex-Wengg)
- `d18481e` text normatlization (Alex-Wengg)
- `60aa7cc` support NPU now (Alex-Wengg)
- `b782564` download hg models with cpp (Alex-Wengg)
- `66cb78b` parakeet: fix windowed mel; port FluidAudio dedup; type‑safety; chunk… (Alex-Wengg)
- `74815b8` first comment address (Alex-Wengg)
- `f184672` 2nd phase of comments addressc:wq (Alex-Wengg)
- `6183311` address commenets and breakup the openvino parakeet file c (Alex-Wengg)
- `209e4df` 3 phase of changes (Alex-Wengg)
- `b6f2c64` 4th phase of comments addressed (Alex-Wengg)
- `755ffa6` update benchmark pipeline (Alex-Wengg)
- `e325f81` updated to support fluidinference repo model download instead (Alex-Wengg)
- `8616c66` Brandon's minor nit edits (BrandonWeng)
- `8a5ae4a` mel spec on CPU (BrandonWeng)
- `6339328` update benchmark.py (Alex-Wengg)
- `a339223` Replace dr_wav with libsndfile + libsamplerate; mel spec on CPU (Alex-Wengg)
- `45ddb87` Refactor tokenizer, remove auto-fetch, simplify preprocessor (Alex-Wengg)
- `5447738` Document why parakeet_openvino_impl.hpp is in src/ not include/ (Alex-Wengg)
- `26744c5` Clarify preprocessor comments: no dynamic input, just short vs long a… (Alex-Wengg)
- `8a70eca` Add spacing and comments to parakeet_decoder.cpp for readability (Alex-Wengg)
- `9478f3f` Refactor: address all PR code review feedback (Alex-Wengg)
- `7824170` Fix preprocessor dynamic shape handling to restore 2% WER (Alex-Wengg)
- `ebe7d06` Refactor: clean up error handling and improve code spacing (Alex-Wengg)
- `335def4` 3rd round of code comments to address (Alex-Wengg)
- `0241967` addressing cpp code risks (Alex-Wengg)
- `544e3f9` minor comments address (Alex-Wengg)
- `ef961f4` minor comments address (Alex-Wengg)
- `10f5cd3` improve downloading models for users (Alex-Wengg)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,83 +1,54 @@ | ||
| # Prerequisites | ||
| *.d | ||
| | ||
| # Compiled Object files | ||
| *.slo | ||
| *.lo | ||
| *.o | ||
| *.obj | ||
| | ||
| # Precompiled Headers | ||
| *.gch | ||
| *.pch | ||
| | ||
| # Linker files | ||
| *.ilk | ||
| | ||
| # Debugger Files | ||
| *.pdb | ||
| | ||
| # Compiled Dynamic libraries | ||
| *.so | ||
| *.dylib | ||
| *.dll | ||
| | ||
| # Fortran module files | ||
| *.mod | ||
| *.smod | ||
| | ||
| # Compiled Static libraries | ||
| *.lai | ||
| *.la | ||
| *.a | ||
| *.lib | ||
| | ||
| # Executables | ||
| *.exe | ||
| *.out | ||
| *.app | ||
| | ||
| # debug information files | ||
| *.dwo | ||
| | ||
| # Build directories | ||
| build/ | ||
| cmake-build-*/ | ||
| out/ | ||
| | ||
| # Model files and cache | ||
| models/ | ||
| cache/ | ||
| *.blob | ||
| # Model files (too large for git) | ||
| models/**/*.bin | ||
| models/**/*.blob | ||
| | ||
| # Test audio (can be large) | ||
| test_audio/ | ||
| | ||
| # IDE | ||
| .vscode/ | ||
| .vs/ | ||
| .idea/ | ||
| *.swp | ||
| *.swo | ||
| .claude/ | ||
| *.suo | ||
| *.user | ||
| | ||
| # Python | ||
| __pycache__/ | ||
| *.pyc | ||
| .venv/ | ||
| venv/ | ||
| # Compiled | ||
| *.o | ||
| *.obj | ||
| *.so | ||
| *.dll | ||
| *.dylib | ||
| *.a | ||
| *.lib | ||
| | ||
| # Node.js | ||
| node_modules/ | ||
| dist/ | ||
| *.tsbuildinfo | ||
| # Cache | ||
| .cache/ | ||
| *.cache | ||
| cache/ | ||
| | ||
| # .NET | ||
| bin/ | ||
| obj/ | ||
| *.user | ||
| *.suo | ||
| # Build artifacts | ||
| **/obj/ | ||
| **/bin/Debug/ | ||
| **/bin/Release/ | ||
| | ||
| # OS | ||
| .DS_Store | ||
| Thumbs.db | ||
| # IDE/Editor specific | ||
| .claude/ | ||
| *.sln | ||
| *.vcxproj | ||
| *.vcxproj.filters | ||
| *.vcxproj.user | ||
| | ||
| # Test files | ||
| test_*.wav | ||
| test_*.py | ||
| *_test.cpp | ||
| | ||
| # Large archives | ||
| *.zip | ||
| openvino_*/ | ||
| | ||
| # Temporary scripts | ||
| download_*.bat | ||
| download_*.sh | ||
| run_*.bat |
@@ -0,0 +1,58 @@

# Repository Guidelines

## Project Structure & Module Organization

- Source in `src/` (core, backends, `models/parakeet/`, pipelines, streaming); public headers in `include/eddy/`.
- Examples in `examples/cpp/` (`parakeet_cli.cpp`, `benchmark_librispeech.cpp`, optional `whisper_example.cpp`).
- Scripts in `scripts/` (model fetch); docs in `docs/` (e.g., `docs/Benchmark-Troubleshooting.md`).
- Assets: sample WAVs in repo root; models live under per-user app data, not in git.

## Build, Test, and Development Commands

- Configure: `cmake -B build -S . -DCMAKE_BUILD_TYPE=Release -DEDDY_ENABLE_OPENVINO=ON`
- Build tools: `cmake --build build --config Release --target eddy parakeet_cli benchmark_librispeech hf_fetch_models`
- Run CLI (NPU): `build\examples\cpp\Release\parakeet_cli.exe "<path-to-wav>" --device NPU`
- Benchmark (NPU): `build\examples\cpp\Release\benchmark_librispeech.exe --max-files 50 --device NPU`
- Tests (optional): configure with `-DBUILD_TESTING=ON`, then `ctest --test-dir build`
- Whisper (optional): add `-DEDDY_ENABLE_WHISPER=ON` and set `OpenVINOGenAI_DIR` if not auto-discovered.
- Python: always use `uv` commands, never use `pip` directly.

## Coding Style & Naming Conventions

- C++20; 2-space indentation; braces on the same line.
- Types/classes: PascalCase. Functions/variables/files: snake_case (e.g., `parakeet_openvino.cpp`).
- Keep headers under `include/eddy/...` mirrored by sources under `src/...`.
- Prefer small, focused functions; avoid inline comments unless clarifying non-obvious logic.

## Testing Guidelines

- Place unit/integration tests under `tests/` (enable with `-DBUILD_TESTING=ON`); name files `<area>_test.cpp`.
- Manual checks: use `parakeet_cli` and `benchmark_librispeech` with short WAVs and limited file counts.
- Cover new logic; document any gaps in the PR.

## Commit & Pull Request Guidelines

- Commits: imperative subject with optional scope (e.g., `parakeet: fix encoder port selection`).
- Keep changes focused; include rationale and before/after behavior.
- PRs should include summary, reproduction/validation steps, logs or screenshots, target device (CPU/NPU), and linked issues.

## Agent-Specific Instructions (Parakeet/OpenVINO)

- After editing `src/models/parakeet/` or related headers, clear compiled caches but keep downloaded models (Windows: `%LOCALAPPDATA%\eddy\models\parakeet-v2`, preserve `files/`). Quick PowerShell:

```powershell
$b = "$env:LOCALAPPDATA\eddy\models\parakeet-v2";
if (Test-Path $b) {
  Get-ChildItem $b -File | Remove-Item -Force
  Get-ChildItem $b -Directory | ? { $_.Name -ne 'files' } | Remove-Item -Recurse -Force
}
```

- Rebuild: `cmake --build build --config Release --target eddy parakeet_cli benchmark_librispeech hf_fetch_models`. First run after a cache clear will recompile models and may take minutes.

## Configuration & Models

- OpenVINO env: use `run_bench_npu.bat` to preload. If needed, set `OpenVINO_DIR` (e.g., `C:\Program Files (x86)\Intel\openvino_2025.0.0\runtime\cmake`).
- GenAI (Whisper): set `OpenVINOGenAI_DIR` when `EDDY_ENABLE_WHISPER=ON`.
- Download models: Models auto-download on first run via `hf_fetch_models`. Manual download: run `hf_fetch_models.exe` or visit <https://huggingface.co/FluidInference/parakeet-tdt-0.6b-v2-ov> (downloads into `%LOCALAPPDATA%\eddy\models\parakeet-v2\files`).
- Runtime knobs: `EDDY_OV_PERF`, `EDDY_OV_NUM_REQUESTS`, `EDDY_OV_THREADS`, `EDDY_CONTEXT_FRAMES`, `EDDY_BOUNDARY_SEARCH_FRAMES`, `EDDY_DISABLE_HOLDBACK=1`, `EDDY_DEDUP_PREV_TOKENS` (default 15).
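
The runtime knobs listed above are environment variables, and the guidelines give a default of 15 for `EDDY_DEDUP_PREV_TOKENS`. The diff shown here does not include the code that reads them, so the following is only a minimal sketch of how an integer knob with a fallback could be read in C++. The helper names `parse_int_or`, `env_int_or`, and `dedup_prev_tokens` are hypothetical and are not taken from the eddy sources; only the variable name and its default come from the guidelines.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper (not from the eddy sources): parse a decimal integer,
// returning `fallback` when the text is missing, malformed, or out of range.
int parse_int_or(const char* text, int fallback) {
  if (text == nullptr || *text == '\0') return fallback;
  errno = 0;
  char* end = nullptr;
  const long value = std::strtol(text, &end, 10);
  if (end == text || *end != '\0' || errno == ERANGE) return fallback;
  if (value < INT_MIN || value > INT_MAX) return fallback;
  return static_cast<int>(value);
}

// Hypothetical helper: look up an integer knob in the process environment.
int env_int_or(const char* name, int fallback) {
  assert(name != nullptr);
  return parse_int_or(std::getenv(name), fallback);
}

// Example read of one knob; 15 is the default stated in the guidelines.
int dedup_prev_tokens() {
  return env_int_or("EDDY_DEDUP_PREV_TOKENS", 15);
}
```

Falling back silently on malformed input keeps a mistyped value from aborting a benchmark run; a real implementation might prefer to log a warning instead, which is a design choice this sketch leaves open.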
According to GPT, precompiling the libraries and `.d` files is preferred over having users manually install deps for C++ SDKs, but do push back here.

I gave it a try and it created around 25 extra files. Maybe not in this PR.