Conversation

@Alex-Wengg (Contributor)

No description provided.

Alex-Wengg marked this pull request as ready for review November 3, 2025 22:42
claude bot commented Nov 3, 2025

Pull Request Review: README Update

Thank you for this comprehensive README overhaul! This is a significant improvement that transforms the documentation from a work-in-progress outline into production-ready documentation. Below is my detailed review:


✅ Strengths

1. Excellent Structure & Organization

  • Clear hierarchy from quick start → detailed usage → architecture → troubleshooting
  • Badge integration (Discord, GitHub stars) adds credibility and community visibility
  • Well-organized sections with proper markdown formatting
  • Easy navigation via the clear heading structure

2. Strong Technical Content

  • Comprehensive quick start guide with concrete commands
  • Detailed model comparison table with WER/RTFx metrics
  • Clear performance benchmarks with device comparisons
  • Architecture explanation is accessible yet technically accurate

3. User-Focused Documentation

  • Multiple entry points for different user needs (quick start, installation, usage)
  • Concrete examples with actual commands users can copy-paste
  • Clear hardware requirements and platform support
  • Troubleshooting section addresses common issues

⚠️ Issues & Recommendations

Critical Issues

1. Missing benchmark_librispeech Executable (High Priority)

The README references benchmark_librispeech.exe in multiple places:

  • Line 48: Quick Start usage example
  • Line 79: Benchmarks section

Issue: This file does not exist in the repository. I only found:

  • benchmark_fleurs.cpp
  • benchmark_fleurs.py
  • parakeet_cli.cpp

Recommendation: Either:

  • Remove references to benchmark_librispeech.exe
  • Update to reference the actual benchmark tools available
  • Add the missing LibriSpeech benchmark implementation

Locations to fix:

# Line 48 - Update to actual available benchmark
build/examples/cpp/Release/benchmark_fleurs.exe --max-files 100 --device NPU

# Line 79 - Remove or update this benchmark claim
**Parakeet V2 - LibriSpeech test-clean (English)**
- WER: 2.76% | CER: 1.13% | RTFx: 41.36× (Intel NPU)
- Tested on: 100 samples, Intel Core Ultra 7 258V

2. Inconsistent WER Claims (High Priority)

The README shows conflicting WER numbers:

  • Line 60: "2.76% WER" (in model table)
  • Line 79: "2.76% WER" (LibriSpeech benchmark)
  • BUT claude.md line 28: "1.27% WER (better than FluidAudio's 2.2% baseline)"

Issue: Which WER is correct? This is a critical accuracy metric.

Recommendation: Verify the actual benchmark results and use consistent numbers throughout all documentation. Update both README.md and claude.md to match.

3. Missing Referenced Files (Medium Priority)

The README references several files that don't exist:

  • Line 106: FLEURS_BENCHMARK.md - File not found
  • Line 107: PARAKEET_V2_MODEL_CARD.md - File not found
  • Line 108: README_V3_HUGGINGFACE.md - File not found

Recommendation: Either create these files or remove the references.
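
One quick way to catch broken references going forward, sketched under the assumption that the docs use standard relative markdown links from the repository root (the regex and file name are illustrative):

```python
# check_readme_links.py - report relative markdown link targets that do not exist.
# Sketch only; run from the repository root.
import re
from pathlib import Path

LINK = re.compile(r"\[[^\]]+\]\(([^)#\s]+)\)")

for target in LINK.findall(Path("README.md").read_text(encoding="utf-8")):
    if target.startswith(("http://", "https://", "mailto:")):
        continue  # only local references matter here
    if not Path(target).exists():
        print(f"Missing referenced file: {target}")
```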

4. Documentation Inconsistency (Medium Priority)

The README focuses entirely on Parakeet v2/v3, but:

  • getting-started.md only covers Whisper setup
  • performance-benchmarks.md only covers Whisper benchmarks
  • No Parakeet-specific getting started guide exists

Issue: Users following the README will be confused when they try to reference the linked documentation.

Recommendation: Either:

  • Update getting-started.md to cover Parakeet setup
  • Create getting-started-parakeet.md and getting-started-whisper.md
  • Clearly indicate in the README which model families are covered by which docs

Minor Issues

5. Device Option Discrepancy (Low Priority)

  • Line 24 in parakeet_cli.cpp: Only shows "CPU, AUTO" as options
  • Line 41 in README: Claims NPU support with --device NPU

Issue: The CLI help text doesn't mention NPU, but README examples use it.

Recommendation: Update the CLI help text to include NPU option, or clarify device selection behavior (e.g., "AUTO may select NPU").

6. Python API Marked "Coming Soon" (Low Priority)

Lines 169-179 show Python API examples marked "Coming Soon".

Recommendation: Since this is aspirational, consider:

  • Moving to "Roadmap" section instead
  • Adding timeline estimate
  • Or clarifying current C API status

7. Emoji Usage (Style)

Line 61: Uses flag emojis for languages (🇬🇧 🇪🇸 🇮🇹...)

Consideration: Flag emojis can:

  • Cause rendering issues in some terminals
  • Be controversial (e.g., language ≠ country)
  • Bloat file size slightly

Not a blocker, but consider plain text alternative: "EN, ES, IT, FR..."

8. C++ API Example May Be Outdated (Low Priority)

Lines 186-197 show C++ API usage:

auto asr = eddy::ParakeetASR::create("parakeet-v3", "NPU");

Recommendation: Verify this API actually exists in include/eddy/parakeet_inference.h (file not checked in this review). If not yet implemented, mark as "Coming Soon" like Python API.

9. Roadmap Completion Status (Low Priority)

Line 246: "⏳ Python bindings (C API complete, wrapper in progress)"

Issue: Conflicts with line 169 where Python API is marked "Coming Soon" with no mention of partial completion.

Recommendation: Align the messaging - either both say "in progress" or both say "coming soon".


📋 Style & Best Practices

Positive

  • ✅ Good use of code blocks with language hints
  • ✅ Clear emoji indicators for checklist items (✅ ⏳ 🔮)
  • ✅ Tables are well-formatted and readable
  • ✅ Commands use absolute paths for clarity
  • ✅ Consistent markdown formatting

Suggestions

  • Consider adding a "Table of Contents" at the top for easier navigation
  • The README is quite long (~249 lines); consider splitting into:
    • README.md (overview + quick start)
    • INSTALLATION.md (detailed setup)
    • USAGE.md (examples and API)
    • BENCHMARKS.md (performance data)
  • Add dates to benchmark results for transparency ("as of March 2025")
  • Consider adding a "Quick Links" section at the top

🔒 Security Considerations

No security issues identified. The README:

  • ✅ Doesn't contain hardcoded credentials
  • ✅ Links to official sources (HuggingFace, GitHub)
  • ✅ Uses HTTPS for all URLs
  • ✅ Doesn't suggest running untrusted code

📊 Summary

| Category | Status | Count |
|---|---|---|
| Critical Issues | 🔴 | 4 |
| Medium Issues | 🟡 | 0 (included in critical) |
| Minor Issues | 🟢 | 5 |
| Style Suggestions | 💡 | Multiple |

✍️ Recommended Actions

Before Merge (Blockers)

  1. ✅ Verify and fix benchmark executable references (benchmark_librispeech)
  2. ✅ Reconcile WER discrepancies (2.76% vs 1.27%)
  3. ✅ Create missing referenced files OR remove references:
    • FLEURS_BENCHMARK.md
    • PARAKEET_V2_MODEL_CARD.md
    • README_V3_HUGGINGFACE.md
  4. ✅ Align documentation strategy (Whisper vs Parakeet guides)

After Merge (Follow-up)

  1. Update getting-started.md for Parakeet
  2. Add dates to benchmark claims
  3. Consider splitting README into multiple focused docs
  4. Update CLI help text to match README claims

🎯 Final Verdict

This is excellent work that significantly improves the project's documentation. The content is comprehensive, well-organized, and user-focused. However, there are critical accuracy issues (missing executables, inconsistent WER numbers, broken file references) that must be addressed before merging to avoid user confusion.

Recommendation: Request changes to fix critical issues, then approve.


Great job on this documentation overhaul! Once the critical issues are resolved, this will be a fantastic resource for users. Let me know if you need help creating any of the missing files or reconciling the benchmark numbers. 🚀

@BrandonWeng (Member)

there's a lot of duplicate information, like the WER numbers, the "fast" English benchmark, and the description, that you should try to clean up

claude bot commented Nov 4, 2025

PR Review: README Update

Great work on transforming the README from work-in-progress to production-ready documentation! This is a significant improvement that makes the project much more accessible.

✅ Strengths

1. Excellent Structure & Organization

  • Clear progression from quick start → models → performance → architecture
  • Effective use of collapsible sections to reduce visual clutter
  • Well-organized navigation with linked sections

2. User-Focused Content

  • Python Quick Start section is perfect for lowering the barrier to entry
  • Multiple usage examples (CLI, Python, C++) cater to different audiences
  • Troubleshooting section anticipates common issues

3. Comprehensive Technical Details

  • Performance metrics (RTFx, WER) with clear context
  • Architecture explanation helps users understand the system
  • Hardware requirements and optimization guidance

4. Professional Presentation

  • Badges for Discord and GitHub stars add credibility
  • Proper citations and acknowledgments
  • Clear licensing information

claude bot commented Nov 4, 2025

⚠️ Issues & Concerns - Part 1: Technical Accuracy

1. Platform-specific paths (Lines 67-68, 179, etc.)

Uses .exe extension throughout, but claims Linux support. This will confuse Linux users.

build/examples/cpp/Release/parakeet_cli.exe audio.wav --model parakeet-v2 --device NPU

Recommendation: Show both platforms or use platform-agnostic examples

2. Performance claim inconsistency (Line 95)

README claims 40-45× RTFx on NPU, but claude.md says "5-8x RTF on CPU" with no NPU metrics. Is the 40-45× claim verified?

3. Path inconsistency (Lines 172-175)

README says eddy\models\ but claude.md says eddy\cache\models. Which is correct?

4. Windows-only command (Line 177)

set EDDY_DISABLE_AUTO_FETCH=1 is Windows cmd syntax. Linux users need export EDDY_DISABLE_AUTO_FETCH=1

claude bot commented Nov 4, 2025

⚠️ Issues & Concerns - Part 2: Incomplete/Misleading Content

1. Missing Python example context (Line 37)

References benchmark_fleurs.py but does not explain where this file is or if it exists in the repo.

2. Conflicting Python API status (Lines 37 vs 181-193)

Quick Start shows Python usage but Python API section is marked "Coming Soon". This creates confusion about what is actually available.

Recommendation: Clarify what works now vs what is planned. If Python bindings exist via ctypes, show the real API. If they do not exist, remove the Quick Start Python example or mark it as planned.
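
For illustration only, a minimal ctypes-style sketch of what a thin Python wrapper over a C API could look like; the library file name and the `eddy_transcribe` signature below are hypothetical, not the actual eddy C API:

```python
# Hypothetical sketch - function name, signature, and library file are assumptions.
import ctypes
import sys

libname = "eddy.dll" if sys.platform == "win32" else "libeddy.so"
lib = ctypes.CDLL(libname)

# Assumed C signature: const char* eddy_transcribe(const char* audio_path, const char* device);
lib.eddy_transcribe.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
lib.eddy_transcribe.restype = ctypes.c_char_p

def transcribe(audio_path: str, device: str = "NPU") -> str:
    return lib.eddy_transcribe(audio_path.encode(), device.encode()).decode()
```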

3. Misleading roadmap status (Line 218)

Says "C API complete" but there is no indication of where this C API is documented or how to use it.

4. Unverified file references

The README references several files that should be verified to exist:

  • FLEURS_BENCHMARK.md (Line 101)
  • PARAKEET_V2_MODEL_CARD.md (Line 125)
  • README_V3_HUGGINGFACE.md (Line 127)
  • documentation/ directory (Line 287) - old README said docs/

claude bot commented Nov 4, 2025

🔧 Suggestions for Improvement

1. Add System Requirements section

Before Quick Start, add explicit requirements:

  • Windows 10/11 or Linux (Ubuntu 20.04+)
  • 4GB RAM minimum, 8GB recommended
  • Intel Core Ultra CPU for NPU support (optional)

2. Improve Linux support visibility

The documentation feels Windows-first. Consider:

  • Using platform-agnostic examples where possible
  • Adding a Platform Notes section
  • Testing all commands on both platforms

3. Be more specific with performance claims (Line 10)

Change "Up to 45× real-time on NPU" to "40-45× real-time on Intel Core Ultra NPU, 5-8× on CPU" to set proper expectations.

4. Add Getting Help section

Currently help information is buried. Add a prominent section near the top with Discord, GitHub Issues, and Documentation links.

5. Verify all performance claims before merge

Ensure all RTFx numbers, WER percentages, and hardware requirements are tested and reproducible.

claude bot commented Nov 4, 2025

🎯 Action Items

Must Fix Before Merge:

  1. ✅ Resolve path inconsistency: eddy\models\ vs eddy\cache\models\
  2. ✅ Fix platform-specific commands (.exe, set vs export)
  3. ✅ Clarify Python API status (available now vs coming soon)
  4. ✅ Verify all file references actually exist
  5. ✅ Verify performance claims (40-45× RTFx on NPU)

Should Fix:

  6. Add platform-specific command variants (Windows/Linux)
  7. Document the C API if it is "complete"
  8. Add system requirements section
  9. Verify and update all linked documentation files

Nice to Have:

  10. Add screenshots or GIFs of CLI usage
  11. Add FAQ section
  12. Add Quick Links at the top for common tasks

📊 Overall Assessment

Quality: ⭐⭐⭐⭐ (4/5)
Completeness: ⭐⭐⭐ (3/5) - Missing Linux-specific details
Accuracy: ⚠️ Needs verification - Several inconsistencies with claude.md

This is a significant improvement over the WIP README, but needs some technical accuracy fixes before merge. The structure and content are excellent, but the inconsistencies between this README and the project internal guidelines (claude.md) suggest some claims may need verification.

Recommendation: Request changes to fix path inconsistencies, platform-specific commands, and verify performance claims before merging.

README.md Outdated
# eddy

eddy is a C++ inference library designed for native runtimes and multi-vendor edge NPUs, exposing a consistent C++ API plus language bindings for app developers (C#; more to follow). The current milestone focuses on the OpenVINO 2025.x backend for the Parakeet-TDT speech model family while we bring additional runtimes online.
**eddy** is a high-performance audio-based AI library for Windows and Linux optimized for Intel NPUs and CPUs; eddy powers private, offline automatic speech recognition (ASR) with support for multiple hardware accelerators planned.
Contributor Author

iirc the encoders for ASR had to be finetuned for either GPU or NPU. mutually exclusive.

Member

be careful with the use of the word “fine tuned” in ML. it’s very overloaded. usually it means you’re doing additional training on the model

claude bot commented Nov 4, 2025

PR Review: Update README

Summary

This PR significantly improves the project documentation by transforming the README from a work-in-progress development guide into a comprehensive, production-ready document. It also adds detailed benchmark results in a separate file.


Strengths

Excellent Documentation Quality

  • Clear, structured content with proper sections for different audiences
  • Professional presentation with badges, quick start guides, and troubleshooting
  • Strong emphasis on key features (private, multilingual, cross-platform)
  • Good use of collapsible sections

Comprehensive Benchmark Data

  • Detailed performance metrics across multiple datasets (LibriSpeech, FLEURS)
  • Transparent comparison with reference implementations
  • Clear methodology documentation

Critical Issues

Inconsistent Benchmark Claims

  • README claims WER 2.76% for Parakeet V2
  • BENCHMARK_RESULTS.md shows Average WER 2.87% for Parakeet V2
  • Action: Update README to match actual benchmark result (2.87%)

V3 English WER Discrepancy

  • README claims WER 6.09% English for V3
  • BENCHMARK_RESULTS.md shows Average WER 3.7% for V3 on LibriSpeech
  • BENCHMARK_RESULTS.md shows English (US) 6.1% on FLEURS
  • Clarification needed: Specify which dataset when stating V3 English WER

Outdated claude.md Reference

  • claude.md shows 1.27% WER on LibriSpeech
  • New benchmark shows 2.87% WER for V2
  • Action: Update claude.md to reflect current benchmark results

High Priority

Missing CPU Benchmark Data

  • README states 5-8x RTFx (CPU) but BENCHMARK_RESULTS.md lacks CPU benchmarks
  • Recommendation: Add CPU benchmark data or mark as estimated

FluidAudio Comparison Needs Clarification

  • README claims FluidAudio v2 has 2.2% WER
  • claude.md indicates 2.2% is the baseline and current achieves 1.27%
  • Action: Clarify whether 2.2% is FluidAudio actual result or outdated

Medium Priority

  • Installation path inconsistencies between README and claude.md
  • Python binding status unclear (marked in progress but examples work)
  • Consider adding RTFx definition earlier in document

Security & Testing

No Security Concerns

  • Documentation-only changes
  • Model download from trusted HuggingFace repo

Test Coverage

  • N/A for documentation changes
  • Recommendation: Add CI benchmark validation to prevent documentation drift
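
One lightweight form this could take, sketched under the file layout discussed in this PR (README.md and BENCHMARK_RESULTS.md at the repo root; the regex is illustrative): a CI step that fails when the first WER figure quoted in the two files drifts apart.

```python
# doc_drift_check.py - sketch of a CI guard against documentation drift.
import re
import sys
from pathlib import Path

WER = re.compile(r"WER[^\d%]*(\d+\.\d+)\s*%")

def first_wer(path: str) -> float:
    match = WER.search(Path(path).read_text(encoding="utf-8"))
    if match is None:
        sys.exit(f"No WER value found in {path}")
    return float(match.group(1))

readme, results = first_wer("README.md"), first_wer("BENCHMARK_RESULTS.md")
if abs(readme - results) > 0.05:
    sys.exit(f"WER drift: README.md says {readme}%, BENCHMARK_RESULTS.md says {results}%")
print("WER figures match")
```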

Action Items Before Merge

  1. Reconcile WER discrepancies (2.76% vs 2.87% for V2)
  2. Clarify V3 English WER (specify dataset)
  3. Update claude.md with current benchmark results
  4. Verify and standardize model cache paths
  5. Add CPU benchmark data or mark as estimated
  6. Clarify FluidAudio comparison data

Conclusion

Strong documentation update that improves project accessibility and professionalism. Main issues are data consistency problems. Once WER discrepancies are resolved, this will be ready to merge.

Recommendation: Request changes to address critical WER inconsistencies, then approve.

Overall Assessment: Approve after corrections - Great work, needs data verification.

README.md Outdated
**Usage:**

```bash
# Transcribe with Parakeet V2
```
Member

i just realized this… btw i’m pretty sure this inspection only works for windows.

to test linux you probably have to run in the “wsl 2” environment.

this also reminds me, I actually forget if intel even releases npu drivers for linux lol. can you double check

Contributor Author

verified, i also added a note that we haven't tested linux yet but it should be supported


README.md Outdated
**Apache 2.0** - See [LICENSE](LICENSE) for details.

To disable auto-download: set `EDDY_DISABLE_AUTO_FETCH=1`
Third-party model licenses may vary. See [THIRDPARTY_LICENSES](THIRDPARTY_LICENSES.md) for details on Parakeet TDT models (CC-BY-4.0) and other dependencies.
Member

it should be a folder, see fluid audio. also i don’t think you added the file

README.md Outdated
## Links

Refer to `docs/parakeet_openvino.md` for instructions on pulling the exported model from Hugging Face and running a smoke test through the new `OpenVINOParakeet` wrapper.
- **Discord**: [Join Community](https://discord.gg/WNsvaCtmDe)
Member

duplicate?

claude bot commented Nov 4, 2025

PR Review: Update README and Add Documentation

Summary

This PR significantly enhances the project documentation by transforming the README from a work-in-progress technical document into a comprehensive, user-friendly guide. It also adds two new documentation files: BENCHMARK_RESULTS.md and ThirdPartyLicenses/README.md.


✅ Strengths

1. Excellent Documentation Quality

  • Clear, structured, and user-focused README that effectively communicates the project's value proposition
  • Professional presentation with badges, collapsible sections, and well-organized content
  • Comprehensive coverage from quick start to troubleshooting

2. Strong Benchmark Documentation

  • Detailed BENCHMARK_RESULTS.md with comprehensive metrics (WER, CER, RTFx) across multiple datasets
  • 24-language FLEURS benchmark results with reference comparisons
  • Clear methodology section explaining the approach

3. Proper License Attribution

  • Excellent addition of ThirdPartyLicenses/README.md
  • Comprehensive coverage of all dependencies with correct license types
  • Proper attribution for models (CC-BY-4.0), datasets, and libraries

4. User Experience Focus

  • Multiple entry points for different user types (Python users, C++ developers)
  • Platform-specific instructions (Windows vs Linux)
  • Clear quick start examples with expected outputs

🔍 Issues & Concerns

1. Data Inconsistencies ⚠️

Critical Issue - README vs BENCHMARK_RESULTS.md discrepancy:

README.md (line 101) states:

- **WER**: 2.76% (LibriSpeech test-clean)

BENCHMARK_RESULTS.md (line 19) shows:

| **Average WER** | 2.87% |

Additional inconsistencies:

README.md (line 108) states:

- **WER**: 6.09% English, 17.0% average across all languages (FLEURS)

BENCHMARK_RESULTS.md (line 63) shows:

| **English (US)** | 6.1% | 4.9% | 3.0% | 41.9× | 350 |

While 6.09% vs 6.1% is minor, the V2 WER discrepancy (2.76% vs 2.87%) is significant and needs to be resolved.

Recommendation: Verify which numbers are correct and ensure consistency across all documentation. The difference between 2.76% and 2.87% is ~4% relative error, which matters for benchmark claims.


2. RTFx Performance Claims ⚠️

README.md claims:

- **Speed**: 38× RTFx (NPU), 5-8× RTFx (CPU)  # V2
- **Speed**: 41× RTFx (NPU), 5-8× RTFx (CPU)  # V3

BENCHMARK_RESULTS.md shows:

- V2 NPU: 37.8× RTFx
- V3 NPU: 25.7× RTFx (LibriSpeech) vs 41.1× RTFx (FLEURS average)

Issues:

  1. README V2 claims "38× RTFx" but benchmark shows "37.8×" (minor rounding, acceptable)
  2. README V3 claims "41× RTFx" but which benchmark? LibriSpeech shows 25.7× while FLEURS shows 41.1×
  3. No CPU benchmark data in BENCHMARK_RESULTS.md to support the "5-8× RTFx (CPU)" claim

Recommendation:

  • Clarify which benchmark the README performance numbers refer to
  • Add footnote explaining dataset differences in RTFx
  • Add CPU benchmark data or remove unsupported CPU claims

3. Linux NPU Support Claims

README.md (line 39) states:

> **Note for Linux users:** ... Linux NPU has not been tested yet.

Later (line 228):

> **Note:** Linux NPU support has not been tested yet.

Yet the README presents Linux as a fully supported platform in the overview (line 12):

- **Cross-platform**: Windows 10/11, Linux (Ubuntu 22.04+, kernel 6.6+)

Issue: The messaging is confusing. Is Linux NPU supported but untested, or not supported?

Recommendation: Clarify the Linux support status. Consider:

  • "Linux: CPU-only (tested), NPU support experimental (driver available but untested)"
  • Or move Linux to "Experimental/Untested Platforms" section

4. Documentation Link Issues

README.md (line 315) links to:

- **Documentation**: [docs/](documentation/)

But the path should be documentation/ not docs/. This is a dead link.

README.md (line 135) references:

See [FLEURS_BENCHMARK.md](FLEURS_BENCHMARK.md) for detailed...

This file doesn't exist in the PR. Either:

  • The file needs to be added
  • The reference should be removed
  • The reference should point to an existing file

5. claude.md vs README.md Conflicts

claude.md (line 28) states:

**Current Results:** 1.27% WER (better than FluidAudio's 2.2% baseline)

But README.md and BENCHMARK_RESULTS.md show 2.76-2.87% WER.

Issue: The claude.md file appears to be outdated and contains stale information that contradicts the new documentation.

Recommendation: Update claude.md to reflect current benchmark results or add a note that it contains development history/old results.


6. Minor Issues

Typo in BENCHMARK_RESULTS.md (line 58):

**Dataset**: FLEURS (Federated Learning Evaluation Representation United States)

The acronym expansion is incorrect. FLEURS stands for "Few-shot Learning Evaluation of Universal Representations of Speech", not "Federated Learning Evaluation Representation United States".

Citation year (README.md line 294):

year={2025}

Should this be 2025 or 2024? The project appears to have been developed in 2024-2025.


📋 Best Practices Recommendations

1. Version Consistency

Consider adding a VERSION file or using semantic versioning to track which benchmarks correspond to which release.

2. Benchmark Reproducibility

Add date/timestamp to benchmark results so users know when they were run and can assess if they're current.

3. Documentation Maintenance

Consider adding a documentation review checklist to your PR process to catch inconsistencies like these.

4. Benchmark Data Sources

Consider adding machine-readable benchmark data (JSON/CSV) alongside the markdown tables for easier verification and tracking.
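
For example, a minimal sketch of what such a machine-readable companion could look like; the field names are illustrative, and the numbers simply mirror the V2 LibriSpeech figures quoted elsewhere in this thread:

```python
# Sketch: write the headline V2 LibriSpeech numbers as JSON next to the markdown table.
import json
from pathlib import Path

results = {
    "model": "parakeet-v2",
    "dataset": "librispeech-test-clean",
    "device": "NPU",
    "wer_percent": 2.87,
    "rtfx": 37.8,
    "files": 2620,
}
Path("benchmark_results.json").write_text(json.dumps(results, indent=2), encoding="utf-8")
```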


🔒 Security Review

No security concerns identified:

  • Proper attribution of licenses
  • No credentials or secrets exposed
  • Safe download instructions with official sources only
  • Appropriate warnings about NPU driver installation

🧪 Test Coverage

Not applicable - This PR only modifies documentation files. No code changes to test.

However, the README claims tests can be run but there's no mention of:

  • Test suite existence
  • How to run tests
  • Expected test coverage

Recommendation: Consider adding a "Testing" section if tests exist.


📝 Recommendations Priority

High Priority (Must Fix)

  1. ✅ Resolve WER inconsistency (2.76% vs 2.87%)
  2. ✅ Fix broken documentation link (docs/ vs documentation/)
  3. ✅ Add missing FLEURS_BENCHMARK.md or remove reference
  4. ✅ Clarify Linux NPU support status

Medium Priority (Should Fix)

  1. ✅ Clarify RTFx performance claims and dataset context
  2. ✅ Fix FLEURS acronym
  3. ✅ Update claude.md to match current benchmarks
  4. ✅ Verify citation year

Low Priority (Nice to Have)

  1. ⚪ Add CPU benchmark data
  2. ⚪ Add benchmark timestamps
  3. ⚪ Add testing section if applicable

✨ Overall Assessment

Score: 8/10

This is an excellent documentation update that significantly improves the project's accessibility and professionalism. The README transformation from a technical WIP to a user-friendly guide is well-executed. The addition of benchmark results and license documentation shows good open-source practices.

However, the data inconsistencies between files are concerning and must be addressed before merge. These aren't just formatting issues—they affect the project's credibility and users' ability to set accurate performance expectations.

Recommendation: Request Changes

  • Fix the critical data inconsistencies
  • Resolve broken links
  • Clarify Linux support status

Once these issues are addressed, this will be a great addition to the project.


Review completed using repository guidelines from claude.md

- **Description**: C++ package manager for dependency management

### CMake
- **License**: BSD-3-Clause
Member

overkill now.

just list nvidia parakeet, whisper and openvino

Comprehensive benchmark results for eddy ASR on LibriSpeech test-clean and FLEURS multilingual datasets.

**Hardware**: Intel Core Ultra 7 155H (Meteor Lake) with Intel AI Boost NPU
**Software**: OpenVINO 2025.x
Member

list actual version

claude bot commented Nov 4, 2025

Pull Request Review: Update README

Overview

This PR significantly improves the documentation and adds valuable supplementary materials. The changes transform the README from a WIP document to a comprehensive, production-ready guide. Overall, this is a strong contribution that enhances the project's usability and professionalism.


Positive Highlights

1. Documentation Quality ✅

  • Excellent transformation: From sparse WIP README to comprehensive, user-friendly documentation
  • Clear structure: Well-organized sections with Quick Start, Models, Architecture, and Troubleshooting
  • Professional presentation: Good use of badges, code examples, and formatting
  • Accessibility: Multiple entry points (Python, C++, benchmarks) cater to different user levels

2. Benchmark Results ✅

  • Comprehensive data: BENCHMARK_RESULTS.md provides detailed performance metrics
  • Good comparisons: Clear benchmarks against FluidAudio (CoreML) baseline
  • Multilingual coverage: Impressive FLEURS benchmark with 24 languages
  • Transparency: Honest reporting of accuracy delta vs reference (4.3% higher WER)

3. Legal Compliance ✅

  • Third-party licenses: Proper attribution for dependencies, models, and datasets
  • License clarity: Clear licensing information (Apache 2.0, CC-BY-4.0, LGPL)
  • Attribution requirements: Explicitly documented for downstream users

4. New Benchmark Script ✅

  • Whisper comparison: benchmark_whisper_ov.py enables direct comparison with Whisper
  • Good structure: Clean code with separate functions for PyTorch vs OpenVINO
  • User-friendly: Auto-download functionality and clear error messages

Issues & Recommendations

1. Inconsistency in WER Reporting ⚠️

Issue: The README and BENCHMARK_RESULTS.md report different WER values for the same model/dataset:

  • README line 98: "WER: 2.76% (LibriSpeech test-clean)"
  • BENCHMARK_RESULTS.md line 19: "Average WER | 2.87%"
  • README line 107: "WER: 6.09% English"
  • BENCHMARK_RESULTS.md line 38: "Average WER | 3.7%"

Recommendation:

  • Verify which numbers are correct
  • Ensure consistency across all documentation
  • If these are different metrics (e.g., average vs median), clarify in the text

2. Hardcoded Test Files in benchmark_whisper_ov.py ⚠️

Issue (lines 162-167):

test_files = [
    "test_0011.wav",
    "test_0021.wav",
    "test_0037.wav",
    "test_1089_134686_0012.wav"
]

Problems:

  • No path information (assumes current directory)
  • No fallback if files don't exist
  • Users can't easily run the script without these specific files

Recommendation:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--audio-files', nargs='+', required=True, help='Audio files to benchmark')
args = parser.parse_args()
test_files = args.audio_files

Or provide a default path to LibriSpeech samples with a clear error message.
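
For instance, a hedged sketch of that fallback, using the `--audio-files` flag proposed above; the default sample directory is an assumption about where samples might live:

```python
# Sketch: fall back to a default sample directory when no files are passed.
import sys
from pathlib import Path

DEFAULT_SAMPLE_DIR = Path("benchmarks/data/librispeech")  # illustrative location

def resolve_test_files(cli_files):
    if cli_files:
        return [Path(f) for f in cli_files]
    found = sorted(DEFAULT_SAMPLE_DIR.glob("*.wav"))
    if not found:
        sys.exit(
            f"No audio files given and none found in {DEFAULT_SAMPLE_DIR}. "
            "Pass paths via --audio-files or download LibriSpeech samples first."
        )
    return found
```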

3. Potential Division by Zero ⚠️

Issue (line 208):

print(f"  Speedup (NPU vs PyTorch):   {pytorch_rtf / ov_rtf_npu:.2f}x faster")

Problem: If ov_rtf_npu is 0 (unlikely but possible), this will crash.

Recommendation:

if ov_rtf_npu and ov_rtf_npu > 0:
    print(f"  Speedup (NPU vs PyTorch):   {pytorch_rtf / ov_rtf_npu:.2f}x faster")

4. Linux NPU Support Clarity ⚠️

Issue (README lines 102-107): The troubleshooting section states "Linux NPU support has not been tested yet" but the main Quick Start section doesn't warn users about this.

Recommendation: Add a note in the Quick Start section:

> **Note**: NPU support on Linux is currently untested. CPU inference is recommended for Linux users at this time.

5. Missing Error Handling in download_whisper_ov_model() ⚠️

Issue (lines 142-158): No try/except around snapshot_download, which can fail due to network issues, authentication, or storage problems.

Recommendation:

def download_whisper_ov_model():
    try:
        from huggingface_hub import snapshot_download
    except ImportError:
        print("ERROR: huggingface_hub not installed. Install with:")
        print("  pip install huggingface_hub")
        return None
    
    try:
        model_id = "FluidInference/whisper-large-v3-turbo-fp16-ov-npu"
        cache_dir = Path.home() / ".cache" / "eddy" / "models" / "whisper-large-v3-turbo"
        
        print(f"Downloading {model_id}...")
        model_path = snapshot_download(
            repo_id=model_id,
            cache_dir=cache_dir,
            local_dir=cache_dir / "files",
            local_dir_use_symlinks=False
        )
        
        print(f"Model downloaded to: {model_path}")
        return model_path
    except Exception as e:
        print(f"ERROR downloading model: {e}")
        return None

6. README Model Size Discrepancy ℹ️

Minor issue: The README states Parakeet V3 is "1.1GB" (line 109) but doesn't specify if this is the quantized int8 size or fp16/fp32. Given NPU optimization mentions int8 quantization, clarify this.

Recommendation: "Size: 1.1GB (int8 quantized for NPU)"

7. Broken Reference Link ⚠️

Issue (BENCHMARK_RESULTS.md line 135):

See [FLEURS_BENCHMARK.md](FLEURS_BENCHMARK.md) for detailed FLEURS benchmark methodology

Problem: This file doesn't exist in the PR.

Recommendation: Either add the file or remove the reference.


Security Considerations

No Critical Security Issues ✅

  • No credentials or API keys exposed
  • No unsafe file operations
  • Dependencies are from reputable sources
  • Proper license attribution prevents legal issues

Performance Considerations

Benchmark Results Look Good ✅

  • 37-41× RTFx on NPU is impressive
  • Consistent performance across languages
  • WER values are competitive for real-time ASR
  • Honest about accuracy delta vs reference implementation

Test Coverage

Documentation-Only PR ✅

  • No code changes to test
  • Benchmark scripts are utility scripts (not core library)
  • Would benefit from integration tests in future PRs

Summary

Strengths

  1. Massive improvement in documentation quality
  2. Comprehensive benchmark data with transparency
  3. Proper legal attribution
  4. User-friendly quick start guides

Must Fix Before Merge

  1. WER inconsistencies between README and BENCHMARK_RESULTS.md
  2. Broken link to FLEURS_BENCHMARK.md

Should Fix

  1. Hardcoded test files in benchmark_whisper_ov.py
  2. Missing error handling in download function
  3. Linux NPU support warning in Quick Start

Nice to Have

  1. Command-line arguments for benchmark script
  2. Division by zero safeguards
  3. Model size clarification (int8 vs fp16)

Recommendation

Approve with minor changes requested. The documentation improvements are excellent and add significant value. The issues identified are mostly minor and easy to fix. The WER inconsistencies should be resolved before merge, but the rest can be addressed in follow-up PRs if time is a factor.

Great work on transforming this into a production-ready README! 🚀

"""
Benchmark Whisper OpenVINO (eddy optimized) vs PyTorch Whisper
Compares FluidInference whisper-large-v3-turbo-fp16-ov-npu with OpenAI Whisper
"""
Member

wrong folder?

from pathlib import Path
import librosa

def benchmark_whisper_pytorch(audio_files, model_name="base.en"):
Member

base.en is pretty useless most of the time. run against the whisper v3 turbo model like the openvino model for a 1-1 comparison?

README.md Outdated

### Benchmarks

We also provide [Whisper large-v3-turbo OpenVINO](https://huggingface.co/FluidInference/whisper-large-v3-turbo-fp16-ov-npu) optimized for NPU (16× RTFx). Run `python benchmark_whisper_ov.py` to compare Parakeet vs Whisper performance.
Member

add a small table for rough RTFx for all 3 models and note the intel chip used in the root README.md so users can quickly get an overview

Member

also, very surprised that parakeet is only 2x faster?

Member

which whisper model did you run to get this RTFx


README.md Outdated

See [BENCHMARK_RESULTS.md](BENCHMARK_RESULTS.md) for detailed performance metrics.

## Architecture
Member

this doesn’t make sense. it doesn’t apply to whisper?

claude bot commented Nov 4, 2025

Pull Request Review: Documentation Overhaul

Overview

This PR adds comprehensive documentation including benchmarks, API guides, and third-party licenses. The documentation is well-structured and provides valuable information for users. However, there are several issues that need to be addressed before merging.


Critical Issues

1. Incorrect Benchmark Executable Reference ⚠️

Location: BENCHMARK.md:58-63, CPP_API.md:205

Issue: References to benchmark_librispeech.exe don't exist in the codebase.

# Build the benchmark
cmake --build build --config Release --target benchmark_librispeech

# Run on 25 files
build/examples/cpp/Release/benchmark_librispeech.exe --max-files 25

Evidence:

  • examples/cpp/CMakeLists.txt only defines benchmark_fleurs executable (line 13)
  • No benchmark_librispeech target exists in the build system
  • The actual LibriSpeech benchmarking is done via Python script: benchmarks/benchmark.py

Fix Required:
Replace references to benchmark_librispeech.exe with the correct Python-based approach:

cd benchmarks
uv run benchmark.py --max-files 25

2. Inconsistent Benchmark Claims

Location: BENCHMARK_RESULTS.md:26, README.md:71-73

Issue: The PR claims "eddy NPU provides 3-5× acceleration over PyTorch GPU" but the benchmark data doesn't support this for all models.

Data Analysis:

  • Parakeet V2: 38× (eddy NPU) vs 8.4× (PyTorch GPU) = 4.5× speedup
  • Parakeet V3: 41× (eddy NPU) vs 8.4× (PyTorch GPU) = 4.9× speedup
  • Whisper: 16× (eddy NPU) vs 5.5× (PyTorch GPU) = 2.9× speedup ⚠️

The claim should be "3-5×" but Whisper only achieves 2.9×, which is below the stated range.

Fix Required:
Either adjust the claim to "2.9-4.9× acceleration" or clarify that Parakeet models achieve 4.5-4.9× while Whisper achieves 2.9×.


3. Footnote Reference Issues

Location: README.md:80-81

Issue: Footnote markers ¹ and ² are used but the footnotes are placed ambiguously.

| **Parakeet V2** | 38× RTFx | 8× RTFx | 8.4× RTFx¹ | 2× RTFx² | **4.5× faster** |
| **Parakeet V3** | 41× RTFx | 8× RTFx | 8.4× RTFx¹ | 2.5× RTFx² | **4.9× faster** |

¹ Benchmarked using NeMo parakeet-tdt_ctc-110m as proxy (similar architecture)
² Estimated based on NeMo reference implementations

Issue: Both footnotes appear on separate lines, making it unclear which applies to which column. This creates ambiguity about whether PyTorch GPU values are benchmarked or estimated.

Fix Required:
Improve formatting or add clarifying text:

*Note: PyTorch GPU values (¹) benchmarked using NeMo parakeet-tdt_ctc-110m as proxy. PyTorch CPU values (²) estimated from NeMo reference implementations.*

Documentation Quality Issues

4. Conflicting Documentation Structure

Location: New files vs existing /documentation directory

Issue: The PR adds top-level BENCHMARK.md, BENCHMARK_RESULTS.md, and docs/CPP_API.md, but there's already an existing /documentation directory with:

  • documentation/getting-started.md
  • documentation/performance-benchmarks.md

Concerns:

  • Duplicate/overlapping content (performance benchmarks appear in both places)
  • No clear hierarchy or navigation between documentation sets
  • Risk of documentation becoming stale and inconsistent

Recommendation:

  • Consolidate documentation into a single structure
  • Add a documentation index/table of contents
  • Consider whether performance data should live in /documentation/performance-benchmarks.md or BENCHMARK_RESULTS.md

5. Missing Benchmark Methodology Details

Location: BENCHMARK_RESULTS.md:125-130

Issue: The methodology section mentions implementation details but doesn't explain how the benchmarks were run.

Missing Information:

  • What hardware was used? (only mentions Intel Core Ultra 7 155H at the top)
  • What was the testing date/OpenVINO version?
  • Was this single-run or averaged over multiple runs?
  • What were the environmental conditions (other processes running, etc.)?
  • How were outliers handled?

Recommendation:
Add a "Reproducibility" section with:

## Reproducibility

- **Hardware:** Intel Core Ultra 7 155H (Meteor Lake, 16 cores, 22 threads)
- **OS:** Windows 11 23H2
- **OpenVINO:** 2025.0.0
- **Test Date:** November 2025
- **Runs:** Each file tested once, no outlier filtering
- **Conditions:** Minimal background processes, AC power mode

6. Unclear Language Support Claims

Location: BENCHMARK.md:107-135, README.md:92

Issue: Parakeet V3 language support is listed in two different places with potential inconsistencies.

In BENCHMARK.md (24 languages listed):
English, Spanish, Italian, French, German, Dutch, Russian, Polish, Ukrainian, Slovak, Bulgarian, Finnish, Romanian, Croatian, Czech, Swedish, Estonian, Hungarian, Lithuanian, Danish, Maltese, Slovenian, Latvian, Greek

In README.md (24 languages mentioned):
"24 languages" but then lists: "English, Spanish, Italian, French, German, Dutch, Russian, Polish, Ukrainian, Slovak, Bulgarian, Finnish, Romanian, Croatian, Czech, Swedish, Estonian, Hungarian, Lithuanian, Danish, Maltese, Slovenian, Latvian, Greek"

Count: Both list exactly 24 languages. ✓

However: The language code list in BENCHMARK.md (lines 121-135) should be verified against the FLEURS benchmark results table to ensure all tested languages are properly documented.


7. Python Benchmark Script Inconsistency

Location: BENCHMARK.md:25-29

Issue: References to benchmark_whisper_ov.py that doesn't exist.

# All Models Comparison
Compare Parakeet V2, V3, and Whisper on your hardware:

uv run python benchmarks/benchmark_whisper_ov.py

Evidence:

  • No benchmark_whisper_ov.py file exists in benchmarks/ directory
  • Only benchmarks/benchmark.py and benchmark_fleurs.py exist

Fix Required:
Either:

  1. Remove this section if the script doesn't exist
  2. Correct the filename if it's meant to reference a different script
  3. Note that this script needs to be created

Code Quality & Best Practices

8. Good: Consistent Formatting

✅ All markdown files follow consistent formatting
✅ Tables are well-aligned and readable
✅ Code blocks use appropriate syntax highlighting

9. Good: Comprehensive Coverage

✅ API documentation includes examples
✅ Troubleshooting sections are helpful
✅ Performance metrics are detailed

10. Good: License Documentation

✅ Third-party licenses are properly documented
✅ Attribution requirements are clearly stated
✅ CC-BY-4.0 and other licenses are correctly identified


Performance Considerations

11. Benchmark Data Quality

Positive:

  • Large sample sizes (2,620 files for LibriSpeech, 15,000+ for FLEURS)
  • Industry-standard metrics (WER, RTFx)
  • Multiple device comparisons (NPU, CPU, GPU)

Concerns:

  • PyTorch comparison data appears to be estimated/proxy-based (footnotes ¹ and ²)
  • This should be more clearly disclosed upfront in the performance comparison table
  • Consider adding a disclaimer about estimation methodology

Security Concerns

12. Low Risk: No Security Issues

✅ Documentation-only changes
✅ No executable code modifications
✅ External links use HTTPS
✅ Model sources are properly attributed

Minor Note: The HuggingFace model auto-download feature (mentioned in README) should ensure model integrity verification. Verify that model downloads use checksums/hashes.
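
A minimal sketch of what post-download verification could look like; the expected digest and file path below are placeholders, not values from the actual model repo:

```python
# Sketch: verify a downloaded model file against a pinned SHA-256 digest.
# The expected digest and file path are placeholders, not real values.
import hashlib
from pathlib import Path

def verify_sha256(path: Path, expected_hex: str) -> None:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_hex:
        raise RuntimeError(f"Checksum mismatch for {path}")

# Example (placeholder values):
# verify_sha256(Path.home() / ".cache" / "eddy" / "models" / "encoder.xml", "<pinned digest>")
```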


Test Coverage

13. N/A: Documentation PR

This PR adds documentation only. No code changes means no test coverage concerns.

However: The documentation claims should be validated by actual test runs:

  • Verify LibriSpeech WER values are accurate
  • Verify FLEURS benchmark results
  • Verify RTFx measurements
  • Validate that all documented commands actually work

Recommendations Summary

Must Fix Before Merge:

  1. ❌ Remove/correct benchmark_librispeech.exe references
  2. ❌ Fix benchmark_whisper_ov.py reference or remove section
  3. ⚠️ Clarify PyTorch comparison methodology and disclaimers
  4. ⚠️ Fix acceleration claim (3-5× vs actual 2.9-4.9×)

Should Fix:

  1. 📝 Improve footnote formatting in performance table
  2. 📝 Add reproducibility details to benchmark methodology
  3. 📝 Consolidate documentation structure (top-level vs /documentation/)

Nice to Have:

  1. 📚 Add a documentation index/navigation guide
  2. 📚 Cross-link between related documentation files
  3. 📚 Add "Last Updated" dates to benchmark results

Overall Assessment

Quality: Good documentation with valuable information
Completeness: Comprehensive coverage of features and benchmarks
Accuracy: Several factual errors that must be corrected
Recommendation: Request Changes - Fix critical issues before merge

Positive Aspects:

  • Well-structured and easy to navigate
  • Comprehensive API examples
  • Detailed benchmark data
  • Good troubleshooting sections

Areas for Improvement:

  • Verify all commands and file references
  • Clarify estimation vs measured data
  • Consolidate documentation structure
  • Add reproducibility details

Overall: This is a valuable documentation addition, but needs corrections to ensure accuracy and avoid user confusion. Once the referenced files/commands are fixed, this will be ready to merge.

claude bot commented Nov 4, 2025

Pull Request Review: Documentation Updates

Summary

This PR adds comprehensive documentation including benchmark guides, results, API documentation, and third-party licenses. Total: 811 lines across 5 new files with significant README updates.


Strengths

1. Comprehensive Documentation Coverage

  • Excellent BENCHMARK.md with clear instructions
  • Detailed BENCHMARK_RESULTS.md with real-world metrics
  • Well-structured CPP_API.md with practical examples
  • Proper licensing documentation

2. README Improvements

  • User-friendly with badges and quick start examples
  • Excellent performance comparison table
  • Helpful troubleshooting section
  • Good formatting throughout

3. Benchmark Documentation Quality

  • Clear metric explanations (RTFx, WER, Confidence)
  • Good dataset information
  • Comprehensive language support (24 languages)
  • Practical hardware recommendations

Critical Issues

1. Broken Internal Link

BENCHMARK_RESULTS.md:133 references FLEURS_BENCHMARK.md which does NOT exist.

Action Required: Add the missing file or remove the reference.

2. API Documentation Needs Verification

CPP_API.md shows API examples that need verification:

  • Does eddy::ParakeetASR::create() exist as shown?
  • Are namespace and field names accurate?

Recommendation: Verify against actual code in include/eddy/

3. Whisper Support Confusion

  • README: Whisper is optional, requires OpenVINO GenAI
  • README: Shows Whisper in performance table with numbers
  • BENCHMARK_RESULTS.md: No Whisper results

Recommendation: Either add Whisper results or move to separate section


Major Issues

4. Benchmark Reproducibility

Performance numbers lack reproducibility context:

  • No variance information
  • No raw data links

Recommendation: Add note about expected variance

5. PyTorch Comparison Disclaimer Needed

Comparison uses estimated numbers:

  • Proxy model used (110M vs 600M params)
  • Estimated rather than measured

Recommendation: Add disclaimer or run actual benchmarks

6. Linux NPU Status Unclear

Says not tested but provides installation instructions

Recommendation: Add clear experimental status warning


Minor Issues

  1. Inconsistent path styles (Windows vs cross-platform)
  2. Model cache location not documented
  3. Some code blocks missing language identifiers
  4. Minor grammar improvements needed

Recommendations

Before Merge (Required)

  1. Fix FLEURS_BENCHMARK.md reference
  2. Verify API examples
  3. Clarify Whisper status
  4. Add PyTorch comparison disclaimer
  5. Test examples from docs

Nice to Have

  1. Document model cache location
  2. Add reproducibility notes
  3. Clarify Linux NPU status
  4. Platform-agnostic paths

Overall Assessment

Rating: Good with Required Changes

This PR significantly improves documentation. However, accuracy issues must be resolved:

  1. Fix broken link
  2. Verify API accuracy
  3. Clarify Whisper status
  4. Add comparison disclaimers

Test Coverage: N/A (docs only)
Security: No concerns


Files Reviewed:

  • BENCHMARK.md (224 lines)
  • BENCHMARK_RESULTS.md (135 lines)
  • README.md (+155/-44 lines)
  • ThirdPartyLicenses/README.md (38 lines)
  • docs/CPP_API.md (259 lines)

Reviewed by: Claude Code Assistant

claude bot commented Nov 4, 2025

Pull Request Review: Documentation Update

This PR significantly improves the project documentation by adding comprehensive guides and expanding the README. Overall, this is a valuable contribution that will help users get started and understand the project better.

✅ Strengths

  1. Comprehensive Documentation Structure

    • Added BENCHMARK.md with clear instructions for running benchmarks
    • Added BENCHMARK_RESULTS.md with detailed performance metrics
    • Added CPP_API.md with clear API documentation and examples
    • Added ThirdPartyLicenses/README.md for proper attribution
  2. Improved README.md

    • Much clearer project description and value proposition
    • Better organization with quick-start examples
    • Performance comparison tables are very helpful
    • Good troubleshooting section
  3. Good Code Examples

    • C++ API examples are clear and practical
    • CLI examples cover common use cases
    • Language support is well documented

🔍 Issues & Suggestions

BENCHMARK.md

Line 89: Ambiguous RTFx wording

- **RTFx = 10 means 1 min audio in 6 seconds**

The arithmetic is right (60 seconds / 10 RTFx = 6 seconds), but the phrasing is easy to misread as a typo. Consider spelling it out: "RTFx = 10 means 1 minute of audio is processed in 6 seconds".
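
For reference, the relationship being described, in a few lines (RTFx = audio duration / processing time, so higher means faster):

```python
# RTFx (real-time factor) as used in these docs: audio length divided by processing time.
audio_seconds = 60.0
processing_seconds = 6.0
rtfx = audio_seconds / processing_seconds
print(rtfx)  # 10.0 -> one minute of audio transcribed in six seconds
```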

Line 197-198: Inconsistent command format
The wget/tar commands lack proper error handling. Consider:

# Manual download with error handling
wget https://www.openslr.org/resources/12/test-clean.tar.gz || { echo "Download failed"; exit 1; }
tar -xzf test-clean.tar.gz

BENCHMARK_RESULTS.md

Line 27: Consider adding confidence intervals
The WER and RTFx metrics would benefit from standard deviation or confidence intervals to understand variance.
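
As a concrete form this could take, a small sketch reporting the mean and spread over per-file scores (the values are made up for illustration):

```python
# Sketch: report mean +/- standard deviation instead of a single averaged WER.
import statistics

per_file_wer = [2.1, 3.4, 2.9, 2.6, 2.4]  # illustrative per-file WER values, in percent
mean = statistics.mean(per_file_wer)
stdev = statistics.stdev(per_file_wer)
print(f"WER: {mean:.2f}% ± {stdev:.2f}% over {len(per_file_wer)} files")
```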

Line 47: Missing citation year

- FluidAudio v3 (CoreML, multilingual): 2.6% WER

Add reference to when/where this benchmark was published.

Lines 62-84: Language table formatting
The "Ref WER" column lacks context. Add a note explaining what "Ref" means (reference implementation? baseline?).

README.md

Line 99: Footnote formatting inconsistency
The footnotes ¹ and ² use superscript in table but regular text in explanations. Be consistent.

Lines 101-102: Unverified claims

¹ Benchmarked using NeMo parakeet-tdt_ctc-110m as proxy (similar architecture)
² Estimated based on NeMo reference implementations

Using "proxy" and "estimated" weakens the comparison. Either provide actual benchmarks or clearly mark this section as "estimated" in the table header.

Line 131: Untested claim

> **Note:** Linux NPU support has not been tested yet.

This is honest but should include a call-to-action for community testing.

Line 164: Year inconsistency

year={2025}

We're in 2025, but the project may have started earlier. Verify the correct year.

CPP_API.md

Line 21: Missing error handling in example
The basic example doesn't show error handling, which could mislead beginners:

try {
    auto asr = eddy::ParakeetASR::create("parakeet-v2", "NPU");
    auto result = asr->transcribe("audio.wav");
    std::cout << "Text: " << result.text << std::endl;
} catch (const std::exception& e) {
    std::cerr << "Error: " << e.what() << std::endl;
}

Line 247: Broken links
The example references ../examples/cpp but should verify these paths exist in the repository.

ThirdPartyLicenses/README.md

Line 25: Missing license details
libsndfile and libsamplerate licenses are mentioned but full license text files are not included. Consider adding them or linking to them.

🐛 Potential Issues

  1. Platform-specific paths: Many examples use Windows paths (.exe extensions). Add Linux equivalents or use platform-agnostic notation.

  2. Missing validation: No mention of audio file format requirements (sample rate, channels, bit depth) in BENCHMARK.md.

  3. Inconsistent terminology: "RTFx" vs "RTF" used inconsistently throughout documents.

🔒 Security Concerns

No security issues identified. The documentation properly attributes third-party licenses and doesn't expose sensitive information.

⚡ Performance Considerations

The documentation accurately represents performance characteristics. However:

  • Consider adding memory usage benchmarks
  • Disk space requirements for model downloads should be mentioned upfront

✅ Test Coverage

This PR is documentation-only, so no test coverage changes. However, consider:

  • Adding example code to CI/CD to ensure documentation stays accurate
  • Validating that all command examples actually work

📋 Recommendations

  1. High Priority:

    • Fix the RTFx calculation typo (BENCHMARK.md:89)
    • Add error handling to all code examples
    • Clarify "Ref WER" meaning in tables
    • Add missing license files to ThirdPartyLicenses/
  2. Medium Priority:

    • Add Linux command equivalents throughout
    • Include memory usage benchmarks
    • Add standard deviations to performance metrics
    • Validate all file paths and links
  3. Low Priority:

    • Consider adding a CHANGELOG.md
    • Add visual diagrams for architecture
    • Include audio sample requirements table

🎯 Overall Assessment

Quality: 8/10 - Excellent documentation effort with comprehensive coverage
Completeness: 9/10 - Covers most user needs very well
Accuracy: 7/10 - Some unverified claims and typos need fixing

This PR significantly improves the project's documentation and makes it much more accessible to users. With the suggested fixes, this will be an excellent addition. Great work!


Recommendation: ✅ Approve with minor revisions

cc: @Alex-Wengg

Alex-Wengg merged commit c680b22 into main Nov 5, 2025
2 checks passed