-
Notifications
You must be signed in to change notification settings - Fork 0
add fleur test #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Code Review: FLEURS Multilingual ASR BenchmarkThank you for this comprehensive benchmark implementation! This is a well-structured addition that brings multilingual testing capabilities to the project. Here's my detailed review: ✅ StrengthsCode Quality & Structure
Implementation Quality
🔍 Issues Found🔴 Critical Issues1. Security: Unvalidated Network Downloads (benchmark_fleurs.py:202-250)Severity: HIGH The code downloads files from HuggingFace without proper validation:
Recommendation: # Add checksum validation
EXPECTED_CHECKSUMS = {
"en_us.trans.txt": "sha256:...",
# ...
}
# Verify after download
import hashlib
def verify_file(path, expected_hash):
sha256 = hashlib.sha256()
with open(path, 'rb') as f:
for chunk in iter(lambda: f.read(8192), b''):
sha256.update(chunk)
if sha256.hexdigest() != expected_hash:
raise ValueError(f"Checksum mismatch for {path}")2. Bug: Naive Audio Resampling (benchmark_fleurs.py:382-391)Severity: MEDIUM Linear interpolation for audio resampling introduces artifacts: # This is not production-quality resampling
audio_data = np.interp(
np.linspace(0, len(audio_data), new_length),
np.arange(len(audio_data)),
audio_data
)Impact: May degrade benchmark accuracy for non-16kHz samples Recommendation: Use proper resampling: # Add to dependencies
import librosa
# or
from scipy.signal import resample_poly
# Proper resampling
if sample_rate != 16000:
audio_data = librosa.resample(audio_data, orig_sr=sample_rate, target_sr=16000)3. Bug: Missing Dependency DeclarationThe Python script imports 🟡 Medium Issues4. Performance: Inefficient Text Processing (C++: benchmark_fleurs.cpp:94-115)The // Consider using whisper-normalizer or a regex-based approach
// Current approach is O(n) but with many branches5. Error Handling: Silent Failures (benchmark_fleurs.py:461-463)Errors during processing are caught and counted as "skipped" but:
Recommendation: except Exception as e:
error_log = self.cache_dir / "errors.log"
with open(error_log, 'a') as f:
f.write(f"{sample.sample_id}: {e}\n")
if self.debug:
import traceback
traceback.print_exc()
skipped_count += 16. Code Quality: Hardcoded Model Version (benchmark_fleurs.py:825)blank_token_id=8192, # Parakeet v3This is hardcoded for v3 but the docstring says "Parakeet v2". Ensure consistency or make configurable. 🟢 Minor Issues7. Path Inconsistency (benchmark_fleurs.py:142-145)Windows uses 8. Documentation: Missing RequirementsThe script shows example usage but doesn't document:
9. Code Smell: Unused Import (benchmark_fleurs.py:22)import time # Not used in the code10. CMakeLists.txt: Missing Comments (examples/cpp/CMakeLists.txt:13-14)The new executable is added without a comment explaining its purpose (while others have context). 🎯 Performance ConsiderationsGood:
Suggestions:
🔒 Security AssessmentConcerns:
Good:
🧪 Test Coverage
|
BrandonWeng
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔥
Uh oh!
There was an error while loading. Please reload this page.