tag:github.com,2008:https://github.com/Otosaku/NeMoFeatureExtractor-iOS/releases
Release notes from NeMoFeatureExtractor-iOS
2026-02-06T09:16:47Z
tag:github.com,2008:Repository/1151224479/1.0.5
2026-02-06T09:32:04Z
v1.0.5
<p>v1.0.5: clean up debug logging, fix resource loading</p>
<ul>
<li>Simplify mel_filterbank.bin loading (supports both .copy and .process)</li>
<li>Remove debug print statements</li>
<li>Update README with correct org URL and version</li>
</ul>
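<p>A minimal sketch of a lookup that tolerates both resource rules, not the library's actual code. It relies on .process placing files at the bundle root, while .copy of a directory preserves that directory inside the bundle. The helper name <code>locateResource</code> and the directory name "Resources" are assumptions made for illustration.</p>

```swift
import Foundation

/// Hypothetical helper: finds `name.ext` in `bundle`.
/// Checks the bundle root first (the layout produced by .process),
/// then each candidate subdirectory (the layout produced by .copy of a directory).
func locateResource(named name: String,
                    withExtension ext: String,
                    in bundle: Bundle,
                    subdirectories: [String] = ["Resources"]) -> URL? {
    // .process layout: the file sits at the top level of the bundle.
    if let url = bundle.url(forResource: name, withExtension: ext) {
        return url
    }
    // .copy layout: the file sits inside the copied directory.
    for directory in subdirectories {
        if let url = bundle.url(forResource: name,
                                withExtension: ext,
                                subdirectory: directory) {
            return url
        }
    }
    return nil
}
```

<p>Inside a Swift package target that declares resources, the call would look like <code>locateResource(named: "mel_filterbank", withExtension: "bin", in: .module)</code>.</p>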
<p>🤖 Generated with <a href="https://claude.com/claude-code" rel="nofollow">Claude Code</a></p>
<p>Co-Authored-By: Claude <a href="mailto:noreply@anthropic.com">noreply@anthropic.com</a></p>
otosaku-ai
tag:github.com,2008:Repository/1151224479/1.0.4
2026-02-06T09:01:26Z
1.0.4: debug: add logging for resource loading
<p>🤖 Generated with <a href="https://claude.com/claude-code" rel="nofollow">Claude Code</a></p>
<p>Co-Authored-By: Claude <a href="mailto:noreply@anthropic.com">noreply@anthropic.com</a></p>
otosaku-ai
tag:github.com,2008:Repository/1151224479/1.0.3
2026-02-06T08:49:19Z
1.0.3: fix: use .process instead of .copy for resources
<p>Fixes code signing issues on the iOS simulator, where .copy creates<br>
an unrecognized bundle format.</p>
<p>🤖 Generated with <a href="https://claude.com/claude-code" rel="nofollow">Claude Code</a></p>
<p>Co-Authored-By: Claude <a href="mailto:noreply@anthropic.com">noreply@anthropic.com</a></p>
otosaku-ai
tag:github.com,2008:Repository/1151224479/v1.0.2
2026-02-06T08:27:29Z
v1.0.2: Fix frame count formula based on NeMo source code
<p>Analyzed NeMo's FilterbankFeatures.get_seq_len() and forward() methods.</p>
<p>Correct formula:</p>
<ol>
<li>STFT frames = 1 + audio_length // hop_length (torch.stft with center=True)</li>
<li>Output frames = round_up(stft_frames, pad_to) if pad_to > 0, otherwise stft_frames</li>
</ol>
<p>The previous formula, (n + windowSize) / hopLength, was incorrect.</p>
<p>Now all model configs match NeMo exactly:</p>
<ul>
<li>VAD: [80, 52] == [80, 52] (pad_to=2)</li>
<li>Speaker: [80, 64] == [80, 64] (pad_to=16)</li>
<li>ASR: [80, 51] == [80, 51] (pad_to=0)</li>
</ul>
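<p>The two-step formula can be sketched as follows. This is an illustration rather than the library's implementation: the function name <code>melFrameCount</code> is hypothetical, and the input of 8000 samples with a hop length of 160 is an assumption chosen because it reproduces the three frame counts listed above.</p>

```swift
/// Hypothetical helper: frame count of a centered STFT, optionally
/// padded up to the next multiple of `padTo`.
func melFrameCount(sampleCount: Int, hopLength: Int, padTo: Int) -> Int {
    precondition(sampleCount >= 0 && hopLength > 0 && padTo >= 0)
    // Step 1: with center padding, the STFT yields 1 + n // hop frames.
    let stftFrames = 1 + sampleCount / hopLength
    // Step 2: padTo == 0 disables padding; otherwise round up to a multiple.
    guard padTo > 0 else { return stftFrames }
    return (stftFrames + padTo - 1) / padTo * padTo
}

// Assumed input: 8000 samples, hop length 160 (not stated in these notes).
let vadFrames = melFrameCount(sampleCount: 8000, hopLength: 160, padTo: 2)       // 52
let speakerFrames = melFrameCount(sampleCount: 8000, hopLength: 160, padTo: 16)  // 64
let asrFrames = melFrameCount(sampleCount: 8000, hopLength: 160, padTo: 0)       // 51
```

<p>Under the same assumed input and an assumed window size of 400, the earlier expression (8000 + 400) / 160 evaluates to 52 in integer arithmetic, which agrees with the VAD case only.</p>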
<p>Also improved tests:</p>
<ul>
<li>Added strict frame count equality checks</li>
<li>Updated tolerances to 1e-4 max diff, 1e-5 avg diff</li>
</ul>
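<p>A comparison of this kind might look like the sketch below. It is not the project's test code; the function name <code>absoluteDifference</code> is hypothetical, and only the two thresholds (1e-4 and 1e-5) come from the notes above.</p>

```swift
/// Hypothetical helper: maximum and mean absolute difference between two
/// matrices. Returns nil when the shapes differ or the input is empty,
/// so a shape mismatch cannot pass as a small numeric difference.
func absoluteDifference(_ a: [[Float]], _ b: [[Float]]) -> (max: Float, mean: Float)? {
    guard a.count == b.count else { return nil }
    var maxDiff: Float = 0
    var sum: Double = 0
    var count = 0
    for (rowA, rowB) in zip(a, b) {
        // Strict equality of row lengths, i.e. of frame counts.
        guard rowA.count == rowB.count else { return nil }
        for (x, y) in zip(rowA, rowB) {
            let diff = abs(x - y)
            maxDiff = max(maxDiff, diff)
            sum += Double(diff)
            count += 1
        }
    }
    guard count > 0 else { return nil }
    return (maxDiff, Float(sum / Double(count)))
}

/// True when both tolerances from the notes are met.
func matchesReference(_ a: [[Float]], _ b: [[Float]]) -> Bool {
    guard let diff = absoluteDifference(a, b) else { return false }
    return diff.max <= 1e-4 && diff.mean <= 1e-5
}
```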
<p>🤖 Generated with <a href="https://claude.com/claude-code" rel="nofollow">Claude Code</a></p>
<p>Co-Authored-By: Claude <a href="mailto:noreply@anthropic.com">noreply@anthropic.com</a></p>
otosaku-ai
tag:github.com,2008:Repository/1151224479/v1.0.1
2026-02-06T08:18:00Z
v1.0.1: Fix STFT frame count formula to match NeMo exactly
<p>Changed frame count calculation for center padding from:<br>
sampleCount / hopLength + 1 (gave 51 frames)<br>
to:<br>
(sampleCount + windowSize) / hopLength (gives 52 frames)</p>
<p>This matches NeMo's AudioToMelSpectrogramPreprocessor output exactly.<br>
Also updated output frame calculation to use nFrames instead of validFrames.</p>
<p>Test results now show exact shape match:</p>
<ul>
<li>Swift mel shape: [80, 52]</li>
<li>NeMo mel shape: [80, 52]</li>
<li>Max diff: 5.4e-05</li>
</ul>
<p>🤖 Generated with <a href="https://claude.com/claude-code" rel="nofollow">Claude Code</a></p>
<p>Co-Authored-By: Claude <a href="mailto:noreply@anthropic.com">noreply@anthropic.com</a></p>
otosaku-ai
tag:github.com,2008:Repository/1151224479/v1.0.0
2026-02-06T07:47:49Z
Initial release v1.0.0 - NeMoFeatureExtractor for iOS
<p>Swift library for extracting mel-spectrogram features compatible with<br>
NVIDIA NeMo speech models. Features:</p>
<ul>
<li>Exact compatibility with NeMo's feature extraction pipeline</li>
<li>Supports VAD (MarbleNet), Speaker (TitaNet), and ASR models</li>
<li>High performance using Apple's Accelerate framework (vDSP)</li>
<li>Pre-computed mel filterbank from NeMo for maximum accuracy</li>
<li>Output as [[Float]] or MLMultiArray for CoreML inference</li>
<li>Tested against NeMo Python reference (max diff < 6e-05)</li>
</ul>
<p>🤖 Generated with <a href="https://claude.com/claude-code" rel="nofollow">Claude Code</a></p>
<p>Co-Authored-By: Claude <a href="mailto:noreply@anthropic.com">noreply@anthropic.com</a></p>
otosaku-ai