Support various Whisper models with Metal backend (#113)
* fix: rm model_name, executorch #15798
* feat: add model_name param to support various whisper models
* feat: support standard model's FEATURE_SIZE with 80
* docs: update model_name requirement, different FEATURE_SIZE and various model support
| Large V3 Turbo | `openai/whisper-large-v3-turbo` | 809M | **128** | Fast | Default, good balance |
### Mel Features Configuration
The export scripts automatically configure the correct mel feature size based on the model:
- **80 mel features**: Used by all standard models (tiny, base, small, medium, large, large-v2)
- **128 mel features**: Used only by the large-v3 and large-v3-turbo variants
**Important:** The preprocessor must match the model's expected feature size, or you'll encounter tensor shape mismatch errors. The export scripts handle this automatically.
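The model-to-feature-size rule above can be sketched as a small helper. This is an illustrative sketch, not the export scripts' actual implementation, and the function name `mel_feature_size` is hypothetical:

```python
def mel_feature_size(model_name: str) -> int:
    """Return the number of mel features a Whisper checkpoint expects.

    Per the table above: large-v3 and large-v3-turbo use 128 mel
    features; all other variants (tiny through large-v2) use 80.
    """
    # The substring "large-v3" also matches "large-v3-turbo".
    return 128 if "large-v3" in model_name else 80
```

Configuring the preprocessor with this value (rather than hard-coding 80) is what prevents the tensor shape mismatch errors mentioned above when exporting a large-v3 variant.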
### Tokenizer Configuration
**Important Note:** All Whisper models downloaded from HuggingFace now use the updated tokenizer format where:
- Token `50257` = `<|endoftext|>`
- Token `50258` = `<|startoftranscript|>` (used as `decoder_start_token_id`)
The whisper_runner automatically uses `decoder_start_token_id=50258` for all models, so you don't need to worry about tokenizer compatibility when exporting and running any Whisper variant.
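As a rough sketch of what this means for decoding (constant and function names here are illustrative, not taken from the whisper_runner source):

```python
# Special token IDs in the updated HuggingFace tokenizer format.
ENDOFTEXT_ID = 50257          # <|endoftext|>
STARTOFTRANSCRIPT_ID = 50258  # <|startoftranscript|>

def initial_decoder_tokens() -> list[int]:
    """Decoding always begins from <|startoftranscript|> (50258),
    i.e. decoder_start_token_id, regardless of the model variant."""
    return [STARTOFTRANSCRIPT_ID]

def is_stop_token(token_id: int) -> bool:
    """Generation stops when the decoder emits <|endoftext|> (50257)."""
    return token_id == ENDOFTEXT_ID
```

Because these IDs are the same across all Whisper variants in the updated tokenizer format, the same start/stop logic works unchanged for every exported model.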