Currently the best results we can get with whisper.cpp are with CUDA (NVIDIA) or Core ML (macOS).
On Windows there's only OpenBLAS, and it's slow: transcription takes roughly 2× the audio duration (AMD Ryzen 5 4500U, medium model).
When using CTranslate2 on the same machine, transcription runs 2-3× faster than the audio duration, on CPU only!
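For clarity, the speed comparison above can be expressed as a real-time factor (processing time divided by audio duration; lower is better). This is just a sketch using the rough figures quoted in this issue, not measured benchmark data:

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF < 1 means faster than real time; RTF > 1 means slower."""
    return processing_seconds / audio_seconds

# Hypothetical 10-minute (600 s) clip, using the rough figures above:
audio = 600.0
print(real_time_factor(2 * audio, audio))    # OpenBLAS backend: 2.0 (slower than real time)
print(real_time_factor(audio / 2.5, audio))  # CTranslate2: 0.4 (2.5x faster than real time)
```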
Since whisper.cpp recently removed OpenCL support, I think it's important to have a good alternative for Windows users with Intel / AMD CPUs / TPUs.
There are a few different options that could be added:
- oneDNN Execution Provider (ONNX Runtime)
- DirectML Execution Provider (ONNX Runtime)
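To illustrate how such a backend choice could work, here is a sketch of execution-provider fallback logic. The provider names `DmlExecutionProvider` (DirectML) and `DnnlExecutionProvider` (oneDNN) are real ONNX Runtime identifiers; the `available` lists are hardcoded for illustration, where a real program would query `onnxruntime.get_available_providers()`:

```python
def choose_provider(available: list[str]) -> str:
    """Pick a preferred GPU/accelerated backend, falling back to CPU."""
    for provider in ("DmlExecutionProvider", "DnnlExecutionProvider"):
        if provider in available:
            return provider
    return "CPUExecutionProvider"

# Hardcoded availability lists, standing in for onnxruntime.get_available_providers():
print(choose_provider(["DmlExecutionProvider", "CPUExecutionProvider"]))  # DmlExecutionProvider
print(choose_provider(["CPUExecutionProvider"]))                          # CPUExecutionProvider
```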
In addition, CTranslate2 uses ruy (Google's CPU matrix-multiplication library).
Related: ggml-org/ggml#406 (comment)