With the v1.8.3 precompiled binaries, the BLAS variant crashes.
It prints the usual console output, then crashes right after the "main: processing ..." line:
whisper_init_from_file_with_params_no_state: loading model from 'models\ggml-small-q5_1.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 1
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_init_with_params_no_state: devices = 2
whisper_init_with_params_no_state: backends = 2
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 9
whisper_model_load: qntvr = 1
whisper_model_load: type = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: CPU total size = 189.49 MB
whisper_model_load: model size = 189.49 MB
whisper_backend_init_gpu: device 0: BLAS (type: 3)
whisper_backend_init_gpu: device 1: CPU (type: 0)
whisper_backend_init_gpu: no GPU found
whisper_backend_init: using BLAS backend
whisper_init_state: kv self size = 18.87 MB
whisper_init_state: kv cross size = 56.62 MB
whisper_init_state: kv pad size = 4.72 MB
whisper_init_state: compute buffer (conv) = 22.42 MB
whisper_init_state: compute buffer (encode) = 33.85 MB
whisper_init_state: compute buffer (cross) = 6.20 MB
whisper_init_state: compute buffer (decode) = 97.28 MB
system_info: n_threads = 4 / 4 | WHISPER : COREML = 0 | OPENVINO = 0 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | OPENMP = 1 | REPACK = 1 |
main: processing 'a.flac' (1220376 samples, 76.3 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = he, task = transcribe, timestamps = 1 ...
I also tried newer OpenBLAS DLLs (v0.3.31), both the x64 and the x64-64 builds (no idea what the latter is supposed to be).
Running on Intel 6th gen (Skylake), no dGPU, Windows 8.1.
Any idea what it might be, or how to analyze further?
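One way to narrow this down might be to capture the exit code of the crashing process and decode it. The following is a minimal sketch, not something taken from the project. It assumes the crash surfaces as a Windows NTSTATUS exit code. The helper names `describe_exit_code` and `run_and_report` are hypothetical, and the table lists only a few well-known codes.

```python
import subprocess
import sys

# A few well-known Windows NTSTATUS values that can appear as a process
# exit code after a crash. This table is illustrative, not exhaustive.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC0000139: "STATUS_ENTRYPOINT_NOT_FOUND",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def describe_exit_code(code: int) -> str:
    """Format an exit code as hex and append the NTSTATUS name if known."""
    # Some tools report the code as a signed 32-bit number; normalize it.
    code &= 0xFFFFFFFF
    name = NTSTATUS_NAMES.get(code)
    if name:
        return f"0x{code:08X} ({name})"
    return f"0x{code:08X}"


def run_and_report(argv):
    """Run the given command line (hypothetical helper) and decode its exit code."""
    result = subprocess.run(argv)
    return describe_exit_code(result.returncode)


if __name__ == "__main__":
    # Pass the exact command line that crashes as arguments to this script.
    if len(sys.argv) > 1:
        print(run_and_report(sys.argv[1:]))
```

Saved under any name and run with the crashing command line as its arguments, it prints the decoded code. An illegal-instruction code would point toward a CPU feature mismatch, while a DLL or entry-point code would point toward the OpenBLAS DLLs or the Windows version.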