Non-BLAS works, BLAS crashes. CPU only. #3654

@pegleGrot

Using the v1.8.3 precompiled binaries, the BLAS variant crashes, while the non-BLAS variant works.
It prints the usual console output, then crashes right after the "main: processing ..." line:

whisper_init_from_file_with_params_no_state: loading model from 'models\ggml-small-q5_1.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 1
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_init_with_params_no_state: devices = 2
whisper_init_with_params_no_state: backends = 2
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 9
whisper_model_load: qntvr = 1
whisper_model_load: type = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: CPU total size = 189.49 MB
whisper_model_load: model size = 189.49 MB
whisper_backend_init_gpu: device 0: BLAS (type: 3)
whisper_backend_init_gpu: device 1: CPU (type: 0)
whisper_backend_init_gpu: no GPU found
whisper_backend_init: using BLAS backend
whisper_init_state: kv self size = 18.87 MB
whisper_init_state: kv cross size = 56.62 MB
whisper_init_state: kv pad size = 4.72 MB
whisper_init_state: compute buffer (conv) = 22.42 MB
whisper_init_state: compute buffer (encode) = 33.85 MB
whisper_init_state: compute buffer (cross) = 6.20 MB
whisper_init_state: compute buffer (decode) = 97.28 MB

system_info: n_threads = 4 / 4 | WHISPER : COREML = 0 | OPENVINO = 0 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | OPENMP = 1 | REPACK = 1 |

main: processing 'a.flac' (1220376 samples, 76.3 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = he, task = transcribe, timestamps = 1 ...

I also tried newer OpenBLAS DLLs (v0.3.31), both the x64 and the x64-64 builds (no idea what the latter is supposed to be).

Running on Intel 6th gen (Skylake), no dGPU, Windows 8.1.

Any idea what it might be, or how to analyze further?
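
One possible first step, offered as a sketch rather than a diagnosis: on Windows a crashed process usually exits with an NTSTATUS value, and that value can hint at whether the failure is an access violation, an unsupported CPU instruction, or a DLL loading problem. The script below runs whatever command line it is given and decodes the exit code. The function names `describe_exit_code` and `run_and_report` are made up for this example, and the lookup table covers only a handful of well-known codes.

```python
import subprocess
import sys

# A few well-known Windows NTSTATUS values that can appear as process exit codes.
# This table is deliberately small; it is not a complete list.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC0000139: "STATUS_ENTRYPOINT_NOT_FOUND",
}


def describe_exit_code(code: int) -> str:
    """Render a process exit code, naming it if it is a known NTSTATUS value."""
    # Normalize to unsigned 32 bits in case the code was reported as negative.
    unsigned = code & 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(unsigned)
    if name is not None:
        return f"0x{unsigned:08X} ({name})"
    if unsigned >= 0xC0000000:
        # NTSTATUS values with the top two bits set have error severity.
        return f"0x{unsigned:08X} (unrecognized NTSTATUS error)"
    return f"{unsigned} (ordinary exit code)"


def run_and_report(argv: list[str]) -> str:
    """Run the given command (hypothetical helper) and describe its exit code."""
    result = subprocess.run(argv)
    return describe_exit_code(result.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the exact command line that crashes as arguments to this script.
    print(run_and_report(sys.argv[1:]))
```

Saved under any file name and invoked with the failing command line as its arguments, it prints the decoded exit code once the program terminates. The Windows Event Viewer (Windows Logs > Application) normally records the same exception code together with the faulting module name, which may show whether the crash happens inside the OpenBLAS DLL or in the whisper binary itself.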
