
llama_model_load: error loading model: vk::PhysicalDevice::createDevice: ErrorDeviceLost #9767

Open
BreakShoot opened this issue Oct 6, 2024 · 0 comments
Labels
bug-unconfirmed high severity Used to report high severity bugs in llama.cpp (Malfunctioning hinder important workflow) stale

Comments

BreakShoot commented Oct 6, 2024

What happened?

How to reproduce

  1. Install the latest Vulkan release (llama-b3889-bin-win-vulkan-x64)
  2. Download the model
  3. Place the model in models/
  4. Run the example command:
    llama-cli -m models/jina.gguf --prompt "Once upon a time"
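The steps above can be condensed into a shell sketch. This assumes `llama-cli` from a llama.cpp Vulkan build (e.g. llama-b3889-bin-win-vulkan-x64) is on PATH; the model filename `models/jina.gguf` is taken from the report, and the guard around the missing binary is added for illustration:

```shell
# Reproduction sketch: assumes llama-cli from a llama.cpp Vulkan
# build is on PATH and a jina-bert-v2 GGUF has already been downloaded.
mkdir -p models
# ...place the downloaded GGUF at models/jina.gguf...
if command -v llama-cli >/dev/null 2>&1; then
    # This is the command that triggers ErrorDeviceLost in the report.
    llama-cli -m models/jina.gguf --prompt "Once upon a time"
else
    echo "llama-cli not found on PATH; install a Vulkan release first"
fi
```

As a diagnostic, trying a CPU-only build of the same llama.cpp version (or disabling GPU offload with `-ngl 0`, though the Vulkan backend may still create a device during backend init) can help confirm whether the model file itself loads correctly and the failure is specific to Vulkan device creation.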

Output

build: 3889 (b6d6c528) with MSVC 19.29.30154.0 for x64
main: llama backend init
main: load the model and apply lora adapter, if any
llama_model_loader: loaded meta data with 31 key-value pairs and 196 tensors from models/jina.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = jina-bert-v2
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Jina Bert Implementation
llama_model_loader: - kv   3:                       general.organization str              = Jinaai
llama_model_loader: - kv   4:                         general.size_label str              = 137M
llama_model_loader: - kv   5:                            general.license str              = apache-2.0
llama_model_loader: - kv   6:                               general.tags arr[str,4]       = ["sentence-transformers", "feature-ex...
llama_model_loader: - kv   7:                          general.languages arr[str,1]       = ["en"]
llama_model_loader: - kv   8:                           general.datasets arr[str,1]       = ["allenai/c4"]
llama_model_loader: - kv   9:                   jina-bert-v2.block_count u32              = 12
llama_model_loader: - kv  10:                jina-bert-v2.context_length u32              = 8192
llama_model_loader: - kv  11:              jina-bert-v2.embedding_length u32              = 768
llama_model_loader: - kv  12:           jina-bert-v2.feed_forward_length u32              = 3072
llama_model_loader: - kv  13:          jina-bert-v2.attention.head_count u32              = 12
llama_model_loader: - kv  14:  jina-bert-v2.attention.layer_norm_epsilon f32              = 0.000000
llama_model_loader: - kv  15:                          general.file_type u32              = 17
llama_model_loader: - kv  16:              jina-bert-v2.attention.causal bool             = false
llama_model_loader: - kv  17:                  jina-bert-v2.pooling_type u32              = 1
llama_model_loader: - kv  18:            tokenizer.ggml.token_type_count u32              = 2
llama_model_loader: - kv  19:                       tokenizer.ggml.model str              = bert
llama_model_loader: - kv  20:                         tokenizer.ggml.pre str              = jina-v2-en
llama_model_loader: - kv  21:                      tokenizer.ggml.tokens arr[str,30528]   = ["[PAD]", "[unused0]", "[unused1]", "...
llama_model_loader: - kv  22:                  tokenizer.ggml.token_type arr[i32,30528]   = [3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  23:            tokenizer.ggml.unknown_token_id u32              = 100
llama_model_loader: - kv  24:          tokenizer.ggml.seperator_token_id u32              = 102
llama_model_loader: - kv  25:            tokenizer.ggml.padding_token_id u32              = 0
llama_model_loader: - kv  26:                tokenizer.ggml.cls_token_id u32              = 101
llama_model_loader: - kv  27:               tokenizer.ggml.mask_token_id u32              = 103
llama_model_loader: - kv  28:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  29:               tokenizer.ggml.add_eos_token bool             = true
llama_model_loader: - kv  30:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:  111 tensors
llama_model_loader: - type q5_K:   72 tensors
llama_model_loader: - type q6_K:   13 tensors
llm_load_vocab: special tokens cache size = 5
llm_load_vocab: token to piece cache size = 0.2032 MB
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = jina-bert-v2
llm_load_print_meta: vocab type       = WPM
llm_load_print_meta: n_vocab          = 30528
llm_load_print_meta: n_merges         = 0
llm_load_print_meta: vocab_only       = 0
llm_load_print_meta: n_ctx_train      = 8192
llm_load_print_meta: n_embd           = 768
llm_load_print_meta: n_layer          = 12
llm_load_print_meta: n_head           = 12
llm_load_print_meta: n_head_kv        = 12
llm_load_print_meta: n_rot            = 64
llm_load_print_meta: n_swa            = 0
llm_load_print_meta: n_embd_head_k    = 64
llm_load_print_meta: n_embd_head_v    = 64
llm_load_print_meta: n_gqa            = 1
llm_load_print_meta: n_embd_k_gqa     = 768
llm_load_print_meta: n_embd_v_gqa     = 768
llm_load_print_meta: f_norm_eps       = 1.0e-12
llm_load_print_meta: f_norm_rms_eps   = 0.0e+00
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 8.0e+00
llm_load_print_meta: f_logit_scale    = 0.0e+00
llm_load_print_meta: n_ff             = 3072
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: causal attn      = 0
llm_load_print_meta: pooling type     = 1
llm_load_print_meta: rope type        = -1
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 10000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn  = 8192
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: ssm_d_conv       = 0
llm_load_print_meta: ssm_d_inner      = 0
llm_load_print_meta: ssm_d_state      = 0
llm_load_print_meta: ssm_dt_rank      = 0
llm_load_print_meta: ssm_dt_b_c_rms   = 0
llm_load_print_meta: model type       = 137M
llm_load_print_meta: model ftype      = Q5_K - Medium
llm_load_print_meta: model params     = 136.78 M
llm_load_print_meta: model size       = 95.16 MiB (5.84 BPW)
llm_load_print_meta: general.name     = Jina Bert Implementation
llm_load_print_meta: UNK token        = 100 '[UNK]'
llm_load_print_meta: SEP token        = 102 '[SEP]'
llm_load_print_meta: PAD token        = 0 '[PAD]'
llm_load_print_meta: CLS token        = 101 '[CLS]'
llm_load_print_meta: MASK token       = 103 '[MASK]'
llm_load_print_meta: LF token         = 0 '[PAD]'
llm_load_print_meta: max token length = 22
ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: AMD Radeon RX 6950 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | warp size: 64
llama_model_load: error loading model: vk::PhysicalDevice::createDevice: ErrorDeviceLost
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: failed to load model 'models/jina.gguf'
main: error: unable to load model

Name and Version

llama-b3889-bin-win-vulkan-x64

What operating system are you seeing the problem on?

Windows 11

@BreakShoot BreakShoot added bug-unconfirmed high severity Used to report high severity bugs in llama.cpp (Malfunctioning hinder important workflow) labels Oct 6, 2024
@github-actions github-actions bot added the stale label Nov 6, 2024