
Stablelm-2-1_6b-chat config extracted from GGUF file differs from source model config #34426

Closed
@Isotr0py

Description


System Info

  • transformers version: 4.46.0
  • Platform: Linux-6.1.85+-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.24.7
  • Safetensors version: 0.4.5
  • Accelerate version: 0.34.2
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.5.0+cu121 (False)
  • Tensorflow version (GPU?): 2.17.0 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.8.5 (cpu)
  • Jax version: 0.4.33
  • JaxLib version: 0.4.33
  • Using distributed or parallel set-up in script?:

Who can help?

@SunMarc
Also cc @VladOS95-cyber since you added GGUF support for StableLM :)

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers import AutoConfig

# Config from the original HF model
config_hf = AutoConfig.from_pretrained("stabilityai/stablelm-2-1_6b-chat")

# Config extracted from a GGUF quantization of the same model
config_gguf = AutoConfig.from_pretrained(
    "Crataco/stablelm-2-1_6b-chat-imatrix-GGUF",
    gguf_file="stablelm-2-1_6b-chat.IQ4_XS.imx.gguf",
)

print(config_hf)
print(config_gguf)

Outputs

StableLmConfig {
  ...
  "use_qkv_bias": true,
  "vocab_size": 100352
}

StableLmConfig {
  ...
  "use_qkv_bias": false,
  "vocab_size": 100352
}
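
As a sanity check, the GGUF file itself can be inspected for QKV bias tensors with the gguf Python package. This is a hedged sketch: it assumes the file is available locally and that llama.cpp's usual blk.{i}.attn_{q,k,v}.bias tensor naming applies to StableLM.

from gguf import GGUFReader  # pip install gguf

reader = GGUFReader("stablelm-2-1_6b-chat.IQ4_XS.imx.gguf")

# If any attention bias tensor is present, the source model was converted
# with QKV biases, so use_qkv_bias should have been extracted as True.
has_qkv_bias = any(
    t.name.endswith(("attn_q.bias", "attn_k.bias", "attn_v.bias"))
    for t in reader.tensors
)
print(f"QKV bias tensors present: {has_qkv_bias}")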

Expected behavior

The stabilityai/stablelm-2-1_6b-chat model has use_qkv_bias=True. However, the config extracted from the stablelm-2-1_6b-chat GGUF file has use_qkv_bias=False, which causes the model to fail to initialize the qkv_proj bias.
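
Until the extraction is fixed, a possible interim workaround (an untested sketch; it assumes from_pretrained honors an explicitly passed config together with gguf_file) is to patch the field before loading the weights:

from transformers import AutoConfig, AutoModelForCausalLM

repo = "Crataco/stablelm-2-1_6b-chat-imatrix-GGUF"
gguf_file = "stablelm-2-1_6b-chat.IQ4_XS.imx.gguf"

# Extract the (incorrect) config from the GGUF file and correct it by hand
config = AutoConfig.from_pretrained(repo, gguf_file=gguf_file)
config.use_qkv_bias = True  # match the source model's config

# Assumption: an explicitly passed config takes precedence over the one
# re-extracted from the GGUF metadata.
model = AutoModelForCausalLM.from_pretrained(repo, gguf_file=gguf_file, config=config)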
