Replies: 1 comment
Hello llama.cpp team,

While I am not an expert in model quantization or llama.cpp internals, I've been looking into issue #11768 concerning the quantization of Llama-Breeze2-8B-Instruct. With the help of AI, I've tried to analyze the problem and suggest some possible troubleshooting steps. I hope this analysis can be of some assistance to the team in resolving this.

Problem Summary: The issue at hand is the failure to quantize the Llama-Breeze2-8B-Instruct model (Hugging Face link: https://huggingface.co/MediaTek-Research/Llama-Breeze2-8B-Instruct) using llama.cpp, as discussed in issue #11768 (https://github.com/ggml-org/llama.cpp/discussions/11768). The model architecture appears to be defined by InternVLChatModel, as seen in the modeling_internvl_chat.py file (https://huggingface.co/MediaTek-Research/Llama-Breeze2-8B-Instruct/blob/main/modeling_internvl_chat.py). The use of InternVLChatModel suggests that this model might not be based on a standard Llama architecture, which could be the root cause of the quantization incompatibility with llama.cpp. The standard llama.cpp quantization tools might not be designed to handle this specific architecture.

Thank you for your time and effort in addressing this issue.
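One way to confirm which architecture a checkpoint declares is to read the `architectures` field of its Hugging Face `config.json`. The following is only a minimal sketch, assuming the model directory has already been downloaded locally; the `KNOWN_ARCHITECTURES` set and both function names are hypothetical stand-ins for illustration and do not reflect the actual list or logic used by llama.cpp's conversion script.

```python
import json
from pathlib import Path

# Hypothetical stand-in for the architectures a converter supports.
# This is NOT the real list used by convert_hf_to_gguf.py.
KNOWN_ARCHITECTURES = {"LlamaForCausalLM"}


def declared_architectures(model_dir):
    """Return the `architectures` list from a Hugging Face config.json."""
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # The field may be missing in some configs; fall back to an empty list.
    return config.get("architectures", [])


def unsupported_architectures(model_dir, known=KNOWN_ARCHITECTURES):
    """List declared architectures that are absent from `known`."""
    return [arch for arch in declared_architectures(model_dir) if arch not in known]
```

Calling `unsupported_architectures("path/to/model")` on a directory whose `config.json` declares `InternVLChatModel` would return that name, which is consistent with the conversion error described in this thread.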
Hello `llama.cpp` community,

I am proposing an enhancement to the `llama.cpp` project to add support for the `InternVLChatModel` architecture. During the model conversion process using the `convert_hf_to_gguf.py` script, I encountered an error indicating that this specific model architecture is not yet supported.

Background: `InternVLChatModel` [...]

Motivation: [...] `InternVLChatModel` would greatly enhance the flexibility and applicability of `llama.cpp` in handling a wider range of state-of-the-art models.

Request: [...] `InternVLChatModel`, please share your thoughts.

Additional Information: [...]

Thank you for considering this proposal. I look forward to collaborating with the community to explore the feasibility of this enhancement.