Prerequisites
I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
When I try to convert a bnb (bitsandbytes) model to gguf, I get ValueError: Can not map tensor 'Model.layers.0.mlp.down_proj.weight.absmax'. Looking at the documentation, the current version of llama.cpp does not support converting already-quantized models to gguf. In production there are many cases where quantized models (AWQ, BNB, GPTQ) need to be converted to gguf for ollama deployment, so hopefully the authors can add this capability.
Motivation
Conversion of quantized models to gguf is not currently supported. Once this feature is implemented, users could convert any quantized model to gguf and deploy it through ollama.
Possible Implementation
No response
I am not sure I understand the use case for this. Why would you want to convert an already quantized model to GGUF? llama.cpp doesn't support AWQ, BNB or GPTQ, so to create a GGUF you would have to requantize to a supported quantization method, losing more information on top of what was already lost the first time the model was quantized to AWQ, BNB or GPTQ.
In what scenario is the FP16 model not available, such that you could not simply use that model to create the GGUF version?
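For anyone hitting the same error in the meantime, a possible workaround is to dequantize the bitsandbytes checkpoint back to FP16 with transformers and then run the usual GGUF conversion. This is only a rough sketch, not part of llama.cpp: the model id and output paths below are placeholders, and it assumes a recent transformers release where model.dequantize() is available for bitsandbytes-quantized models.

```python
# Sketch: recover an FP16 checkpoint from a bitsandbytes-quantized model,
# then convert it with llama.cpp's convert_hf_to_gguf.py as usual.
# Assumes bitsandbytes is installed and a recent transformers release that
# supports model.dequantize(); "your-org/your-bnb-model" is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-bnb-model"

# Loads the checkpoint with its saved bnb quantization config applied.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Undo the bnb 4/8-bit quantization and save a plain HF checkpoint on disk.
model = model.dequantize()
model.save_pretrained("fp16-model")

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.save_pretrained("fp16-model")

# Then, from the llama.cpp repo, follow the normal conversion path:
#   python convert_hf_to_gguf.py fp16-model --outfile model-f16.gguf
#   ./llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```

Note that this only restores the values the bnb checkpoint still carries; the precision lost in the original quantization is not recovered, which is the same caveat as requantizing directly.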