
New ggml llamacpp file format support #4

Closed
ParisNeo opened this issue May 17, 2023 · 3 comments
@ParisNeo

Hi and thanks for this beautiful work.

Are you planning on supporting version 2 of the llama.cpp file format? I want to add the OpenAssistant model to my GPT4ALL-ui and can't find a Python binding that supports it.

Here is TheBloke's version of the model:
https://huggingface.co/TheBloke/OpenAssistant-SFT-7-Llama-30B-GGML/tree/main

Best regards

@marella
Owner

marella commented May 17, 2023

Hi,

Support for the new quantization format was discussed previously here. Since it is a breaking change, I haven't updated it yet. There are a few more features I'm adding to the Python library, after which I will update the C backend.

I will look into adding llama.cpp support over the weekend, but I don't know if I will be able to finish it this week. I'm also waiting for the MPT PR to be merged so that I can add that as well.

Thanks for the model link. Do you have a link to a smaller LLaMA-7B model quantized in the latest format that I can use for testing? My machine doesn't have enough RAM to run larger models.

@ParisNeo
Author

Of course,
TheBloke has you covered. He has all kinds of models in the new format:
https://huggingface.co/TheBloke

His Hugging Face models are well organized. Just search for GGML models in his space and you can find anything you want.

The best one is WizardLM-7B-uncensored-GGML, which you can find here:

https://huggingface.co/TheBloke/WizardLM-7B-uncensored-GGML

@marella
Owner

marella commented May 21, 2023

This is released in the latest version, 0.2.0.

It now supports LLaMA and MPT models. It also includes the most recent breaking change: ggerganov/llama.cpp#1508
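For anyone unsure which format a given model file uses: the format magic (and, for versioned formats, the version number) sits in the first eight bytes of the file. A minimal sketch of a header check, assuming the magic constants from the llama.cpp sources of that era; the helper name is my own:

```python
import struct

# GGML-family magic values from the llama.cpp sources, read as a
# little-endian uint32 from the start of the file:
MAGIC_GGML = 0x67676D6C  # 'ggml' - original, unversioned format
MAGIC_GGMF = 0x67676D66  # 'ggmf' - versioned format
MAGIC_GGJT = 0x67676A74  # 'ggjt' - mmap-able format; the quantization
                         #          change bumped its version number

def read_ggml_header(data: bytes):
    """Return (format_name, version) from a GGML model file header.

    The version is None for the original unversioned 'ggml' format.
    """
    (magic,) = struct.unpack("<I", data[:4])
    if magic == MAGIC_GGML:
        return "ggml", None
    if magic in (MAGIC_GGMF, MAGIC_GGJT):
        (version,) = struct.unpack("<I", data[4:8])
        name = "ggmf" if magic == MAGIC_GGMF else "ggjt"
        return name, version
    raise ValueError(f"not a GGML file (magic=0x{magic:08X})")

# Example: a header in the post-change format reads as ('ggjt', 2).
header = struct.pack("<II", MAGIC_GGJT, 2)
print(read_ggml_header(header))  # -> ('ggjt', 2)
```

In practice you would pass the first eight bytes of the `.bin` file, e.g. `read_ggml_header(open(path, "rb").read(8))`.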

@marella marella closed this as completed May 21, 2023