When loading the model I get the following error: #17
What model are you trying to load? This error is indicative of an incompatible model.
I'm loading this model ----> TheBloke/sqlcoder-GGUF
What version of llama-cpp-python are you using?
llama_cpp_python 0.2.11+cu117
Yeah, just finished downloading and got the same error. There may be something wrong with the model.
Could it simply be that StarCoder models aren't supported with CUDA? Not sure.
This does seem to be the case: ggerganov/llama.cpp#3187 (comment)
Thank you for your help. Where can I see supported models for CUDA?
The only thing I can find so far is this in the source code:
llm_load_tensors: ggml ctx size = 0.16 MB
llm_load_tensors: using CUDA for GPU acceleration
llm_load_tensors: mem required = 9363.40 MB
llm_load_tensors: offloading 6 repeating layers to GPU
llm_load_tensors: offloaded 6/43 layers to GPU
llm_load_tensors: VRAM used: 1637.37 MB
.................................................................................
GGML_ASSERT: D:\a\llama-cpp-python-cuBLAS-wheels\llama-cpp-python-cuBLAS-wheels\vendor\llama.cpp\ggml-cuda.cu:5925: false
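Based on the linked llama.cpp discussion, the assert fires only when repeating layers of a StarCoder-architecture model are offloaded to CUDA, so a practical workaround is to keep such models on the CPU by passing `n_gpu_layers=0`. The sketch below is a minimal helper illustrating that idea; the set of unsupported architectures and the model path are assumptions for illustration, not an official llama-cpp-python API.

```python
# Hedged sketch: fall back to CPU-only inference for model architectures
# that llama.cpp could not offload to CUDA at the time of this issue.
# The architecture set below is an assumption drawn from
# ggerganov/llama.cpp#3187, not an authoritative list.
CUDA_OFFLOAD_UNSUPPORTED = {"starcoder"}

def pick_n_gpu_layers(architecture: str, requested_layers: int) -> int:
    """Return 0 (no GPU offload) for architectures without CUDA support,
    otherwise pass the requested layer count through unchanged."""
    if architecture.lower() in CUDA_OFFLOAD_UNSUPPORTED:
        return 0
    return requested_layers

# Hypothetical usage with llama-cpp-python (model path is illustrative):
# from llama_cpp import Llama
# llm = Llama(model_path="sqlcoder.Q4_K_M.gguf",
#             n_gpu_layers=pick_n_gpu_layers("starcoder", 6))
```

With `n_gpu_layers=0` the model loads entirely into system RAM, which avoids the CUDA assert at the cost of slower generation.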