
Issue running local model on a newly installed Windows 11 machine. #50

Open · Labels: bug Something isn't working
TheGreyRaven opened this issue Aug 22, 2024 · 5 comments

TheGreyRaven commented Aug 22, 2024

Hey!
I have a freshly installed Windows 11 machine with Node 22. When I try to run humanify local <file>.js, I get the following crash:

ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: NVIDIA RTX A500 Laptop GPU (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32
[node-llama-cpp] Using this model ("C:\Users\Raven\.humanifyjs\models\Phi-3.1-mini-4k-instruct-Q4_K_M.gguf") to tokenize text with special tokens and then detokenize it resulted in a different text. There might be an issue with the model or the tokenizer implementation. Using this model may not work as intended
ggml_vulkan: Device memory allocation of size 314576896 failed.
ggml_vulkan: vk::Device::allocateMemory: ErrorOutOfDeviceMemory
ggml_gallocr_reserve_n: failed to allocate NVIDIA RTX A500 Laptop GPU buffer of size 314576896
[node-llama-cpp] llama_new_context_with_model: failed to allocate compute buffers
file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/node_modules/node-llama-cpp/dist/evaluator/LlamaContext/LlamaContext.js:461
                throw new Error("Failed to create context");
                      ^

Error: Failed to create context
    at LlamaContext._create (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/node_modules/node-llama-cpp/dist/evaluator/LlamaContext/LlamaContext.js:461:23)
    at async Object.<anonymous> (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/node_modules/node-llama-cpp/dist/evaluator/LlamaModel/LlamaModel.js:274:24)
    at async withLock (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/node_modules/lifecycle-utils/dist/withLock.js:36:16)
    at async LlamaModel.createContext (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/node_modules/node-llama-cpp/dist/evaluator/LlamaModel/LlamaModel.js:271:16)
    at async llama (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/dist/index.mjs:157:19)
    at async Command.<anonymous> (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/dist/index.mjs:56702:18)

Node.js v22.6.0

I have already downloaded the local model. Any ideas what could be wrong?

jehna (Owner) commented Aug 22, 2024

How much GPU memory does your NVIDIA RTX A500 have? Humanify should run well with 7 GB of GPU memory, but there are no guarantees for a system with less memory than that.

You can, however, use the --disableGpu flag to run the model on your CPU. This may be slower, though.

TheGreyRaven (Author) commented Aug 22, 2024

It has 4 GB of video memory, so that could be the issue. However, even if I run humanify local --disableGpu <file>, I still get the exact same crash:

ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: NVIDIA RTX A500 Laptop GPU (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32
ggml_vulkan: Device memory allocation of size 314576896 failed.
ggml_vulkan: vk::Device::allocateMemory: ErrorOutOfDeviceMemory
ggml_gallocr_reserve_n: failed to allocate NVIDIA RTX A500 Laptop GPU buffer of size 314576896
[node-llama-cpp] llama_new_context_with_model: failed to allocate compute buffers
.......
.......
.......
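For scale, the buffer that ggml_vulkan fails to allocate in both logs is 314576896 bytes, which works out to roughly 300 MiB. A quick sketch of the arithmetic (the conclusion in the comment is my guess, not something stated in the logs):

```typescript
// Size of the GPU buffer that ggml_vulkan failed to allocate (from the log above).
const failedAllocBytes = 314576896;

// Convert to MiB for a sense of scale: ~300 MiB, far below the card's
// 4 GB of VRAM on its own. This suggests the combined footprint (model
// weights plus context buffers) is what exhausts the device, not this
// single allocation.
const failedAllocMiB = failedAllocBytes / 1024 ** 2;
console.log(failedAllocMiB.toFixed(1) + " MiB");
```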

jehna (Owner) commented Aug 24, 2024

Seems that you've found a bug! I'll create a patch for that. Thank you for sending good debug info.

jehna added the bug (Something isn't working) label Aug 24, 2024
jehna added a commit that referenced this issue Aug 24, 2024
This made the local inference try to use the GPU even when the user
provided the `--disableGpu` flag.

Fixes #50
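A minimal sketch of the class of fix the commit describes (hypothetical names; the actual humanify source and its node-llama-cpp wiring may differ): the parsed CLI flag has to be forwarded all the way to the backend selection, otherwise inference silently initializes the GPU backend anyway.

```typescript
// Hypothetical sketch, not the actual humanify code: a --disableGpu flag
// that is parsed but never forwarded means the Vulkan backend still loads.
type GpuSetting = "auto" | false;

function resolveGpuSetting(argv: readonly string[]): GpuSetting {
  // The resolved value must be passed on to the llama initialization;
  // the bug was that it effectively never reached it.
  return argv.includes("--disableGpu") ? false : "auto";
}

// The setting would then be handed to node-llama-cpp at startup, e.g.:
//   const llama = await getLlama({ gpu: resolveGpuSetting(process.argv) });

console.log(resolveGpuSetting(["local", "--disableGpu", "app.js"])); // false
console.log(resolveGpuSetting(["local", "app.js"])); // auto
```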
0xdevalias commented Aug 26, 2024

With the bugfix PR merged now, can this issue be closed?

TheGreyRaven (Author) commented
Glad to help out! For now I worked around it by running humanify through WSL with the NVIDIA drivers installed, and there everything works great!
