
Issue running local model on a newly installed Windows 11 machine. #50

Open · Labels: bug Something isn't working
TheGreyRaven opened this issue Aug 22, 2024 · 5 comments

TheGreyRaven commented Aug 22, 2024

Hey!
I have a freshly installed Windows 11 machine with Node 22. When I try to run humanify local <file>.js, I get the following crash:

ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: NVIDIA RTX A500 Laptop GPU (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32
[node-llama-cpp] Using this model ("C:\Users\Raven\.humanifyjs\models\Phi-3.1-mini-4k-instruct-Q4_K_M.gguf") to tokenize text with special tokens and then detokenize it resulted in a different text. There might be an issue with the model or the tokenizer implementation. Using this model may not work as intended
ggml_vulkan: Device memory allocation of size 314576896 failed.
ggml_vulkan: vk::Device::allocateMemory: ErrorOutOfDeviceMemory
ggml_gallocr_reserve_n: failed to allocate NVIDIA RTX A500 Laptop GPU buffer of size 314576896
[node-llama-cpp] llama_new_context_with_model: failed to allocate compute buffers
file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/node_modules/node-llama-cpp/dist/evaluator/LlamaContext/LlamaContext.js:461
                throw new Error("Failed to create context");
                      ^

Error: Failed to create context
    at LlamaContext._create (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/node_modules/node-llama-cpp/dist/evaluator/LlamaContext/LlamaContext.js:461:23)
    at async Object.<anonymous> (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/node_modules/node-llama-cpp/dist/evaluator/LlamaModel/LlamaModel.js:274:24)
    at async withLock (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/node_modules/lifecycle-utils/dist/withLock.js:36:16)
    at async LlamaModel.createContext (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/node_modules/node-llama-cpp/dist/evaluator/LlamaModel/LlamaModel.js:271:16)
    at async llama (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/dist/index.mjs:157:19)
    at async Command.<anonymous> (file:///C:/Users/Raven/AppData/Roaming/npm/node_modules/humanifyjs/dist/index.mjs:56702:18)

Node.js v22.6.0

I have already downloaded the local model. Any ideas what could be wrong?

jehna (Owner) commented Aug 22, 2024

How much GPU memory does your NVIDIA RTX A500 have? Humanify should run well with 7 GB of GPU memory, but there are no guarantees for a system with less memory than that.

You can, however, use the --disableGpu flag to run the model on your CPU. This may be slower, though.

TheGreyRaven (Author) commented Aug 22, 2024

It has 4 GB of video memory, so that could be the issue. However, even if I run humanify local --disableGpu <file>, I still get the exact same crash:

ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: NVIDIA RTX A500 Laptop GPU (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32
ggml_vulkan: Device memory allocation of size 314576896 failed.
ggml_vulkan: vk::Device::allocateMemory: ErrorOutOfDeviceMemory
ggml_gallocr_reserve_n: failed to allocate NVIDIA RTX A500 Laptop GPU buffer of size 314576896
[node-llama-cpp] llama_new_context_with_model: failed to allocate compute buffers
.......
.......
.......
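For scale, the buffer that ggml_vulkan fails to allocate in both logs is 314576896 bytes, which works out to roughly 300 MiB. A quick sketch of the arithmetic (the conclusion in the comment is my guess, not something stated in the logs):

```typescript
// Size of the GPU buffer that ggml_vulkan failed to allocate (from the log above).
const failedAllocBytes = 314576896;

// Convert to MiB for a sense of scale: ~300 MiB, far below the card's
// 4 GB of VRAM on its own. This suggests the combined footprint (model
// weights plus context buffers) is what exhausts the device, not this
// single allocation.
const failedAllocMiB = failedAllocBytes / 1024 ** 2;
console.log(failedAllocMiB.toFixed(1) + " MiB");
```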

jehna (Owner) commented Aug 24, 2024

Seems that you've found a bug! I'll create a patch for that. Thank you for sending good debug info.

jehna added the bug (Something isn't working) label Aug 24, 2024
jehna added a commit that referenced this issue Aug 24, 2024
This made the local inference try to use the GPU even when the user
provided the `--disableGpu` flag.

Fixes #50
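A minimal sketch of the class of fix the commit describes (hypothetical names; the actual humanify source and its node-llama-cpp wiring may differ): the parsed CLI flag has to be forwarded all the way to the backend selection, otherwise inference silently initializes the GPU backend anyway.

```typescript
// Hypothetical sketch, not the actual humanify code: a --disableGpu flag
// that is parsed but never forwarded means the Vulkan backend still loads.
type GpuSetting = "auto" | false;

function resolveGpuSetting(argv: readonly string[]): GpuSetting {
  // The resolved value must be passed on to the llama initialization;
  // the bug was that it effectively never reached it.
  return argv.includes("--disableGpu") ? false : "auto";
}

// The setting would then be handed to node-llama-cpp at startup, e.g.:
//   const llama = await getLlama({ gpu: resolveGpuSetting(process.argv) });

console.log(resolveGpuSetting(["local", "--disableGpu", "app.js"])); // false
console.log(resolveGpuSetting(["local", "app.js"])); // auto
```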
0xdevalias commented Aug 26, 2024

With the bugfix PR merged now, can this issue be closed?

TheGreyRaven (Author) commented
Glad to help out! For now I worked around it by running humanify through WSL with the NVIDIA drivers installed, and there everything works great!
