
OSError: [WinError -1073741795] Windows Error 0xc000001d #562

Open
anujcb opened this issue Aug 3, 2023 · 4 comments
Labels
duplicate (This issue or pull request already exists) · windows (A Windoze-specific issue)

Comments

anujcb commented Aug 3, 2023

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [Yes] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [Yes] I carefully followed the README.md.
  • [Yes] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [Yes] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

Running the command
python -m llama_cpp.server --model ./../../llama.cpp/models/v1/7B/ggml-model-q4_0.bin --n_gpu_layers 40 --n_threads 4 --n_ctx 512

should load the model and launch the web server successfully.
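
Once loaded, the server exposes an OpenAI-compatible REST API. A minimal smoke test, sketched here assuming the default bind of localhost:8000 (adjust if --host/--port were passed), looks like this:

    # Minimal smoke test for the llama-cpp-python server; assumes the
    # default localhost:8000 bind. Requires `pip install requests`.
    import requests

    resp = requests.post(
        "http://localhost:8000/v1/completions",
        json={"prompt": "Hello", "max_tokens": 16},
    )
    print(resp.json())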

Current Behavior

ggml_init_cublas: found 1 CUDA devices:
Device 0: NVIDIA GeForce GTX 1070, compute capability 6.1
app.py settings.model ./../../llama.cpp/models/v1/7B/ggml-model-q4_0.bin
app.py settings.n_gpu_layers 40
app.py settings.tensor_split None
app.py settings.rope_freq_base 10000.0
app.py settings.rope_freq_scale 1.0
app.py settings.seed 1337
app.py settings.f16_kv True
app.py settings.use_mlock True
app.py settings.embedding True
app.py settings.logits_all True
app.py settings.n_threads 4
app.py settings.n_batch 512
app.py settings.n_ctx 512
app.py settings.last_n_tokens_size 64
app.py settings.vocab_only False
app.py settings.verbose True
app.py settings.n_gqa None
app.py settings.rms_norm_eps None
inside llama.py {'./../../llama.cpp/models/v1/7B/ggml-model-q4_0.bin'}
inside llama.py before llama_cpp.llama_load_model_from_file {b'./../../llama.cpp/models/v1/7B/ggml-model-q4_0.bin'}
inside llama_cpp.py llama_load_model_from_file lib {<CDLL 'F:\work\scrapalot-research-assistant\llama-cpp-python\llama_cpp\llama.dll', handle 7ffe62220000 at 0x1ce1a36db40>}
inside llama_cpp.py llama_load_model_from_file path_model {b'./../../llama.cpp/models/v1/7B/ggml-model-q4_0.bin'}
inside llama_cpp.py llama_load_model_from_file seed {1337}
inside llama_cpp.py llama_load_model_from_file n_ctx {512}
inside llama_cpp.py llama_load_model_from_file n_batch {512}
inside llama_cpp.py llama_load_model_from_file n_gqa {1}
inside llama_cpp.py llama_load_model_from_file rms_norm_eps {4.999999873689376e-06}
inside llama_cpp.py llama_load_model_from_file n_gpu_layers {40}
inside llama_cpp.py llama_load_model_from_file main_gpu {0}
inside llama_cpp.py llama_load_model_from_file tensor_split - no hash
inside llama_cpp.py llama_load_model_from_file rope_freq_base {10000.0}
inside llama_cpp.py llama_load_model_from_file rope_freq_scale {1.0}
inside llama_cpp.py llama_load_model_from_file progress_callback - no hash
inside llama_cpp.py llama_load_model_from_file progress_callback_user_data {None}
inside llama_cpp.py llama_load_model_from_file low_vram {False}
inside llama_cpp.py llama_load_model_from_file f16_kv {True}
inside llama_cpp.py llama_load_model_from_file logits_all {True}
inside llama_cpp.py llama_load_model_from_file vocab_only {False}
inside llama_cpp.py llama_load_model_from_file use_mmap {True}
inside llama_cpp.py llama_load_model_from_file use_mlock {True}
inside llama_cpp.py llama_load_model_from_file embedding {True}
llama.cpp: loading model from ./../../llama.cpp/models/v1/7B/ggml-model-q4_0.bin
(magic, version) combination: 67676a74, 00000003
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_head_kv = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: n_gqa = 1
llama_model_load_internal: rnorm_eps = 5.0e-06
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: freq_base = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.08 MB
Traceback (most recent call last):
File "F:\ProgramData\Anaconda3\envs\scrapalot-research-assistant\lib\runpy.py", line 195, in _run_module_as_main
return _run_code(code, main_globals, None,
File "F:\ProgramData\Anaconda3\envs\scrapalot-research-assistant\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "F:\work\scrapalot-research-assistant\llama-cpp-python\llama_cpp\server_main
.py", line 46, in
app = create_app(settings=settings)
File "F:\work\scrapalot-research-assistant\llama-cpp-python\llama_cpp\server\app.py", line 333, in create_app
llama = llama_cpp.Llama(
File "F:\work\scrapalot-research-assistant\llama-cpp-python\llama_cpp\llama.py", line 309, in init
self.model = llama_cpp.llama_load_model_from_file(
File "F:\work\scrapalot-research-assistant\llama-cpp-python\llama_cpp\llama_cpp.py", line 449, in llama_load_model_from_file
return _lib.llama_load_model_from_file(path_model, params)
OSError: [WinError -1073741795] Windows Error 0xc000001d
inside llama.py def __del__(self) {<llama_cpp.llama.Llama object at 0x000001CE5987B790>}
Exception ignored in: <function Llama.__del__ at 0x000001CE579BE830>
Traceback (most recent call last):
File "F:\work\scrapalot-research-assistant\llama-cpp-python\llama_cpp\llama.py", line 1517, in __del__
if self.model is not None:
AttributeError: 'Llama' object has no attribute 'model'
(scrapalot-research-assistant) PS F:\work\scrapalot-research-assistant\llama-cpp-python>
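
For reference, WinError -1073741795 is the unsigned NTSTATUS code 0xC000001D, STATUS_ILLEGAL_INSTRUCTION: the DLL executed a CPU instruction the processor does not support. That fits the hardware listed below, since the i7-4930K (Ivy Bridge-E) supports AVX but not AVX2, so a llama.dll built with AVX2 enabled crashes exactly like this. A quick check of the code mapping, plus a hypothetical guard for the secondary AttributeError (which is only cleanup noise: __init__ raised before self.model was ever assigned):

    # Decode the signed WinError into the unsigned NTSTATUS it came from.
    print(hex(-1073741795 & 0xFFFFFFFF))  # -> 0xc000001d (STATUS_ILLEGAL_INSTRUCTION)

    # Hypothetical guard for Llama.__del__ (llama.py line 1517 in the
    # traceback): tolerate a partially constructed object instead of raising.
    def __del__(self):
        if getattr(self, "model", None) is not None:
            ...  # free the model here, as the original code does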

Environment and Context

Processor Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz, 3701 MHz, 6 Core(s), 12 Logical Processor(s)
OS Name Microsoft Windows 10 Pro
Nvidia toolkit 11.8
VS 22 community edition
Python 3.11.4
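
One way to confirm the instruction-set theory is to inspect the CPU flags; this sketch uses the third-party py-cpuinfo package (pip install py-cpuinfo), which is an assumption here, not part of this project:

    # Check whether the host CPU advertises AVX2; an i7-4930K should print False.
    import cpuinfo  # pip install py-cpuinfo

    flags = cpuinfo.get_cpu_info().get("flags", [])
    print("avx:", "avx" in flags, "| avx2:", "avx2" in flags)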

PS F:\work\llama.cpp> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

PS F:\work\llama.cpp\build> cmake .. -DLLAMA_CUBLAS=ON
-- Selecting Windows SDK version 10.0.22000.0 to target Windows 10.0.19045.
-- cuBLAS found
-- Using CUDA architectures: 52;61
-- CMAKE_SYSTEM_PROCESSOR: AMD64
-- x86 detected
-- Configuring done (4.8s)
-- Generating done (0.6s)
-- Build files have been written to: F:/work/llama.cpp

PS F:\work\llama.cpp> cmake --build . --config Release
MSBuild version 17.6.3+07e294721 for .NET Framework

Failure Information (for bugs)

Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.

Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

  1. download and run abetlen/llama-cpp-python
  2. execute python -m llama_cpp.server --model ./../../llama.cpp/models/v1/7B/ggml-model-q4_0.bin --n_gpu_layers 40 --n_threads 4 --n_ctx 512
anujcb (Author) commented Aug 3, 2023

Figured out the fix:

  1. Build the DLL from ggerganov/llama.cpp using cmake .. -DBUILD_SHARED_LIBS=ON
  2. Copy the DLL into the llama_cpp package directory (the CDLL path in the log above shows where it is loaded from) (screenshot)
  3. Comment out self.params.use_mlock = use_mlock in llama.py (screenshot)
  4. Fix the pydantic warning by replacing model_alias with alias (screenshot); a sketch of steps 3 and 4 follows below
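
A rough sketch of what steps 3 and 4 amount to in code; exact locations vary by version, and the field shown for step 4 is illustrative, not the project's verbatim source:

    # Step 3 -- llama_cpp/llama.py, inside Llama.__init__: skip setting
    # the mlock flag, which participates in the crash path on this build.
    # self.params.use_mlock = use_mlock  # commented out

    # Step 4 -- llama_cpp/server/app.py: pydantic v2 treats field names
    # beginning with "model_" as protected, so rename model_alias to alias
    # (illustrative sketch of the settings model):
    from typing import Optional
    from pydantic import BaseModel, Field

    class Settings(BaseModel):
        alias: Optional[str] = Field(
            default=None, description="Display name for the model"
        )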

@gjmulder gjmulder changed the title from "AttributeError: 'Llama' object has no attribute 'model'" to "OSError: [WinError -1073741795] Windows Error 0xc000001d" Aug 4, 2023
gjmulder (Contributor) commented Aug 4, 2023

#53

@gjmulder gjmulder added the duplicate and windows labels Aug 4, 2023
ahlwjnj commented Aug 5, 2023

(quoting anujcb's fix above)

Could you please share the detailed steps for generating the llama.dll file? I can't install make, but I do have cmake.

anujcb (Author) commented Aug 6, 2023

(quoting ahlwjnj's question above)
Run cmake like below:

cmake .. -DBUILD_SHARED_LIBS=ON
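
For completeness, the end-to-end sequence with only cmake and Visual Studio installed would look roughly like this (a sketch: the cuBLAS flag mirrors the build logs above, and the output path can vary by llama.cpp version):

    git clone https://github.com/ggerganov/llama.cpp
    cd llama.cpp
    mkdir build
    cd build
    cmake .. -DBUILD_SHARED_LIBS=ON -DLLAMA_CUBLAS=ON
    cmake --build . --config Release
    # llama.dll is typically emitted under build\bin\Release; copy it over
    # the one in llama-cpp-python\llama_cpp\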
