
Segfault when compiling with make LLAMA_CUBLAS=1 #3054

Closed
4 tasks done
atopheim opened this issue Sep 7, 2023 · 12 comments
Comments

atopheim commented Sep 7, 2023

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

I am trying to run NVIDIA-accelerated model inference, since I have a laptop with an RTX 3080 Ti. So I built using `make LLAMA_CUBLAS=1`.

Current Behavior

make LLAMA_CUBLAS=1
gdb ./main
(No debugging symbols found in ./main)
(gdb) set args -m ./models/34B/codellama-34b.Q5_K_M.gguf -p "The following are three of my mostly used scripts for automating parts of my day:"
(gdb) run
Starting program: /home/torbjorn/Documents/Github/atopheim/llama.cpp/main -m ./models/34B/codellama-34b.Q5_K_M.gguf -p "The following are three of my mostly used scripts for automating parts of my day:"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007fffceb5d661 in ?? () from /usr/lib/libc.so.6
(gdb) backtrace
#0  0x00007fffceb5d661 in ?? () from /usr/lib/libc.so.6
#1  0x00007fffcef5a8a4 in std::char_traits<char>::copy (__n=<optimized out>, __s2=0x7fffffffdd0e "./models/34B/codellama-34b.Q5_K_M.gguf", __s1=0x0) at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:445
#2  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_copy (__n=<optimized out>, __s=0x7fffffffdd0e "./models/34B/codellama-34b.Q5_K_M.gguf", __d=0x0)
    at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:420
#3  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_copy (__n=<optimized out>, __s=0x7fffffffdd0e "./models/34B/codellama-34b.Q5_K_M.gguf", __d=0x0)
    at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:415
#4  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace (this=0x7fffffffc1b0, __pos=<optimized out>, __len1=<optimized out>, __s=0x7fffffffdd0e "./models/34B/codellama-34b.Q5_K_M.gguf",
    __len2=<optimized out>) at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:537
#5  0x00005555555f892c in gpt_params_parse(int, char**, gpt_params&) ()
#6  0x0000555555564b0b in main ()

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

  • Physical (or virtual) hardware you are using, e.g. for Linux:

$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 5900HX with Radeon Graphics
CPU family: 25
Model: 80
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 0
Frequency boost: enabled
CPU(s) scaling MHz: 70%
CPU max MHz: 4888,7690
CPU min MHz: 1200,0000
BogoMIPS: 6590,14
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 256 KiB (8 instances)
L1i: 256 KiB (8 instances)
L2: 4 MiB (8 instances)
L3: 16 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-15

$ uname -a
6.1.44-1-MANJARO #1 SMP PREEMPT_DYNAMIC Wed Aug 9 09:02:26 UTC 2023 x86_64 GNU/Linux

  • SDK version, e.g. for Linux:
$ python3 --version
Python 3.11.3
$ make --version
GNU Make 4.4.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
$ g++ --version
g++ (GCC) 12.3.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
atopheim commented Sep 7, 2023

I had better luck with cmake

JohannesGaessler commented Sep 7, 2023

Compiling with `LLAMA_DEBUG=1` adds debugging symbols and will allow GDB to determine exactly which lines are causing the segfault; that would be useful information for the devs.

@KerfuffleV2

What JG said.

This is a pretty weird error; it looks like it crashed while parsing the command-line arguments, but there's nothing unusual about them. (I also can't reproduce this with identical arguments.) If you build without `LLAMA_CUBLAS=1`, do you still get the error?

atopheim commented Sep 7, 2023

Building with just `make LLAMA_DEBUG=1` works.

The following is trying the same with LLAMA_CUBLAS=1 as well, then running gdb on that.

$ rm main
$ make LLAMA_CUBLAS=1 LLAMA_DEBUG=1

I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: unknown
I UNAME_M: x86_64
I CFLAGS: -I. -Icommon -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -O3 -std=c11 -fPIC -O0 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -pthread -march=native -mtune=native
I CXXFLAGS: -I. -Icommon -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -O3 -std=c++11 -fPIC -O0 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -march=native -mtune=native
I LDFLAGS: -g -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -L/usr/local/cuda/lib64 -L/opt/cuda/lib64 -L/targets/x86_64-linux/lib
I CC: cc (GCC) 13.2.1 20230801
I CXX: g++ (GCC) 12.3.0

g++ -I. -Icommon -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -O3 -std=c++11 -fPIC -O0 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -march=native -mtune=native examples/main/main.cpp ggml.o llama.o common.o console.o grammar-parser.o k_quants.o ggml-cuda.o ggml-alloc.o -o main -g -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -L/usr/local/cuda/lib64 -L/opt/cuda/lib64 -L/targets/x86_64-linux/lib

==== Run ./main -h for help. ====

$ gdb ./main
(gdb) set args -m ./models/34B/codellama-34b.Q5_K_M.gguf -p "The following are three of my mostly used scripts for automating parts of my day:"
(gdb) run
Starting program: /home/torbjorn/Documents/Github/atopheim/llama.cpp/main -m ./models/34B/codellama-34b.Q5_K_M.gguf -p "The following are three of my mostly used scripts for automating parts of my day:"

This GDB supports auto-downloading debuginfo from the following URLs:
https://debuginfod.archlinux.org
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for /opt/cuda/lib64/libcublas.so.12
Downloading separate debug info for /opt/cuda/lib64/libcudart.so.12
Downloading separate debug info for /opt/cuda/lib64/libcublasLt.so.12
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Downloading separate debug info for /usr/lib/librt.so.1
Downloading separate debug info for /usr/lib/libpthread.so.0
Downloading separate debug info for /usr/lib/libdl.so.2
Downloading separate debug info for /usr/lib/libcuda.so.1

Program received signal SIGSEGV, Segmentation fault.
0x00007fffceb5d661 in ?? () from /usr/lib/libc.so.6
(gdb) backtrace
#0  0x00007fffceb5d661 in ?? () from /usr/lib/libc.so.6
#1  0x00007fffcef5a8a4 in std::char_traits<char>::copy (__n=<optimized out>, __s2=0x7fffffffdc49 "./models/34B/codellama-34b.Q5_K_M.gguf", __s1=0x0) at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:445
#2  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_copy (__n=<optimized out>, __s=0x7fffffffdc49 "./models/34B/codellama-34b.Q5_K_M.gguf", __d=0x0)
    at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:420
#3  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_copy (__n=<optimized out>, __s=0x7fffffffdc49 "./models/34B/codellama-34b.Q5_K_M.gguf", __d=0x0)
    at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:415
#4  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace (this=0x7fffffffc0c0, __pos=<optimized out>, __len1=<optimized out>, __s=0x7fffffffdc49 "./models/34B/codellama-34b.Q5_K_M.gguf", __len2=<optimized out>)
    at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:537
#5  0x0000555555602c9c in gpt_params_parse(int, char**, gpt_params&) ()
#6  0x00005555555688e3 in main (argc=5, argv=0x7fffffffd778) at examples/main/main.cpp:112


KerfuffleV2 commented Sep 7, 2023

I think you need to run `make clean`, then try `make LLAMA_CUBLAS=1 LLAMA_DEBUG=1` again. Also, please mention exactly which commit/release you're on.

staviq commented Sep 7, 2023

@atopheim Does it still crash if you simply run ./main with no arguments ? ( #2922 )

cebtenzzre commented Sep 7, 2023

If you build with this command, it will provide a detailed explanation for segfaults:

LDFLAGS='-fsanitize=address' CFLAGS="$LDFLAGS -fno-omit-frame-pointer" CXXFLAGS=$CFLAGS make LLAMA_DEBUG=1 LLAMA_CUBLAS=1

I would also be interested to know if this occurs with a smaller model like LLaMA v2 7b that would be easier to test.

atopheim commented Sep 7, 2023

I have verified that running `make clean` and rebuilding actually worked.

atopheim commented Sep 7, 2023

While I had the issue, it was also occurring with 7B models.

@cebtenzzre closed this as not planned on Sep 7, 2023.
atopheim commented Sep 7, 2023

I never tried running it without arguments, @staviq

@cebtenzzre

Could you confirm that building with `make clean && make LLAMA_CUBLAS=1` (without `LLAMA_DEBUG`) also works?


JamyDon commented Sep 17, 2023

First I built with plain `make` and found that it would only run on my CPU. So I went on to build with `make LLAMA_CUBLAS=1`, and then got a segfault when running the model. Following this issue, I tried `make clean && make LLAMA_CUBLAS=1`, and it is now working pretty well.

So the segfault might be caused by leftovers of the previous build, since after a simple `make clean` it never appears again.
