
Bug: A crash occurs when llama-bench is running on multiple CANN devices. #9250

Closed
@znzjugod

Description

What happened?

When I run llama.cpp with Llama3-8B-Chinese-Chat-f16-v2_1.gguf, it crashes.
Here is my command:
./llama-cli -m /home/c00662745/llama3/llama3/llama3_chinese_gguf/Llama3-8B-Chinese-Chat-f16-v2_1.gguf -p "Building a website can be done in 10 simple steps:" -n 400 -e -ngl 33 -sm layer

Here is the error:
```
CANN error: EE9999: Inner Error!
EE9999: [PID: 2750884] 2024-08-30-16:20:38.196.490 Stream destroy failed, stream is not in current ctx, stream_id=2.[FUNC:StreamDestroy][FILE:api_impl.cc][LINE:1032]
TraceBack (most recent call last):
rtStreamDestroy execute failed, reason=[stream not in current context][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
destroy stream failed, runtime result = 107003[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]

current device: 1, in function ~ggml_backend_cann_context at /home/zn/new-llama/llama.cpp/ggml/src/ggml-cann/common.h:235
aclrtDestroyStream(streams[i])
/home/zn/new-llama/llama.cpp/ggml/src/ggml-cann.cpp:123: CANN error
[New LWP 2750924]
[New LWP 2750937]
[New LWP 2753277]
[New LWP 2753281]
[New LWP 2753615]
[New LWP 2753616]
[New LWP 2753623]
[New LWP 2753626]
[New LWP 2753900]
[New LWP 2753901]
[New LWP 2757030]
[New LWP 2757031]
[New LWP 2757032]
[New LWP 2757033]
[New LWP 2757034]
[New LWP 2757035]
[New LWP 2757036]
[New LWP 2757037]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".
0x0000ffff8e7edc00 in wait4 () from /usr/lib64/libc.so.6
#0 0x0000ffff8e7edc00 in wait4 () from /usr/lib64/libc.so.6
#1 0x0000ffff8ec019f0 in ggml_print_backtrace () at /home/zn/new-llama/llama.cpp/ggml/src/ggml.c:253
253 waitpid(pid, &wstatus, 0);
#2 0x0000ffff8ec01b20 in ggml_abort (file=0xffff8eccfd58 "/home/zn/new-llama/llama.cpp/ggml/src/ggml-cann.cpp", line=123, fmt=0xffff8eccfd48 "CANN error") at /home/zn/new-llama/llama.cpp/ggml/src/ggml.c:280
280 ggml_print_backtrace();
#3 0x0000ffff8ec94ab8 in ggml_cann_error (stmt=0xffff8eccfcb0 "aclrtDestroyStream(streams[i])", func=0xffff8eccfc70 "~ggml_backend_cann_context", file=0xffff8eccfc18 "/home/zn/new-llama/llama.cpp/ggml/src/ggml-cann/common.h", line=235, msg=0x3bcc0668 "EE9999: Inner Error!\nEE9999: [PID: 2750884] 2024-08-30-16:20:38.196.490 Stream destroy failed, stream is not in current ctx, stream_id=2.[FUNC:StreamDestroy][FILE:api_impl.cc][LINE:1032]\n Trace"...) at /home/zn/new-llama/llama.cpp/ggml/src/ggml-cann.cpp:123
warning: Source file is more recent than executable.
123 GGML_ABORT("CANN error");
#4 0x0000ffff8ec97b74 in ggml_backend_cann_context::~ggml_backend_cann_context (this=0x33af8680, __in_chrg=) at /home/zn/new-llama/llama.cpp/ggml/src/ggml-cann/common.h:235
235 ACL_CHECK(aclrtDestroyStream(streams[i]));
#5 0x0000ffff8ec964ac in ggml_backend_cann_free (backend=0x2a8d71a0) at /home/zn/new-llama/llama.cpp/ggml/src/ggml-cann.cpp:1412
1412 delete cann_ctx;
#6 0x0000ffff8ec49394 in ggml_backend_free (backend=0x2a8d71a0) at /home/zn/new-llama/llama.cpp/ggml/src/ggml-backend.c:180
180 backend->iface.free(backend);
#7 0x0000ffff8f18a30c in llama_context::~llama_context (this=0x29fc9fc0, __in_chrg=) at /home/zn/new-llama/llama.cpp/src/llama.cpp:3069
3069 ggml_backend_free(backend);
#8 0x0000ffff8f16b744 in llama_free (ctx=0x29fc9fc0) at /home/zn/new-llama/llama.cpp/src/llama.cpp:17936
17936 delete ctx;
#9 0x0000000000476d48 in main (argc=12, argv=0xfffffc7fe828) at /home/zn/new-llama/llama.cpp/examples/main/main.cpp:1020
1020 llama_free(ctx);
[Inferior 1 (process 2750884) detached]
Aborted (core dumped)
```

It seems that during the final stream teardown, CANN does not have the right device context bound, so destroying a stream that was created on another device fails.
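For illustration, a minimal sketch of what such a fix could look like in the destructor (`aclrtSetDevice` and `aclrtDestroyStream` are real ACL calls; `ACL_CHECK`, `streams[]`, and `device` match the names in the backtrace above, while the loop bound `GGML_CANN_MAX_STREAMS` is an assumption about the stream-array size, not the actual patch):

```cpp
// Sketch only: re-bind this context's device before destroying its streams,
// so aclrtDestroyStream runs against the context the streams were created in.
// Without this, freeing device 1's streams while device 0's context is
// current fails with "stream is not in current ctx" (EE9999 / 107003).
~ggml_backend_cann_context() {
    ACL_CHECK(aclrtSetDevice(device));
    for (int i = 0; i < GGML_CANN_MAX_STREAMS; i++) {
        if (streams[i] != nullptr) {
            ACL_CHECK(aclrtDestroyStream(streams[i]));
        }
    }
}
```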

Name and Version

(base) [root@localhost bin]# ./llama-cli --version
version: 3645 (7ea8d80)
built with cc (GCC) 10.3.1 for aarch64-linux-gnu

What operating system are you seeing the problem on?

No response

Relevant log output

No response

Metadata

Labels

Ascend NPU (issues specific to Ascend NPUs)
bug-unconfirmed
critical severity (used to report critical severity bugs in llama.cpp, e.g. crashing, corruption, data loss)
stale
