CANN Support Ascend310P to accelerate F32 and F16 LLM Model #10216
I am using the code from the ascend310PAdapt branch (https://github.com/leo-pony/llama.cpp/blob/ascend310PAdapt/ggml/src/ggml-cann/kernels/quantize_f16_q8_0.cpp) to run inference with the Qwen2.5-7b-fp16.guff model on a 310P, and the output is garbled. Is this model not supported yet, or is there some other cause?
The compile options should include -DSOC_TYPE, for example:
In ggml/src/ggml-cann/kernels/CMakeLists.txt, I have already made SOC_TYPE default to ascend310P3 when it is not set:
if (NOT SOC_TYPE)
set (SOC_TYPE "ascend310p3")
endif()
On 2024-11-11 14:41:45, "leo-pony" ***@***.***> wrote:
I am using the code from the ascend310PAdapt branch (https://github.com/leo-pony/llama.cpp/blob/ascend310PAdapt/ggml/src/ggml-cann/kernels/quantize_f16_q8_0.cpp) to run inference with the Qwen2.5-7b-fp16.guff model on a 310P, and the output is garbled. Is this model not supported yet, or is there some other cause?
The compile options should include -DSOC_TYPE, for example:
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=debug -DSOC_TYPE=Ascend310P3
cmake --build build --config debug
Hello! I get an error when compiling and installing from https://github.com/leo-pony/llama.cpp.
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Debug -DSOC_TYPE=Ascend310P3 -DCMAKE_C_COMPILER=/opt/gcc-11.2.0/bin/gcc -DCMAKE_CXX_COMPILER=/opt/gcc-11.2.0/bin/g++
cmake --build build --config debug
Building CXX object CMakeFiles/device_aic_obj.dir/home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_get_row_q4_0.cpp.o
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_get_row_q4_0.cpp:7:
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/ggml/src/ggml-cann/kernels/get_row_q4_0.cpp:1:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/kernel_operator.h:27:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/interface/kernel_operator_intf.h:48:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/interface/kernel_operator_vec_vconv_intf.h:28:
/usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/impl/dav_m200/kernel_operator_vec_vconv_impl.h:455:5: error: no matching function for call to 'CastIntrinsicsImpl'
CastIntrinsicsImpl(dst, src, roundMode, 1, repeatParams);
^~~~~~~~~~~~~~~~~~
/usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/inner_interface/inner_kernel_operator_vec_vconv_intf.cppm:131:5: note: in instantiation of function template specialization 'AscendC::CastImpl<half, AscendC::IntegerSubType<4, true>>' requested here
It seems -DSOC_TYPE=Ascend310P3 did not take effect. Please check that your code matches this PR. In this PR there is no Cast call for 310P3. If SOC_TYPE is set to Ascend310PX, the macro ASCEND_310P is defined.
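One way to confirm that the option took effect is to look for the `-- Compile for Ascend310P.` line, which the configure output quoted later in this thread shows for the PR branch. The snippet below is only a sketch under that assumption; `soc_type_took_effect` and `configure.log` are hypothetical names, not part of llama.cpp.

```shell
# Hypothetical helper: reports whether a saved CMake configure log shows
# that the Ascend310P path was selected.
soc_type_took_effect() {
    log_file="$1"
    if grep -q -- "-- Compile for Ascend310P" "$log_file"; then
        echo "SOC_TYPE applied: Ascend310P build selected"
    else
        echo "SOC_TYPE not applied: no Ascend310P line in configure log"
        return 1
    fi
}

# Usage (assumed workflow):
#   cmake -B build -DGGML_CANN=on -DSOC_TYPE=Ascend310P3 2>&1 | tee configure.log
#   soc_type_took_effect configure.log
```

If the line is missing, the checked-out branch is probably not the one from this PR, or a cached configuration is being reused.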
After switching to the ascend310PAdapt branch and rerunning the build commands, a new error appeared. The key error messages are:
[ 51%] Built target test-model-load-cancel
Consolidate compiler generated dependencies of target test-autorelease
[ 51%] Linking CXX executable ../bin/test-autorelease
[ 52%] Built target test-autorelease
Consolidate compiler generated dependencies of target test-json-schema-to-grammar
[ 53%] Linking CXX executable ../bin/test-json-schema-to-grammar
[ 54%] Built target test-json-schema-to-grammar
Consolidate compiler generated dependencies of target test-c
[ 55%] Linking C executable ../bin/test-c
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::random_device::_M_getentropy() const@GLIBCXX_3.4.25'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_release()@CXXABI_1.3.13'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_addref()@CXXABI_1.3.13'
collect2: error: ld returned 1
gmake[2]: *** [tests/CMakeFiles/test-c.dir/build.make:99:bin/test-c] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:2490:tests/CMakeFiles/test-c.dir/all] Error 2
gmake: *** [Makefile:146:all] Error 2
It seems there are stale files in the build tree. Deleting the build directory and retrying may resolve this problem.
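A clean retry along these lines could look like the sketch below. It assumes the same configure flags used earlier in this thread; the function name `clean_rebuild` is made up for illustration.

```shell
# Hypothetical helper: removes the build directory, then configures and
# builds from scratch with the flags quoted in this thread.
clean_rebuild() {
    build_dir="${1:-build}"   # defaults to ./build, so it is never empty
    rm -rf -- "$build_dir"
    cmake -B "$build_dir" -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Debug -DSOC_TYPE=Ascend310P3 &&
    cmake --build "$build_dir" --config debug
}

# Usage, from the llama.cpp source root:
#   clean_rebuild build
```

Removing the whole directory also discards CMakeCache.txt, so a previously cached compiler or SOC_TYPE choice cannot leak into the new configuration.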
The same problem occurs after rebuilding. Here are the complete commands I ran:
git clone https://github.com/leo-pony/llama.cpp.git
cd llama.cpp
git checkout ascend310PAdapt
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Debug -DSOC_TYPE=Ascend310P3 -DCMAKE_C_COMPILER=/opt/gcc-11.2.0/bin/gcc -DCMAKE_CXX_COMPILER=/opt/gcc-11.2.0/bin/g++
cmake --build build --config debug
After running cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=Debug -DSOC_TYPE=Ascend310P3 -DCMAKE_C_COMPILER=/opt/gcc-11.2.0/bin/gcc -DCMAKE_CXX_COMPILER=/opt/gcc-11.2.0/bin/g++, the output was:
-- The C compiler identification is GNU 11.2.0
-- The CXX compiler identification is GNU 11.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/gcc-11.2.0/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/gcc-11.2.0/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.27.0")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- OpenMP found
-- Using llamafile
-- Using AMX
-- CANN: updated CANN_INSTALL_DIR from ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest
-- Compile for Ascend310P.
-- CANN: CANN_INCLUDE_DIRS = /usr/local/Ascend/ascend-toolkit/latest/include;/usr/local/Ascend/ascend-toolkit/latest/include/aclnn;/usr/local/Ascend/ascend-toolkit/latest/acllib/include
-- CANN: CANN_LIBRARIES = ascendcl;nnopbase;opapi;acl_op_compiler;ascendc_kernels
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done
-- Generating done
-- Build files have been written to: /home/xxx/llama.cpp/build
Running cmake --build build --config debug then failed. The main errors are:
[ 48%] Built target test-backend-ops
[ 48%] Building CXX object tests/CMakeFiles/test-rope.dir/test-rope.cpp.o
[ 48%] Building CXX object tests/CMakeFiles/test-rope.dir/get-model.cpp.o
[ 49%] Linking CXX executable ../bin/test-rope
[ 49%] Built target test-rope
[ 50%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/test-model-load-cancel.cpp.o
[ 50%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/get-model.cpp.o
[ 51%] Linking CXX executable ../bin/test-model-load-cancel
[ 51%] Built target test-model-load-cancel
[ 51%] Building CXX object tests/CMakeFiles/test-autorelease.dir/test-autorelease.cpp.o
[ 52%] Building CXX object tests/CMakeFiles/test-autorelease.dir/get-model.cpp.o
[ 52%] Linking CXX executable ../bin/test-autorelease
[ 52%] Built target test-autorelease
[ 53%] Building CXX object tests/CMakeFiles/test-json-schema-to-grammar.dir/test-json-schema-to-grammar.cpp.o
[ 53%] Building CXX object tests/CMakeFiles/test-json-schema-to-grammar.dir/get-model.cpp.o
[ 54%] Linking CXX executable ../bin/test-json-schema-to-grammar
[ 54%] Built target test-json-schema-to-grammar
[ 54%] Building C object tests/CMakeFiles/test-c.dir/test-c.c.o
[ 55%] Linking C executable ../bin/test-c
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::random_device::_M_getentropy() const@GLIBCXX_3.4.25'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_release()@CXXABI_1.3.13'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::data()@GLIBCXX_3.4.26'
/usr/bin/ld: ../src/libllama.so: undefined reference to `std::__exception_ptr::exception_ptr::_M_addref()@CXXABI_1.3.13'
collect2: error: ld returned 1
gmake[2]: *** [tests/CMakeFiles/test-c.dir/build.make:99:bin/test-c] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:2490:tests/CMakeFiles/test-c.dir/all] Error 2
gmake: *** [Makefile:146:all] Error 2
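The undefined references above all carry version tags such as GLIBCXX_3.4.29. That suggests (this is a guess, not something confirmed in the thread) that the libstdc++ found at link time is older than the one libllama.so was compiled against with GCC 11.2.0. Below is a small sketch for checking that; the helper names and the library paths in the usage comments are assumptions.

```shell
# Hypothetical helpers for inspecting libstdc++ version tags.

# Reads linker output on stdin and prints each distinct version tag
# (GLIBCXX_x.y.z or CXXABI_x.y.z) that an undefined reference asks for.
required_version_tags() {
    grep -oE '@(GLIBCXX|CXXABI)_[0-9]+(\.[0-9]+)*' | tr -d '@' | sort -u
}

# Reads text on stdin (for example the output of `strings` on a
# libstdc++.so.6) and succeeds only if the given tag is a whole line.
provides_version_tag() {
    grep -qxF -- "$1"
}

# Usage (paths are assumptions; adjust to the actual system):
#   cmake --build build 2>&1 | required_version_tags
#   strings /usr/lib64/libstdc++.so.6 | provides_version_tag GLIBCXX_3.4.29 && echo present
#   strings /opt/gcc-11.2.0/lib64/libstdc++.so.6 | provides_version_tag GLIBCXX_3.4.29 && echo present
```

If a tag is missing from the system library but present in the one shipped with GCC 11.2.0, making the newer library visible to the linker and the runtime loader is one thing to try.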
Please check whether you can build basic CANN C++ applications:
b0700ae to 7f5efeb
7f5efeb to 6327369
After I fixed that and could build the samples normally, I pulled the latest 310PAdapt branch and hit a new error during compilation:
-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/ggml/src/ggml-cann/kernels/ascendc_kernels_aic_device-prefix/src/ascendc_kernels_aic_device-build
[ 15%] Performing build step for 'ascendc_kernels_aic_device'
[ 12%] Building CXX object CMakeFiles/device_aic_obj.dir/home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_dup.cpp.o
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_dup.cpp:10:
In file included from /home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/ggml/src/ggml-cann/kernels/dup.cpp:1:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/kernel_operator.h:28:
In file included from /usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/inner_interface/inner_kernel_operator_intf.h:25:
/usr/local/Ascend/ascend-toolkit/latest/tools/tikcpp/tikcfw/inner_interface/inner_kernel_operator_dump_tensor_intf.cppm:138:24: error: (8th) const argument is in address space __gm__, but parameter must be in Local Memory
__aicore__ inline void AssertImpl(__gm__ const char* fmt, Args&&... args)
^
1 error generated.
gmake[5]: *** [CMakeFiles/device_aic_obj.dir/build.make:76:CMakeFiles/device_aic_obj.dir/home/cqzstl/work/dbw/pr_llama_cpp/llama.cpp/build/auto_gen/ascendc_kernels/auto_gen_dup.cpp.o] Error 1
gmake[4]: *** [CMakeFiles/Makefile2:85:CMakeFiles/device_aic_obj.dir/all] Error 2
gmake[3]: *** [Makefile:91:all] Error 2
gmake[2]: *** [ggml/src/ggml-cann/kernels/CMakeFiles/ascendc_kernels_aic_device.dir/build.make:86:ggml/src/ggml-cann/kernels/ascendc_kernels_aic_device-prefix/src/ascendc_kernels_aic_device-stamp/ascendc_kernels_aic_device-build] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:1860:ggml/src/ggml-cann/kernels/CMakeFiles/ascendc_kernels_aic_device.dir/all] Error 2
gmake: *** [Makefile:146:all] Error 2
We don't have a Kylin OS + x86 environment, so we cannot reproduce your problem. Please check whether the 310P is fully supported in your environment.
I think this PR supports both of them.
After encountering the above issues, I consulted Huawei staff. Their response was that the biggest difference between the 310I Pro (single-core 310p) and the 310I Duo (dual-core 310p) is that the former cannot perform LLM inference. If it is convenient, could you please verify this PR on the 310I Pro (single-core 310p)? Thank you very much!
Sorry, we only have the 310I Duo at the moment.
Could you share the contact information of the person who told you that the 310I Pro (single-core 310p) does not support LLM inference? We would like to know more details.
…0216)
* CANN Support Ascend310P to accelerate F32 and F16 Model
* Add compile option soc type macro ASCEND_310P to ggml-cann lib
* Remove unused code
* Remove the ascend soc_type hard code compile option in CMakelist.txt
CANN: support Ascend310P to accelerate F32/F16 model inference. The corresponding issue is #10160. Q8 and Q4 support will be implemented next.
Functionality is normal: