Hi, I am working on updating SD.cpp to use the latest ggml. It fails in `ggml_cuda_mul_mat_batched_cublas` with the following error:
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
......
CUDA error: the function failed to launch on the GPU
current device: 0, in function ggml_cuda_mul_mat_batched_cublas at /home/isodden/work/C/stable-diffusion.cpp/ggml/src/ggml-cuda.cu:1881
cublasGemmBatchedEx(ctx.cublas_handle(), CUBLAS_OP_T, CUBLAS_OP_N, ne01, ne11, ne10, alpha, (const void **) (ptrs_src.get() + 0*ne23), CUDA_R_16F, nb01/nb00, (const void **) (ptrs_src.get() + 1*ne23), CUDA_R_16F, nb11/nb10, beta, ( void **) (ptrs_dst.get() + 0*ne23), cu_data_type, ne01, ne23, cu_compute_type, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
/home/isodden/work/C/stable-diffusion.cpp/ggml/src/ggml-cuda.cu:101: CUDA error
The two tensors are:
src0: [2048, 8192, 1, 1] type F16
src1: [2048, 2, 4, 1] type F32
After switching to the CPU backend, it runs fine and produces reasonable results.
I am wondering whether the CUDA backend's matrix multiplication has an issue. In particular, `ggml_cuda_mul_mat` chooses among multiple code paths based on build options and GPU architecture. In this case, `ggml_cuda_mul_mat_batched_cublas` was picked.
I am seeking a method to do more tests. How can I save a ggml_tensor and reload it somewhere else?
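To make the question concrete, here is a rough sketch of the kind of dump and reload I have in mind. It is not ggml API: `TensorDump`, `save_tensor`, and `load_tensor` are hypothetical names, and the file layout (shape, type tag, byte count, raw bytes, all in host byte order) is just an assumption for illustration. With a real `ggml_tensor`, I assume the bytes would first have to be copied out of the backend buffer into host memory.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical container, not a ggml type: holds just enough of a tensor
// (shape, type tag, raw bytes) to rebuild it in another program.
struct TensorDump {
    int64_t ne[4];              // elements per dimension, e.g. {2048, 2, 4, 1}
    int32_t type;               // numeric type tag (assumed: the type enum value)
    std::vector<uint8_t> data;  // raw tensor bytes, already copied to host memory
};

// Writes ne, type, byte count, then the raw bytes. Returns false on any I/O error.
bool save_tensor(const std::string & path, const TensorDump & t) {
    FILE * f = std::fopen(path.c_str(), "wb");
    if (!f) return false;
    const uint64_t nbytes = t.data.size();
    bool ok = std::fwrite(t.ne, sizeof(t.ne), 1, f) == 1
           && std::fwrite(&t.type, sizeof(t.type), 1, f) == 1
           && std::fwrite(&nbytes, sizeof(nbytes), 1, f) == 1
           && (nbytes == 0 || std::fwrite(t.data.data(), 1, nbytes, f) == nbytes);
    ok = (std::fclose(f) == 0) && ok;
    return ok;
}

// Reads a file written by save_tensor. Returns false if the file is missing or truncated.
bool load_tensor(const std::string & path, TensorDump & t) {
    FILE * f = std::fopen(path.c_str(), "rb");
    if (!f) return false;
    uint64_t nbytes = 0;
    bool ok = std::fread(t.ne, sizeof(t.ne), 1, f) == 1
           && std::fread(&t.type, sizeof(t.type), 1, f) == 1
           && std::fread(&nbytes, sizeof(nbytes), 1, f) == 1;
    if (ok) {
        t.data.resize(nbytes);
        ok = nbytes == 0 || std::fread(t.data.data(), 1, nbytes, f) == nbytes;
    }
    std::fclose(f);
    return ok;
}
```

The idea would be to dump src0 and src1 right before the failing call and then replay the same multiplication in a small standalone test on both the CPU and CUDA backends. If ggml already has a supported way to do this, I would prefer to use that instead.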
Thanks,