The test TMVA-DNN-LSTM-BackpropagationCudnn crashes on ubuntu2404 cuda-12.6.1 with cudnn with the following stack trace:
0x00007fda7f0b5540 in <unknown> from /usr/lib64/libcuda.so.1
0x00007fda7ed1491e in <unknown> from /usr/lib64/libcuda.so.1
0x00007fda7f08f040 in <unknown> from /usr/lib64/libcuda.so.1
0x00007fda7ed0ef22 in <unknown> from /usr/lib64/libcuda.so.1
0x00007fda7eed2bae in <unknown> from /usr/lib64/libcuda.so.1
0x00007fdaaa248b01 in <unknown> from /usr/local/cuda-12.6/targets/x86_64-linux/lib/libcudart.so.12
0x00007fdaaa218baa in <unknown> from /usr/local/cuda-12.6/targets/x86_64-linux/lib/libcudart.so.12
0x00007fdaaa270721 in cudaMemcpy + 0x211 from /usr/local/cuda-12.6/targets/x86_64-linux/lib/libcudart.so.12
0x000055d25af29e37 in bool testLSTMBackpropagation<TMVA::DNN::TCudnn<double> >(unsigned long, unsigned long, unsigned long, unsigned long, TMVA::DNN::TCudnn<double>::Scalar_t, std::vector<bool, std::allocator<bool> >, bool) + 0x4d37 from /github/home/ROOT-CI/build/tmva/tmva/test/DNN/LSTM/testLSTMBackpropagationCudnn
Specifically, it's the assignment in this loop:
      for (size_t l = 0; l < (size_t) XArch[i].GetNrows(); ++l) {
         for (size_t m = 0; m < (size_t) XArch[i].GetNcols(); ++m) {
            mat(l, m) = gRandom->Uniform(-1,1);
            //XArch[i](0, 0) = 0.5;
            //XArch[i](1, 0) = 0.5;
         }
      }
   }
}
The assignment triggers a cudaMemcpy to the GPU, which is the last resolved frame in the stack trace. The crash then happens somewhere inside the CUDA driver library (libcuda.so.1). Other cuDNN tests pass, so the problem is not necessarily a broken installation.
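To illustrate how a plain-looking element assignment can end up in a memory copy, here is a minimal sketch. It assumes the element accessor returns a proxy object whose assignment copies a single value to the device; that is an assumption about the mechanism, not a description of TMVA's actual classes. `DeviceMatrix`, `ElementRef`, and `FillElementwise` are hypothetical names, and the device memory is simulated with a host vector (with `std::memcpy` standing in for `cudaMemcpy`) so the sketch runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a device-side matrix. "Device" memory is
// simulated with a host vector so the sketch runs without a GPU.
class DeviceMatrix {
public:
   // Proxy returned by operator(); assigning to it copies one element.
   class ElementRef {
   public:
      ElementRef(double *devicePtr, std::size_t *copyCount)
         : fDevicePtr(devicePtr), fCopyCount(copyCount) {}

      ElementRef &operator=(double value) {
         // A GPU backend would issue a host-to-device copy here, e.g.
         // cudaMemcpy(fDevicePtr, &value, sizeof(double), cudaMemcpyHostToDevice).
         std::memcpy(fDevicePtr, &value, sizeof(double));
         ++(*fCopyCount);
         return *this;
      }

      // Reading an element copies it back (device-to-host on a GPU).
      operator double() const {
         double value;
         std::memcpy(&value, fDevicePtr, sizeof(double));
         return value;
      }

   private:
      double *fDevicePtr;
      std::size_t *fCopyCount;
   };

   DeviceMatrix(std::size_t nRows, std::size_t nCols)
      : fNRows(nRows), fNCols(nCols), fData(nRows * nCols, 0.0) {}

   ElementRef operator()(std::size_t row, std::size_t col) {
      assert(row < fNRows && col < fNCols);
      return ElementRef(&fData[row * fNCols + col], &fCopyCount);
   }

   std::size_t GetNrows() const { return fNRows; }
   std::size_t GetNcols() const { return fNCols; }
   std::size_t CopyCount() const { return fCopyCount; }

private:
   std::size_t fNRows, fNCols;
   std::vector<double> fData;
   std::size_t fCopyCount = 0;
};

// Fill every element one at a time, mirroring the shape of the loop in the test.
// Returns the total number of single-element copies issued so far.
std::size_t FillElementwise(DeviceMatrix &mat, double value) {
   for (std::size_t l = 0; l < mat.GetNrows(); ++l)
      for (std::size_t m = 0; m < mat.GetNcols(); ++m)
         mat(l, m) = value;
   return mat.CopyCount();
}
```

Under this assumption, every pass through the inner loop issues its own copy, so the crashing frame could be reached on any iteration, not only the first one.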
Description
The loop with the crashing assignment is in:
root/tmva/tmva/test/DNN/LSTM/TestLSTMBackpropagation.h
Lines 149 to 159 in 9d876cd
Reproducer
ROOT version
Master
Installation method
Source
Operating system
Ubuntu 24.04 Docker container with CUDA 12.6.1
Additional context
No response