Dockerfile and CUDA and CUDNN issues with GPU detected #47

@samhodge-aiml

Trying to build in a CI/CD pipeline, I got the build error below. The full log is in the attached zip: builderror.txt.zip

Important parts:

-- Found CUDNN: /usr/lib/x86_64-linux-gnu/libcudnn.so  
-- Found cuDNN: v8.9.6  (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
CMake Warning at External/libtorch/share/cmake/Caffe2/public/cuda.cmake:214 (message):
  Failed to compute shorthash for libnvrtc.so
Call Stack (most recent call first):
  External/libtorch/share/cmake/Caffe2/Caffe2Config.cmake:92 (include)
  External/libtorch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:78 (find_package)


-- Automatic GPU detection failed. Building for common architectures.
-- Autodetected CUDA architecture(s): 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.0;8.6;8.6+PTX
-- Added CUDA NVCC flags for: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_86,code=compute_86
-- Found Torch: /app/External/libtorch/lib/libtorch.so  
-- Package torch                      Yes, at /app/External/libtorch/include;/app/External/libtorch/include/torch/csrc/api/include
CMAKE_EXE_LINKER_FLAGS before: -Wl,--no-as-needed
TORCH_LIBRARIES: torch;torch_library;/app/External/libtorch/lib/libc10.so;/app/External/libtorch/lib/libkineto.a;/usr/local/cuda/lib64/stubs/libcuda.so;/usr/local/cuda/lib64/libnvrtc.so;/usr/local/cuda/lib64/libnvToolsExt.so;/usr/local/cuda/lib64/libcudart.so;/app/External/libtorch/lib/libc10_cuda.so
TORCH_CXX_FLAGS: -D_GLIBCXX_USE_CXX11_ABI=1
CMAKE_EXE_LINKER_FLAGS after: -Wl,--as-needed
CUDNN_LIBRARY_PATH: /usr/lib/x86_64-linux-gnu/libcudnn.so; CUDNN_INCLUDE_PATH: /usr/include
-- Obtained CUDA architectures automatically from installed GPUs
-- Automatic GPU detection failed. Building for Turing and Ampere as a best guess.
-- Targeting CUDA architectures: 75;86
-- SAIGA_CUDA_VERSION 
-- Found CUDAToolkit: /usr/local/cuda/targets/x86_64-linux/include (found suitable version "11.8.89", minimum required is "10.2") 
-- Enabled CUDA. Version: 11.8.89
-- Package CUDA::cudart               Yes, at /usr/local/cuda/targets/x86_64-linux/include
-- Package CUDA::nppif                Yes, at /usr/local/cuda/targets/x86_64-linux/include
-- Package CUDA::nppig                Yes, at /usr/local/cuda/targets/x86_64-linux/include
-- SAIGA_CUDA_FLAGS: -Xcompiler=-fopenmp;-Xcompiler=-march=native;-use_fast_math;--expt-relaxed-constexpr;-Xcudafe=--diag_suppress=esa_on_defaulted_function_ignored;-Xcudafe=--diag_suppress=field_without_dll_interface;-Xcudafe=--diag_suppress=base_class_has_different_dll_interface;-Xcudafe=--diag_suppress=dll_interface_conflict_none_assumed;-Xcudafe=--diag_suppress=dll_interface_conflict_dllexport_assumed
-- Using automatic CUDA Arch detection...
-- Automatic GPU detection failed. Building for common architectures.
-- Autodetected CUDA architecture(s): 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.0;8.6;8.6+PTX
-- SAIGA_CUDA_ARCH: 
-- SAIGA_CUDA_ARCH_FLAGS: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_86,code=compute_86
-- 
Compiler Flags:
-- SAIGA_CXX_FLAGS: -Wall;-Werror=return-type;-Wno-strict-aliasing;-Wno-sign-compare;-march=native;-fopenmp
-- SAIGA_PRIVATE_CXX_FLAGS: -fvisibility=hidden
-- SAIGA_LD_FLAGS: -fopenmp
-- CMAKE_CXX_FLAGS: 
-- CMAKE_CXX_FLAGS_DEBUG: -g
-- CMAKE_CXX_FLAGS_RELWITHDEBINFO: -O2 -g -DNDEBUG
-- CMAKE_CXX_FLAGS_RELEASE: -O3 -DNDEBUG
-- 
CUDA Compiler Flags:
-- CMAKE_CUDA_FLAGS: 
-- CMAKE_CUDA_FLAGS_DEBUG: -g
-- CMAKE_CUDA_FLAGS_RELWITHDEBINFO: -O2 -g -DNDEBUG
-- CMAKE_CUDA_FLAGS_RELEASE: -O3 -DNDEBUG
[ 17%] Built target signalhandler_unittest
[ 17%] Building CUDA object External/tiny-cuda-nn/CMakeFiles/tiny-cuda-nn.dir/src/common.cu.o
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc fatal   : A single input file is required for a non-link phase when an outputfile is specified
make[2]: *** [External/tiny-cuda-nn/CMakeFiles/tiny-cuda-nn.dir/build.make:77: External/tiny-cuda-nn/CMakeFiles/tiny-cuda-nn.dir/src/common.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:2579: External/tiny-cuda-nn/CMakeFiles/tiny-cuda-nn.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
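
For anyone reading the log: all three architecture-detection passes fall back to a default list, and that list includes 3.5, which is what triggers the nvcc deprecation warning. The sketch below shows how such an arch list appears to map onto the `-gencode` flags printed in the log. It is a rough Python re-implementation for illustration only, not the CMake code that libtorch or Saiga actually run; the function name `nvcc_arch_flags` is hypothetical, and the rule assumed here is that a `+PTX` suffix adds a `code=compute_XY` entry on top of the `code=sm_XY` one.

```python
def nvcc_arch_flags(arch_list):
    """Translate a semicolon-separated arch list (e.g. "7.5;8.6+PTX")
    into the list of nvcc -gencode flags it corresponds to."""
    sm_archs = []   # real (SASS) targets, e.g. "86" -> code=sm_86
    ptx_archs = []  # virtual (PTX) targets, e.g. "86" -> code=compute_86

    for entry in arch_list.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty items such as a trailing ';'
        wants_ptx = entry.endswith("+PTX")
        version = entry[:-len("+PTX")] if wants_ptx else entry
        major, _, minor = version.partition(".")
        if not (major.isdigit() and minor.isdigit()):
            raise ValueError(f"unrecognised architecture entry: {entry!r}")
        code = major + minor
        # Assumption: duplicates collapse, so "8.6;8.6+PTX" yields one sm_86.
        if code not in sm_archs:
            sm_archs.append(code)
        if wants_ptx and code not in ptx_archs:
            ptx_archs.append(code)

    flags = []
    for code in sm_archs:
        flags += ["-gencode", f"arch=compute_{code},code=sm_{code}"]
    for code in ptx_archs:
        flags += ["-gencode", f"arch=compute_{code},code=compute_{code}"]
    return flags


# The fallback list from the log above.
logged = "3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.0;8.6;8.6+PTX"
print(";".join(nvcc_arch_flags(logged)))

# A narrower list (Turing and Ampere only) avoids the deprecated sm_35 target.
print(";".join(nvcc_arch_flags("7.5;8.6")))
```

The first call reproduces the flag string from the "Added CUDA NVCC flags for:" line. This only accounts for the deprecation warning; it does not by itself explain the `nvcc fatal` error, which would likely need the full nvcc command line (for example from `make VERBOSE=1`) to diagnose.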
