Closed
Description
Hi,
I apologize if this is a duplicate issue. I just installed DeepSpeed via pip alongside PyTorch 1.11, but I am still having issues building cpu_adam:
python -c "import deepspeed; deepspeed.ops.op_builder.CPUAdamBuilder().load() "
Installed CUDA version 10.0 does not match the version torch was compiled with 10.2 but since the APIs are compatible, accepting this combination
Using /home/ubuntu/.cache/torch_extensions/py38_cu102 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/ubuntu/.cache/torch_extensions/py38_cu102/cpu_adam/build.ninja...
Building extension module cpu_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF cpu_adam.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/TH -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -g -Wno-reorder -L/usr/local/cuda/lib64 -lcudart -lcublas -g -march=native -fopenmp -D__AVX256__ -c /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp -o cpu_adam.o
FAILED: cpu_adam.o
c++ -MMD -MF cpu_adam.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/TH -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -g -Wno-reorder -L/usr/local/cuda/lib64 -lcudart -lcublas -g -march=native -fopenmp -D__AVX256__ -c /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp -o cpu_adam.o
In file included from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/Device.h:3:0,
from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/extension.h:6,
from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp:5:
/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/python_headers.h:10:10: fatal error: Python.h: No such file or directory
#include <Python.h>
^~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1740, in _run_ninja_build
subprocess.run(
File "/usr/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 470, in load
return self.jit_load(verbose)
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 512, in jit_load
op_module = load(
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1144, in load
return _jit_compile(
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1357, in _jit_compile
_write_ninja_file_and_build_library(
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1469, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1756, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'cpu_adam'
What is the easiest way to solve it?
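The fatal error in the log above is `Python.h: No such file or directory`, which usually means the CPython development headers are missing for the interpreter that runs the build. A minimal sketch for checking this is shown here; the helper names `python_header_path` and `has_python_headers` are hypothetical and only for illustration, and the check assumes the headers live in the directory reported by `sysconfig`.

```python
import os
import sysconfig


def python_header_path():
    """Return the path where this interpreter expects Python.h to live."""
    # sysconfig reports the C header directory for the running interpreter,
    # e.g. /usr/include/python3.8 for a system Python 3.8 on Linux.
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def has_python_headers():
    """Return True if Python.h exists in the interpreter's include directory."""
    return os.path.isfile(python_header_path())


if __name__ == "__main__":
    path = python_header_path()
    if has_python_headers():
        print(f"Found {path}")
    else:
        print(f"Missing {path}; the Python development headers are probably not installed")
```

If the file is reported missing, installing the development headers for that Python version should let the extension compile. Assuming this is an Ubuntu machine with the system Python 3.8 (the paths in the log suggest so, but that is a guess), the package would typically be `python3.8-dev`, installed with `sudo apt-get install python3.8-dev`.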