
Error building cpu_adam #2268

Closed

Description

Hi,

I apologize if this is a duplicate issue. I just installed DeepSpeed via pip with PyTorch 1.11, but I am still having issues building cpu_adam.

python -c "import deepspeed; deepspeed.ops.op_builder.CPUAdamBuilder().load() "
Installed CUDA version 10.0 does not match the version torch was compiled with 10.2 but since the APIs are compatible, accepting this combination
Using /home/ubuntu/.cache/torch_extensions/py38_cu102 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/ubuntu/.cache/torch_extensions/py38_cu102/cpu_adam/build.ninja...
Building extension module cpu_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF cpu_adam.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/TH -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -g -Wno-reorder -L/usr/local/cuda/lib64 -lcudart -lcublas -g -march=native -fopenmp -D__AVX256__ -c /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp -o cpu_adam.o 
FAILED: cpu_adam.o 
c++ -MMD -MF cpu_adam.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/TH -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -g -Wno-reorder -L/usr/local/cuda/lib64 -lcudart -lcublas -g -march=native -fopenmp -D__AVX256__ -c /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp -o cpu_adam.o 
In file included from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/Device.h:3:0,
                 from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
                 from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/extension.h:6,
                 from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp:5:
/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/python_headers.h:10:10: fatal error: Python.h: No such file or directory
 #include <Python.h>
          ^~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1740, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 470, in load
    return self.jit_load(verbose)
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 512, in jit_load
    op_module = load(
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1144, in load
    return _jit_compile(
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1357, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1469, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1756, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'cpu_adam'

What is the easiest way to solve it?
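
The compiler output above ends in `fatal error: Python.h: No such file or directory`, which usually means the CPython development headers are not installed for the interpreter in use. Below is a minimal sketch for checking this, assuming the standard-library `sysconfig` include path is the same directory the build passes as `-isystem /usr/include/python3.8`. The helper name `find_python_header` is hypothetical and is not part of DeepSpeed or PyTorch.

```python
import os
import sysconfig


def find_python_header():
    """Return (include_dir, header_path, exists) for the running interpreter.

    Assumption: the directory reported by sysconfig is the one the extension
    build searches for Python.h. A customized build setup may differ.
    """
    include_dir = sysconfig.get_paths()["include"]
    header_path = os.path.join(include_dir, "Python.h")
    return include_dir, header_path, os.path.isfile(header_path)


if __name__ == "__main__":
    include_dir, header_path, exists = find_python_header()
    print("include dir:", include_dir)
    print("Python.h present:", exists)
```

If this reports that the header is missing, the usual remedy is to install the distribution's Python development package (on Ubuntu, typically `python3.8-dev` for this interpreter) and then rerun the `CPUAdamBuilder().load()` command. That package name is an assumption about this machine, not something the log confirms.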
