[Misc] [ROCm]: Build from source failure with Arch/gcc14 with ROCm 6.3 #13777

Open
@arjunkathuria

Anything you want to discuss about vllm.

Hi team!

I've been trying to build vLLM from source for ROCm 6.3, targeting gfx1100 on Arch Linux with gcc 14, following the instructions in the official documentation. The build kept failing with a compile error in the hipified sources.

Excerpt from the error:

...

In file included from <built-in>:1:
In file included from /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_runtime_wrapper.h:145:
In file included from /opt/rocm/lib/llvm/lib/clang/18/include/cuda_wrappers/algorithm:55:
In file included from /usr/lib64/gcc/x86_64-pc-linux-gnu/14.2.1/../../../../include/c++/14.2.1/algorithm:61:
/usr/lib64/gcc/x86_64-pc-linux-gnu/14.2.1/../../../../include/c++/14.2.1/bits/stl_algo.h:3626:7: error: reference to __host__ function '__glibcxx_assert_fail' in __host__ __device__ function
 3626 |       __glibcxx_assert(!(__hi < __lo));
      |       ^
/usr/lib64/gcc/x86_64-pc-linux-gnu/14.2.1/../../../../include/c++/14.2.1/x86_64-pc-linux-gnu/bits/c++config.h:614:12: note: expanded from macro '__glibcxx_assert'
  614 |       std::__glibcxx_assert_fail();                                     \
      |            ^
/home/<username>/Documents/sources/vllm/build/temp.linux-x86_64-cpython-312/csrc/quantization/compressed_tensors/int8_quant_kernels.hip:35:14: note: called by 'float_to_int8_rn'
   35 |   dst = std::clamp(dst, i8_min, i8_max);
      |              ^
/home/<username>/Documents/sources/vllm/build/temp.linux-x86_64-cpython-312/csrc/quantization/compressed_tensors/int8_quant_kernels.hip:119:14: note: called by 'static_scaled_int8_quant_kernel<float, float>'
  119 |     out[i] = float_to_int8_rn(static_cast<float>(input[i]) / scale);
      |              ^
/usr/lib64/gcc/x86_64-pc-linux-gnu/14.2.1/../../../../include/c++/14.2.1/x86_64-pc-linux-gnu/bits/c++config.h:608:3: note: '__glibcxx_assert_fail' declared here
  608 |   __glibcxx_assert_fail()
      |   ^
1 error generated when compiling for gfx1100.

...

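Looking into the cause, the std::clamp call turned out to be the problem. As the log shows, the libstdc++ 14.2.1 implementation of std::clamp contains a __glibcxx_assert, which expands to a call to the host-only std::__glibcxx_assert_fail(). hip-clang rejects that reference when std::clamp is compiled for the device (here via float_to_int8_rn), so the call does not compile with gcc14/hip-clang.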

After a bit more digging, I found that this is a known issue in the PyTorch and LLVM projects, see:

The workaround PyTorch went with was to replace the std::clamp usage with equivalent hand-written logic (see commit).

I applied the same change here, and once all the offending call sites were fixed, the build completed successfully!
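
For illustration, here is a minimal sketch of that kind of replacement. It is not the actual PyTorch commit or the vLLM patch: `clamp_no_assert`, `float_to_int8_rn_sketch` and the `SKETCH_HOST_DEVICE` macro are hypothetical names, and the conversion function is only a simplified stand-in for the `float_to_int8_rn` seen in the error log. The idea is to clamp with plain comparisons so that no libstdc++ assertion ends up in device code.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Expands to the HIP/CUDA qualifiers under hipcc/nvcc and to nothing
// otherwise, so this sketch also builds with a plain host compiler.
#if defined(__HIPCC__) || defined(__CUDACC__)
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

// Hypothetical helper (not a vLLM or PyTorch name): clamps v to [lo, hi]
// using plain comparisons only, so no __glibcxx_assert is pulled in.
// The caller is responsible for passing lo <= hi.
template <typename T>
SKETCH_HOST_DEVICE constexpr T clamp_no_assert(T v, T lo, T hi) {
  return v < lo ? lo : (hi < v ? hi : v);
}

// Simplified stand-in for the float_to_int8_rn function from the log:
// round to the nearest integer, saturate to the int8 range, then convert.
SKETCH_HOST_DEVICE inline std::int8_t float_to_int8_rn_sketch(float x) {
  constexpr float i8_min =
      static_cast<float>(std::numeric_limits<std::int8_t>::min());
  constexpr float i8_max =
      static_cast<float>(std::numeric_limits<std::int8_t>::max());

  float dst = std::nearbyint(x);
  // Previously: dst = std::clamp(dst, i8_min, i8_max);
  dst = clamp_no_assert(dst, i8_min, i8_max);
  return static_cast<std::int8_t>(dst);
}
```

With this sketch, `float_to_int8_rn_sketch(300.0f)` saturates to 127 and `float_to_int8_rn_sketch(-300.0f)` to -128, matching what std::clamp would return for a valid range. The only behavioural difference is that the `lo <= hi` precondition is no longer asserted.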

Will submit a PR with the changes soon ✌🏻
