vLLM Release 0.3.0 fails to install on AMD Instinct MI300X with ROCm 6.0.2 #2865

@kannan-scalers-ai

Description

Steps to reproduce:

  1. Clone the vllm repo and switch to tag v0.3.0
  2. Build Dockerfile.rocm following the instructions in "Option 3: Build from source with docker" under Installation with ROCm
    • Build arguments were left at their defaults for the first test.

The build fails while installing vllm.

Build command:

docker build -f Dockerfile.rocm -t vllm-rocm .

The error output:

...
 conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
18.57 WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
18.57     PyTorch 2.1.1+cu121 with CUDA 1201 (you have 2.1.1+git011de5c)
18.57     Python  3.9.18 (you have 3.9.18)
18.57   Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
18.57   Memory-efficient attention, SwiGLU, sparse and more won't be available.
18.57   Set XFORMERS_MORE_DETAILS=1 for more details
20.52 WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
20.52     PyTorch 2.1.1+cu121 with CUDA 1201 (you have 2.1.1+git011de5c)
20.52     Python  3.9.18 (you have 3.9.18)
20.52   Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
20.52   Memory-efficient attention, SwiGLU, sparse and more won't be available.
20.52   Set XFORMERS_MORE_DETAILS=1 for more details
22.58 WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
22.58     PyTorch 2.1.1+cu121 with CUDA 1201 (you have 2.1.1+git011de5c)
22.58     Python  3.9.18 (you have 3.9.18)
22.58   Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
22.58   Memory-efficient attention, SwiGLU, sparse and more won't be available.
22.58   Set XFORMERS_MORE_DETAILS=1 for more details
23.37 XFORMERS_FMHA_FLASH_PATH = /opt/conda/envs/py_3.9/lib/python3.9/site-packages/xformers/ops/fmha/flash.py
23.37 XFORMERS_FMHA_COMMON_PATH = /opt/conda/envs/py_3.9/lib/python3.9/site-packages/xformers/ops/fmha/common.py
23.37 6 out of 6 hunks FAILED
23.37 Applying patch to /opt/conda/envs/py_3.9/lib/python3.9/site-packages/xformers/ops/fmha/flash.py
23.37 patching file /opt/conda/envs/py_3.9/lib/python3.9/site-packages/xformers/ops/fmha/flash.py
23.37 Successfully patch /opt/conda/envs/py_3.9/lib/python3.9/site-packages/xformers/ops/fmha/flash.py
23.37 1 out of 1 hunk FAILED
23.37 Applying patch to /opt/conda/envs/py_3.9/lib/python3.9/site-packages/xformers/ops/fmha/common.py
23.37 patching file /opt/conda/envs/py_3.9/lib/python3.9/site-packages/xformers/ops/fmha/common.py
23.37 Successfully patch /opt/conda/envs/py_3.9/lib/python3.9/site-packages/xformers/ops/fmha/common.py
25.15 No CUDA runtime is found, using CUDA_HOME='/usr'
25.17 Traceback (most recent call last):
25.17   File "/app/vllm/setup.py", line 295, in <module>
25.17     raise RuntimeError(
25.17 RuntimeError: Only the following arch is supported: {'gfx908', 'gfx90a', 'gfx1100', 'gfx906', 'gfx1030'}amdgpu_arch_found: gfx941
------
Dockerfile.rocm:78
--------------------
  77 |
  78 | >>> RUN cd /app \
  79 | >>>     && cd vllm \
  80 | >>>     && pip install -U -r requirements-rocm.txt \
  81 | >>>     && bash patch_xformers.rocm.sh \
  82 | >>>     && python3 setup.py install \
  83 | >>>     && cd ..
  84 |
--------------------
ERROR: failed to solve: process "/bin/sh -c cd /app     && cd vllm     && pip install -U -r requirements-rocm.txt     && bash patch_xformers.rocm.sh     && python3 setup.py install     && cd .." did not complete successfully: exit code: 1

Issues:

  1. setup.py at tag v0.3.0 does not appear to support gfx941 or gfx942 (MI300 series)
  2. The build argument FX_GFX_ARCHS documented in "Option 3: Build from source with docker" (Installation with ROCm) is ignored by Dockerfile.rocm; the Dockerfile instead expects FA_GFX_ARCHS
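For context on issue 1, the traceback shows the failure comes from a hard-coded allow-list check in setup.py (line 295). A minimal sketch of that gate, assuming the set printed in the error message is the full allow-list; the function name `validate_rocm_arch` and the `MI300_ARCHS` addition are illustrative, not vLLM's actual code:

```python
# The arch set printed in the RuntimeError above, as of tag v0.3.0.
ROCM_SUPPORTED_ARCHS = {"gfx906", "gfx908", "gfx90a", "gfx1030", "gfx1100"}
# Hypothetical addition covering the MI300 series (gfx941 is what the
# build detected on MI300X; gfx942 per issue 1 above).
MI300_ARCHS = {"gfx941", "gfx942"}

def validate_rocm_arch(amdgpu_arch_found, supported=ROCM_SUPPORTED_ARCHS):
    """Raise, as setup.py does, when the detected arch is not allow-listed."""
    if amdgpu_arch_found not in supported:
        raise RuntimeError(
            f"Only the following arch is supported: {supported}"
            f"amdgpu_arch_found: {amdgpu_arch_found}"
        )

validate_rocm_arch("gfx90a")  # passes on v0.3.0
validate_rocm_arch("gfx941", ROCM_SUPPORTED_ARCHS | MI300_ARCHS)  # passes only with the addition
```

For issue 2, passing the arch list as `--build-arg FA_GFX_ARCHS=...` (the name Dockerfile.rocm actually declares) instead of FX_GFX_ARCHS should let the build argument take effect, though that alone will not get past the setup.py check above.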
