Skip to content

Conversation

@cyb70289
Copy link
Contributor

@cyb70289 cyb70289 commented Oct 13, 2025

Purpose

Support RTX-40xx card (arch=8.9) for Arm64 release.

Test Result

Verified manually on Arm Neoverse-N2 server with RTX-4090 card.

Verified okay on Neoverse-N2 server with RTX-4090.
@mergify mergify bot added the ci/build label Oct 13, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for CUDA architecture 8.9 to the Arm64 release builds, which is necessary for RTX 40-series GPUs. The changes correctly update the torch_cuda_arch_list in the Buildkite pipeline for both wheel building and release image creation. My review feedback focuses on improving the maintainability of this CI configuration by addressing the duplication of the architecture list. I've suggested using a pipeline-level environment variable to define this list once, which will prevent potential inconsistencies and make future updates simpler and less error-prone.

# #NOTE: torch_cuda_arch_list is derived from upstream PyTorch build files here:
# https://github.com/pytorch/pytorch/blob/main/.ci/aarch64_linux/aarch64_ci_build.sh#L7
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=12.9.1 --build-arg VLLM_MAIN_CUDA_VERSION=12.9 --build-arg torch_cuda_arch_list='8.7 9.0 10.0+PTX 12.0' --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=12.9.1 --build-arg VLLM_MAIN_CUDA_VERSION=12.9 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

To avoid duplicating the torch_cuda_arch_list, you can define it once as a pipeline-level environment variable. This makes future updates easier and less error-prone.

You can add this at the top of your .buildkite/release-pipeline.yaml file (before the steps: block):

env:
  TORCH_CUDA_ARCH_LIST_ARM64: '8.7 8.9 9.0 10.0+PTX 12.0'

Then, you can use this environment variable in this command. The same change should be applied to the docker build command on line 79.

      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=12.9.1 --build-arg VLLM_MAIN_CUDA_VERSION=12.9 --build-arg torch_cuda_arch_list=\"$TORCH_CUDA_ARCH_LIST_ARM64\" --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."

commands:
- "aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7"
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=12.9.1 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg torch_cuda_arch_list='8.7 9.0 10.0+PTX 12.0' --build-arg INSTALL_KV_CONNECTORS=true --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m) --target vllm-openai --progress plain -f docker/Dockerfile ."
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=12.9.1 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg INSTALL_KV_CONNECTORS=true --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m) --target vllm-openai --progress plain -f docker/Dockerfile ."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

As mentioned in the comment for line 11, this hardcoded torch_cuda_arch_list should be replaced with the proposed environment variable $TORCH_CUDA_ARCH_LIST_ARM64 to avoid duplication and improve maintainability.

      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=12.9.1 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg torch_cuda_arch_list=\"$TORCH_CUDA_ARCH_LIST_ARM64\" --build-arg INSTALL_KV_CONNECTORS=true --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m) --target vllm-openai --progress plain -f docker/Dockerfile ."

Copy link
Member

@mgoin mgoin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this makes sense as a PCIe card

@mgoin mgoin enabled auto-merge (squash) October 13, 2025 17:12
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 13, 2025
@mgoin mgoin merged commit f89f599 into vllm-project:main Oct 13, 2025
21 checks passed
@cyb70289 cyb70289 deleted the arm64-release branch October 14, 2025 01:14
1994 pushed a commit to 1994/vllm that referenced this pull request Oct 14, 2025
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…ect#26698)

Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…ect#26698)

Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci/build ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants