[amdgpu] LLVM 20 updates for AMD MI3xx GPUs by tmm77 · Pull Request #8793 · taichi-dev/taichi

tmm77 · 2026-04-15T18:09:10Z

Issue: #

Brief Summary

These code changes update LLVM to version 20 for AMD GPU code generation to enable Taichi on MI300X, MI325X, and MI355X.

Note

High Risk
Touches core JIT/codegen for CPU, CUDA, AMDGPU, and DX12 with a major LLVM API migration; incorrect AMDGPU or pass-manager behavior would break kernel compilation and GPU execution.

Overview
This PR modernizes the Taichi/ROCm stack around LLVM 20 so AMDGPU kernels can be built and JIT-compiled for Instinct MI3xx targets, with supporting packaging and docs.

Compiler & runtime: Clang discovery and tested ceiling move to LLVM/Clang 20 (CMakeLists.txt, CI compiler.py). LLVM setup for AMDGPU builds prefers system/ROCm toolchains (LLVM_DIR, ROCM_PATH, /usr/lib/llvm-20) instead of only downloading prebuilt LLVM 15 zips. Across CPU, CUDA, AMDGPU, and DX12 paths, optimization moves from the legacy pass manager to LLVM’s New Pass Manager (PassBuilder), with version-guarded APIs for codegen opt levels and assembly emission. Opaque pointers and related IR changes touch shared LLVM codegen, struct layout, AMDGPU basic-block insertion, CUDA global loads (replacing removed nvvm.ldg intrinsics with invariant loads), and AMDGPU kernel pointer/addrspace handling. AMDGPU JIT (jit_amdgpu.cpp) is updated for the new pass pipeline and object/HSACO emission.

Language & build: erf / erfc are wired through IR, Python ops, and LLVM codegen (including CUDA). Dockerfile.rocm adds a multi-stage image that installs LLVM 20, applies spdlog_fmt.patch, and builds/installs wheels. setup.py strips non-numeric suffixes from patch versions for AMD packaging.

Tooling & docs: Root README.md is replaced with a deprecation notice (legacy content moved to README-deprecated.md). New ROCm Sphinx docs and readthedocs.yaml describe install/examples. Microbenchmarks default to amdgpu with --arch / --benchmark_plan CLI; Vulkan setup in CI entry is limited to Linux. The PR title presubmit workflow and ci/assets/mitm-ca.crt are removed.

Misc: Debug symbols for AMDGPU (-g in TaichiCore.cmake), ImGui Vulkan init API updates, and a large .wordlist.txt for documentation spelling.

Reviewed by Cursor Bugbot for commit 79f75c6. Bugbot is set up for automated code reviews on this repo.
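The overview mentions that setup.py strips non-numeric suffixes from patch versions for AMD packaging. The setup.py change itself is not shown on this page, so the following is only a minimal sketch of what such normalization could look like. The helper name `strip_patch_suffix` is hypothetical; the sample input is modeled on the `release/1.8.0b2` branch name that appears in the commit list.

```python
import re


def strip_patch_suffix(version: str) -> str:
    """Keep only the leading digits of the patch field (hypothetical helper).

    Anything after the numeric patch component, such as a beta tag, is dropped.
    Versions with fewer than three dot-separated fields are returned unchanged.
    """
    parts = version.split(".")
    if len(parts) < 3:
        return version  # no patch field to normalize
    match = re.match(r"\d+", parts[2])
    patch = match.group(0) if match else "0"  # assumption: default to 0
    return ".".join(parts[:2] + [patch])


print(strip_patch_suffix("1.8.0b2"))
```

Falling back to `0` when the patch field has no leading digits is a choice made for this sketch, not something the PR states.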


cursor · 2026-04-15T18:13:27Z

+    if ((u.system, u.machine) not in (("Linux", "arm64"), ("Linux", "aarch64"))) and not (cmake_args.get_effective("TI_WITH_AMDGPU")):
+        os.environ["LLVM_DIR"] = "/usr/lib/llvm-20/cmake"
+        os.environ["CUDA_HOME"] = "/usr/local/cuda"
+        os.environ["CPATH"] = "/usr/local/cuda/include"


LLVM_DIR hardcoded to Linux path for all platforms

Medium Severity

The final LLVM_DIR assignment unconditionally sets it to /usr/lib/llvm-20/cmake for all non-ARM-Linux, non-AMDGPU platforms, including macOS and Windows. The original code used str(out) which pointed to the platform-specific downloaded LLVM path. This overwrites the correct out-based paths for Darwin and Windows, breaking LLVM discovery on those platforms. Similarly, CUDA_HOME and CPATH are set to Linux-specific paths.

Reviewed by Cursor Bugbot for commit f47d1b8.
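One way to address this finding would be to apply the Linux paths only when the host is actually Linux. The sketch below is not the PR's code: the function name `apply_linux_defaults` and its parameters are hypothetical, and a plain boolean stands in for `cmake_args.get_effective("TI_WITH_AMDGPU")` from the diff. It takes the environment as a mapping so it can be exercised without touching `os.environ`.

```python
# Paths copied from the diff hunk above; they only make sense on Linux.
LINUX_DEFAULTS = {
    "LLVM_DIR": "/usr/lib/llvm-20/cmake",
    "CUDA_HOME": "/usr/local/cuda",
    "CPATH": "/usr/local/cuda/include",
}


def apply_linux_defaults(env, system, machine, with_amdgpu):
    """Overwrite toolchain paths on x86 Linux only (hypothetical helper).

    On macOS, Windows, ARM Linux, or AMDGPU builds the mapping is returned
    untouched, so any platform-specific LLVM_DIR set earlier survives.
    """
    if system != "Linux" or machine in ("arm64", "aarch64") or with_amdgpu:
        return env
    for key, value in LINUX_DEFAULTS.items():
        env[key] = value
    return env


# A macOS host keeps the path chosen by the earlier download step.
darwin_env = apply_linux_defaults({"LLVM_DIR": "/tmp/llvm-out"}, "Darwin", "arm64", False)
print(darwin_env["LLVM_DIR"])
```

In a real build script the call would presumably pass `os.environ`, `platform.system()`, and `platform.machine()`.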

cursor · 2026-04-15T18:13:27Z

+                      f.read())
+    if not match:
+        raise ValueError("VERSION not found!")
+    version_number = match[1]


Docs conf.py searches for nonexistent CMake function

Medium Severity

The docs/conf.py searches for rocm_setup_version(VERSION ...) in CMakeLists.txt, but the project's CMakeLists.txt does not contain this function call. This causes a ValueError("VERSION not found!") to be raised every time the documentation is built, completely breaking the docs build pipeline.

Reviewed by Cursor Bugbot for commit f47d1b8.
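The failure mode described here can be illustrated with a small version lookup that accepts an optional fallback instead of always raising. This is a sketch under assumptions: the regular expression is a guess at what docs/conf.py searches for (only the tail of that call is visible in the hunk), and `find_version` and its `fallback` parameter are hypothetical names.

```python
import re


def find_version(cmake_text, fallback=None):
    """Extract the VERSION argument of rocm_setup_version(...) from CMake text.

    Returns `fallback` when the call is absent and a fallback was given;
    otherwise raises the same ValueError shown in the diff hunk.
    """
    # Assumed pattern; the real expression in docs/conf.py is not shown.
    match = re.search(r"rocm_setup_version\(\s*VERSION\s+([0-9.]+)", cmake_text)
    if match:
        return match[1]
    if fallback is not None:
        return fallback
    raise ValueError("VERSION not found!")


print(find_version("rocm_setup_version(VERSION 1.8.0)"))
print(find_version("project(taichi)", fallback="0.0.0"))
```

Whether a silent fallback is preferable to a hard failure in the docs build is a policy decision for the maintainers.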

cursor

Cursor Bugbot has reviewed your changes and found 3 potential issues.

There are 5 total unresolved issues (including 2 from previous reviews).

Reviewed by Cursor Bugbot for commit b9c05cd.

cursor · 2026-06-03T16:49:38Z


 if (TI_WITH_AMDGPU)
-  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DTI_WITH_AMDGPU")
+  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -DTI_WITH_AMDGPU")


Typo drops AMDGPU runtime sources

High Severity

With TI_WITH_AMDGPU enabled, taichi/runtime/amdgpu/runtime.cpp is appended to TAIHI_CORE_SOURCE instead of TAICHI_CORE_SOURCE. The core object library is built only from TAICHI_CORE_SOURCE, so that runtime translation unit is never linked into taichi_core.

Reviewed by Cursor Bugbot for commit b9c05cd.

cursor · 2026-06-03T16:49:38Z

-        parent_ty = ptr_ty->getPointerElementType();
+      if (auto ptr_ty = llvm::dyn_cast<llvm::PointerType>(parent_ty)) {
+        TI_NOT_IMPLEMENTED;
+      }


Root SNode lookup aborts

High Severity

For root SNodeLookupStmt, when the parent LLVM value comes from a BitCastInst to a pointer type, codegen hits TI_NOT_IMPLEMENTED instead of emitting a GEP. With opaque pointers in LLVM 20, that path is common and kernel compilation fails.

Reviewed by Cursor Bugbot for commit b9c05cd.

cursor · 2026-06-03T16:49:38Z

-            url = "https://github.com/GaleSeLee/assets/releases/download/v0.0.5/taichi-llvm-15.0.0-linux.zip"
+            # We should use LLVM toolchains shipped with OS.
+            os.environ["LLVM_DIR"] = os.environ["LLVM_PATH"]+"/lib/cmake"
+            os.environ["CPATH"] = os.environ["ROCM_PATH"]+"/include"


AMDGPU LLVM setup needs env

Medium Severity

On Linux x86_64 with TI_WITH_AMDGPU, setup_llvm sets LLVM_DIR and CPATH from LLVM_PATH and ROCM_PATH without checking they exist, so a missing variable raises KeyError and aborts the build.

Reviewed by Cursor Bugbot for commit b9c05cd.
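A guarded lookup would turn the bare KeyError into an actionable message. The following is a minimal sketch, not the PR's implementation: `require_env` and `amdgpu_toolchain_paths` are hypothetical helpers, and the derived paths simply mirror the string concatenation in the diff hunk above.

```python
import os


def require_env(name, env=None):
    """Return a non-empty environment value or fail with a clear message."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(
            f"{name} must be set when building with TI_WITH_AMDGPU")
    return value


def amdgpu_toolchain_paths(env=None):
    """Derive LLVM_DIR and CPATH from LLVM_PATH and ROCM_PATH."""
    llvm_path = require_env("LLVM_PATH", env)
    rocm_path = require_env("ROCM_PATH", env)
    return {
        "LLVM_DIR": f"{llvm_path}/lib/cmake",
        "CPATH": f"{rocm_path}/include",
    }


# Example values only; real installs may live elsewhere.
paths = amdgpu_toolchain_paths({"LLVM_PATH": "/opt/llvm", "ROCM_PATH": "/opt/rocm"})
print(paths["LLVM_DIR"])
```

Passing the environment as an argument keeps the helper testable; the build script could call it with no argument to read `os.environ`.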

tmm77 and others added 30 commits April 29, 2025 17:14

modifications to microbenchmark suite to run on AMD GPUs

e4a5be3

adding arguments for selecting a list of architectures and benchmark …

e932646

…plans to run

additional modifications for single arch and benchmark plan runs

8a9ca3b

temporarily setting atomic ops repeat to 1

6e4fb08

updating vulkan sdk downlaod url

efe237e

removing comments for saving json files

c3d7b84

Merge pull request #1 from AMD-AI/mintz/parameterize_microbenchmark

eeb3354

Parameterize microbenchmarks and vulkan sdk update

Patch to avoid the need to fetch to build Taichi wheel

bb8a9b3

fix: Patch to avoid the need to fetch source to build Taichi wheel

c137b06

fix: Patch to avoid the need to fetch source to build Taichi wheel

Taicho Multistage Dockerfile

b74c00c

Taichi Multistage Dockerfile

6b0f58b

Taichi Dockerfile

setting architecture to gpu

f791165

ROCm port of taichi

1a6520a

Co-authored-by: Bhavesh Lad <Bhavesh.Lad@amd.com> Co-authored-by: Tiffany Mintz <tiffany.mintz@amd.com>

Merge pull request #3 from taichi-dev/master

86b6184

Merge latest upstream

Merge branch 'amd-develop' into master

9260d4e

Merge pull request #4 from ROCm/master

712d405

Merge master updates

Merge branch 'amd-integration' into amd-develop

46444ee

Merge pull request #5 from ROCm/amd-develop

5eed1b4

Merge latest Updates

LLVM-20

0f2615c

Update LLVM API calls in codegen_cuda.cpp for compatibility with rece…

c189397

…nt changes

Add CHANGELOG.md to document recent updates and improvements

23478fd

Fix include directive for IR analysis header in codegen_cuda.cpp

c5edfdb

Refactor JIT compilation in CUDA: update function pointers, enhance P…

2d4703f

…TX handling, and implement new pass manager setup

Update header includes and fix LLVM API calls in CPU code generation

d2c87f6

Fix header include for program in codegen_cpu.cpp

ad65ec9

cmake build updates, header fixes; Merging from commits ebdc72b to 9d…

1be07f3

…14023 from johnnynunez/taichi master branch

implementing error function and cuda updates; merging 5449f72 to 649c58d

de14f98

from johnnynunez/taichi master branch; some of the changes from these were captured in the previous commit to rocm/taichi

removing updates for blackwell

c984b3c

removing blackwell updates; restoring window_base.cpp include

f5118a7

additional cuda updates for llvm20; merging from 8ca16de to add2df3 f…

78d9213

…rom johnnynunez/taichi master branch

tmm77 and others added 22 commits August 28, 2025 16:04

additional updates for llvm 20

d20c823

fix build issues with llvm 20 update

f0ca790

updated AMD Instinct GPU jit implementation to llvm 20

26ae12c

updating amd gpu kernel code generation to llvm 20

514446e

fix object file type; setting llvm dir based on environment var

2a6adb0

adding bitcode for gfx940,gfx941,gfx942,gfx950

5516360

adding patch for changes to external spdlog

48cc4f7

Merge pull request #6 from ROCm/mintz/llvm20_update

76c25df

Mintz/llvm20 update

updating dockerfile for llvm 20

ed925e6

Update Dockerfile to fix pipeline issues

a78aaca

dockerfile copy dir

2549e39

Dockerfile reformat

300196b

CI: Fix Dockerfile issues

8dab171

Fix Tester Issues

ed1c61d

removing any existing build cache

29c4129

Fix Version Issues

7b155bb

Merge branch 'amd-integration' into release/1.8.0b2

104dc18

Docs: Taichi component, configs and setup for 25.11 release (#2)

13a0550

Merge pull request #8 from ROCm/release/1.8.0b2

39cc7fe

Syncing latest release branch with amd-integration branch

removing rocm_setup_version

36c0aa5

Update taichi-install.rst

7c446fb

removing pull_request.yml for security concerns

f47d1b8

cursor Bot reviewed Apr 15, 2026

tmm77 changed the title ~~LLVM 20 updates for AMD MI3xx GPUs~~ [amdgpu] LLVM 20 updates for AMD MI3xx GPUs Apr 16, 2026

Delete ci/assets/mitm-ca.crt

440fcc2

This is to address AMD security concerns

cursor Bot reviewed May 4, 2026

Comment thread taichi/runtime/amdgpu/jit_amdgpu.cpp

Rename README.md to README-deprecated.md

b9c05cd

cursor Bot reviewed Jun 3, 2026

Create README.md with deprecation notice

79f75c6

[amdgpu] LLVM 20 updates for AMD MI3xx GPUs #8793
tmm77 wants to merge 55 commits into taichi-dev:master from ROCm:amd-integration

6 participants
