[amdgpu] LLVM 20 updates for AMD MI3xx GPUs #8793
Parameterize microbenchmarks and vulkan sdk update
fix: Patch to avoid the need to fetch source to build Taichi wheel
Taichi Dockerfile
Co-authored-by: Bhavesh Lad <Bhavesh.Lad@amd.com> Co-authored-by: Tiffany Mintz <tiffany.mintz@amd.com>
Merge latest upstream
Merge master updates
Merge latest Updates
…TX handling, and implement new pass manager setup
Mintz/llvm20 update
Syncing latest release branch with amd-integration branch
if ((u.system, u.machine) not in (("Linux", "arm64"), ("Linux", "aarch64"))) and not (cmake_args.get_effective("TI_WITH_AMDGPU")):
    os.environ["LLVM_DIR"] = "/usr/lib/llvm-20/cmake"
    os.environ["CUDA_HOME"] = "/usr/local/cuda"
    os.environ["CPATH"] = "/usr/local/cuda/include"
LLVM_DIR hardcoded to Linux path for all platforms
Medium Severity
The final LLVM_DIR assignment unconditionally sets it to /usr/lib/llvm-20/cmake for all non-ARM-Linux, non-AMDGPU platforms, including macOS and Windows. The original code used str(out) which pointed to the platform-specific downloaded LLVM path. This overwrites the correct out-based paths for Darwin and Windows, breaking LLVM discovery on those platforms. Similarly, CUDA_HOME and CPATH are set to Linux-specific paths.
Reviewed by Cursor Bugbot for commit f47d1b8.
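One way to address this finding is to keep the system LLVM path Linux-only and fall back to the downloaded toolchain elsewhere. The following is a minimal sketch, not the PR's code: `pick_llvm_dir` and `apply_llvm_env` are hypothetical helpers, `downloaded` stands in for the `out` path mentioned in the comment, and the ARM/AMDGPU branches of the real function are left out.

```python
from pathlib import Path
from typing import MutableMapping, Optional

# Path taken from the diff above; only meaningful on Linux hosts.
LINUX_SYSTEM_LLVM = "/usr/lib/llvm-20/cmake"


def pick_llvm_dir(system: str, downloaded: Optional[Path]) -> str:
    """Return the LLVM_DIR value for a platform (hypothetical helper)."""
    if system == "Linux":
        return LINUX_SYSTEM_LLVM
    if downloaded is None:
        raise RuntimeError(f"No LLVM toolchain available for {system}")
    # Darwin/Windows keep the platform-specific downloaded toolchain.
    return str(downloaded)


def apply_llvm_env(system: str, downloaded: Optional[Path],
                   env: MutableMapping[str, str]) -> None:
    """Write LLVM/CUDA variables into env without clobbering other platforms."""
    env["LLVM_DIR"] = pick_llvm_dir(system, downloaded)
    if system == "Linux":
        # Assumed Linux-only defaults; existing user settings are respected.
        env.setdefault("CUDA_HOME", "/usr/local/cuda")
        env.setdefault("CPATH", "/usr/local/cuda/include")
```

In a build script this could be called as `apply_llvm_env(platform.system(), out, os.environ)`, so the Linux paths never reach macOS or Windows builds.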
f.read())
if not match:
    raise ValueError("VERSION not found!")
version_number = match[1]
Docs conf.py searches for nonexistent CMake function
Medium Severity
The docs/conf.py searches for rocm_setup_version(VERSION ...) in CMakeLists.txt, but the project's CMakeLists.txt does not contain this function call. This causes a ValueError("VERSION not found!") to be raised every time the documentation is built, completely breaking the docs build pipeline.
Reviewed by Cursor Bugbot for commit f47d1b8.
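A docs build can tolerate a missing `rocm_setup_version(VERSION ...)` call if the lookup accepts a fallback. This is a hedged sketch only: the regular expression is an assumption about what `docs/conf.py` searches for, and `find_version` and its `fallback` parameter are hypothetical names that do not appear in the PR.

```python
import re
from typing import Optional

# Assumed pattern; the actual expression in docs/conf.py is not shown in the diff.
ROCM_VERSION_RE = re.compile(r"rocm_setup_version\(\s*VERSION\s+([0-9.]+)")


def find_version(cmake_text: str, fallback: Optional[str] = None) -> str:
    """Extract the version from CMake text, or use fallback when it is absent."""
    match = ROCM_VERSION_RE.search(cmake_text)
    if match:
        return match[1]
    if fallback is not None:
        return fallback
    # Same error as the original snippet when no fallback is supplied.
    raise ValueError("VERSION not found!")
```

Passing a fallback keeps the documentation build alive when the CMake file has no matching call, while the strict behavior remains available by omitting it.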
This is to address AMD security concerns
Cursor Bugbot has reviewed your changes and found 3 potential issues.
There are 5 total unresolved issues (including 2 from previous reviews).
Reviewed by Cursor Bugbot for commit b9c05cd.
if (TI_WITH_AMDGPU)
-  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DTI_WITH_AMDGPU")
+  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -DTI_WITH_AMDGPU")
Typo drops AMDGPU runtime sources
High Severity
With TI_WITH_AMDGPU enabled, taichi/runtime/amdgpu/runtime.cpp is appended to TAIHI_CORE_SOURCE instead of TAICHI_CORE_SOURCE. The core object library is built only from TAICHI_CORE_SOURCE, so that runtime translation unit is never linked into taichi_core.
Reviewed by Cursor Bugbot for commit b9c05cd.
parent_ty = ptr_ty->getPointerElementType();
if (auto ptr_ty = llvm::dyn_cast<llvm::PointerType>(parent_ty)) {
  TI_NOT_IMPLEMENTED;
}
Root SNode lookup aborts
High Severity
For root SNodeLookupStmt, when the parent LLVM value comes from a BitCastInst to a pointer type, codegen hits TI_NOT_IMPLEMENTED instead of emitting a GEP. With opaque pointers in LLVM 20, that path is common and kernel compilation fails.
Reviewed by Cursor Bugbot for commit b9c05cd.
url = "https://github.com/GaleSeLee/assets/releases/download/v0.0.5/taichi-llvm-15.0.0-linux.zip"
# We should use LLVM toolchains shipped with OS.
os.environ["LLVM_DIR"] = os.environ["LLVM_PATH"]+"/lib/cmake"
os.environ["CPATH"] = os.environ["ROCM_PATH"]+"/include"
AMDGPU LLVM setup needs env
Medium Severity
On Linux x86_64 with TI_WITH_AMDGPU, setup_llvm sets LLVM_DIR and CPATH from LLVM_PATH and ROCM_PATH without checking they exist, so a missing variable raises KeyError and aborts the build.
Reviewed by Cursor Bugbot for commit b9c05cd.
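The `KeyError` described here can be replaced with an explicit check that names the missing variables. A minimal sketch, assuming a hypothetical helper `amdgpu_toolchain_env` that takes the environment as a mapping; the derived paths mirror the two assignments in the diff above.

```python
from typing import Dict, Mapping


def amdgpu_toolchain_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Derive LLVM_DIR and CPATH for an AMDGPU build (hypothetical helper)."""
    required = ("LLVM_PATH", "ROCM_PATH")
    # Treat unset and empty variables the same way.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(
            "TI_WITH_AMDGPU requires these environment variables: "
            + ", ".join(missing)
        )
    return {
        "LLVM_DIR": f"{env['LLVM_PATH']}/lib/cmake",
        "CPATH": f"{env['ROCM_PATH']}/include",
    }
```

A caller could then run `os.environ.update(amdgpu_toolchain_env(os.environ))` and get a readable message instead of a bare `KeyError` when the ROCm environment is incomplete.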


Issue: #
Brief Summary
These code changes update LLVM to version 20 for AMD GPU code generation to enable Taichi on MI300X, MI325X, and MI355X.
Note
High Risk
Touches core JIT/codegen for CPU, CUDA, AMDGPU, and DX12 with a major LLVM API migration; incorrect AMDGPU or pass-manager behavior would break kernel compilation and GPU execution.
Overview
This PR modernizes the Taichi/ROCm stack around LLVM 20 so AMDGPU kernels can be built and JIT-compiled for Instinct MI3xx targets, with supporting packaging and docs.
Compiler & runtime: Clang discovery and tested ceiling move to LLVM/Clang 20 (CMakeLists.txt, CI compiler.py). LLVM setup for AMDGPU builds prefers system/ROCm toolchains (LLVM_DIR, ROCM_PATH, /usr/lib/llvm-20) instead of only downloading prebuilt LLVM 15 zips. Across CPU, CUDA, AMDGPU, and DX12 paths, optimization moves from the legacy pass manager to LLVM's New Pass Manager (PassBuilder), with version-guarded APIs for codegen opt levels and assembly emission. Opaque pointers and related IR changes touch shared LLVM codegen, struct layout, AMDGPU basic-block insertion, CUDA global loads (replacing removed nvvm.ldg intrinsics with invariant loads), and AMDGPU kernel pointer/addrspace handling. AMDGPU JIT (jit_amdgpu.cpp) is updated for the new pass pipeline and object/HSACO emission.
Language & build: erf/erfc are wired through IR, Python ops, and LLVM codegen (including CUDA). Dockerfile.rocm adds a multi-stage image that installs LLVM 20, applies spdlog_fmt.patch, and builds/installs wheels. setup.py strips non-numeric suffixes from patch versions for AMD packaging.
Tooling & docs: Root README.md is replaced with a deprecation notice (legacy content moved to README-deprecated.md). New ROCm Sphinx docs and readthedocs.yaml describe install/examples. Microbenchmarks default to amdgpu with --arch/--benchmark_plan CLI; Vulkan setup in CI entry is limited to Linux. The PR title presubmit workflow and ci/assets/mitm-ca.crt are removed.
Misc: Debug symbols for AMDGPU (-g in TaichiCore.cmake), ImGui Vulkan init API updates, and a large .wordlist.txt for documentation spelling.
Reviewed by Cursor Bugbot for commit 79f75c6.
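The overview mentions that setup.py strips non-numeric suffixes from patch versions. The PR's actual logic is not shown on this page, so the following is only an illustrative sketch of the idea; `numeric_patch` is a hypothetical function name and the example version strings are made up.

```python
import re


def numeric_patch(version: str) -> str:
    """Keep only the leading digits of the patch component of MAJOR.MINOR.PATCH."""
    major, minor, patch = version.split(".", 2)
    digits = re.match(r"\d+", patch)
    if digits is None:
        raise ValueError(f"Patch component has no leading digits: {patch!r}")
    return f"{major}.{minor}.{digits.group(0)}"
```

Under these assumptions, a version such as `1.8.0rc1` or `1.8.0-amd` would normalize to `1.8.0`, while a patch component with no leading digits is rejected rather than silently rewritten.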