Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMDGPU] Clarify amdgpu.cs.chain + init whole wave. NFC #115452

Merged
merged 1 commit into from
Nov 14, 2024

Conversation

rovka
Copy link
Collaborator

@rovka rovka commented Nov 8, 2024

Add some docs clarifying how inactive lanes are handled in the amdgpu_cs_chain calling convention when the llvm.amdgcn.init.whole.wave intrinsic is used.

Add some docs clarifying how inactive lanes are handled in the
amdgpu_cs_chain calling convention when the llvm.amdgcn.init.whole.wave
intrinsic is used.
@llvmbot
Copy link
Member

llvmbot commented Nov 8, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Diana Picus (rovka)

Changes

Add some docs clarifying how inactive lanes are handled in the amdgpu_cs_chain calling convention when the llvm.amdgcn.init.whole.wave intrinsic is used.


Full diff: https://github.com/llvm/llvm-project/pull/115452.diff

1 Files Affected:

  • (modified) llvm/docs/AMDGPUUsage.rst (+4-1)
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 5b83ea428c0bff..b36542ac220e65 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1701,7 +1701,10 @@ The AMDGPU backend supports the following calling conventions:
 
                                      Values in scalar registers as well as v0-v7 are not preserved. Values in
                                      VGPRs starting at v8 are not preserved for the active lanes, but must be
-                                     saved by the callee for inactive lanes when using WWM.
+                                     saved by the callee for inactive lanes when using WWM (a notable exception is
+                                     when the llvm.amdgcn.init.whole.wave intrinsic is used in the function - in this
+                                     case the backend assumes that there are no inactive lanes upon entry; any inactive
+                                     lanes that need to be preserved must be explicitly present in the IR).
 
                                      Wave scratch is "empty" at function boundaries. There is no stack pointer input
                                      or output value, but functions are free to use scratch starting from an initial

@rovka rovka merged commit 2aa6ced into llvm:main Nov 14, 2024
11 checks passed
@rovka rovka deleted the cs-chain-iww-docs branch November 14, 2024 09:10
akshayrdeodhar pushed a commit to akshayrdeodhar/llvm-project that referenced this pull request Nov 18, 2024
Add some docs clarifying how inactive lanes are handled in the
amdgpu_cs_chain calling convention when the llvm.amdgcn.init.whole.wave
intrinsic is used.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants