[NVPTX] Add SM versions for 101 and 120 #124155
This patch adds SM and PTX versions for sm_101 and sm_120 and their arch-accelerated variants (sm_101a and sm_120a). All of these are supported in CUDA 12.8. sm_120 and sm_120a require PTX 8.7; sm_101 and sm_101a require PTX 8.6.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
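For readers skimming the diff, here is a rough standalone sketch of the version encoding the new records rely on. It is not the backend's code: `fullSmVersion`, `smVersion`, `hasSM`, and `checkSmEncoding` are hypothetical names, and the divide-by-ten relationship between the full and plain SM version is an assumption that fits the numbers in the patch but is not shown in it.

```cpp
#include <cassert>

// Encoding used by the FeatureSM records in this patch: plain targets get
// sm * 10 (the `!mul(sm, 10)` in the foreach), and arch-accelerated "a"
// targets get sm * 10 + 1 (e.g. SM101a -> 1011, SM120a -> 1201).
constexpr unsigned fullSmVersion(unsigned sm, bool archAccelerated) {
  return sm * 10 + (archAccelerated ? 1u : 0u);
}

// Assumption (not shown in the diff): the plain SM version is the full
// version with the last digit dropped.
constexpr unsigned smVersion(unsigned fullVersion) { return fullVersion / 10; }

// Same comparison as the `hasSM<version>` predicate class: a lower bound.
constexpr bool hasSM(unsigned fullVersion, unsigned version) {
  return smVersion(fullVersion) >= version;
}

// Same comparison as the explicit `hasSM101a` record: an exact match.
constexpr bool hasSM101a(unsigned fullVersion) { return fullVersion == 1011; }

// Hypothetical check routine exercising the values added by the patch.
inline void checkSmEncoding() {
  assert(fullSmVersion(101, false) == 1010);
  assert(fullSmVersion(101, true) == 1011);
  assert(fullSmVersion(120, false) == 1200);
  assert(fullSmVersion(120, true) == 1201);
  // sm_120a passes a lower-bound check against 101 ...
  assert(hasSM(fullSmVersion(120, true), 101));
  // ... but does not match the exact sm_101a predicate.
  assert(!hasSM101a(fullSmVersion(120, true)));
}
```

The point of the sketch is the difference between the two predicate styles: `hasSM<N>` is a lower bound, while the arch-accelerated predicates match exactly one target.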
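@llvm/pr-subscribers-backend-nvptx
Author: Durgadoss R (durga4github)
Changes
This patch adds SM and PTX versions for sm_101 and sm_120 and their arch-accelerated variants. All of these are supported in CUDA 12.8. sm_120 and sm_120a require PTX 8.7; sm_101 and sm_101a require PTX 8.6.
Full diff: https://github.com/llvm/llvm-project/pull/124155.diff
3 Files Affected:
llvm/lib/Target/NVPTX/NVPTX.td
llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
llvm/test/CodeGen/NVPTX/sm-version.ll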
diff --git a/llvm/lib/Target/NVPTX/NVPTX.td b/llvm/lib/Target/NVPTX/NVPTX.td
index 3ca8b4d294079c..5467ae011a2081 100644
--- a/llvm/lib/Target/NVPTX/NVPTX.td
+++ b/llvm/lib/Target/NVPTX/NVPTX.td
@@ -35,15 +35,18 @@ class FeaturePTX<int version>:
"Use PTX version " # version>;
foreach sm = [20, 21, 30, 32, 35, 37, 50, 52, 53,
- 60, 61, 62, 70, 72, 75, 80, 86, 87, 89, 90, 100] in
+ 60, 61, 62, 70, 72, 75, 80, 86, 87,
+ 89, 90, 100, 101, 120] in
def SM#sm: FeatureSM<""#sm, !mul(sm, 10)>;
def SM90a: FeatureSM<"90a", 901>;
def SM100a: FeatureSM<"100a", 1001>;
+def SM101a: FeatureSM<"101a", 1011>;
+def SM120a: FeatureSM<"120a", 1201>;
foreach version = [32, 40, 41, 42, 43, 50, 60, 61, 62, 63, 64, 65,
70, 71, 72, 73, 74, 75, 76, 77, 78,
- 80, 81, 82, 83, 84, 85, 86] in
+ 80, 81, 82, 83, 84, 85, 86, 87] in
def PTX#version: FeaturePTX<version>;
//===----------------------------------------------------------------------===//
@@ -76,6 +79,10 @@ def : Proc<"sm_90", [SM90, PTX78]>;
def : Proc<"sm_90a", [SM90a, PTX80]>;
def : Proc<"sm_100", [SM100, PTX86]>;
def : Proc<"sm_100a", [SM100a, PTX86]>;
+def : Proc<"sm_101", [SM101, PTX86]>;
+def : Proc<"sm_101a", [SM101a, PTX86]>;
+def : Proc<"sm_120", [SM120, PTX87]>;
+def : Proc<"sm_120a", [SM120a, PTX87]>;
def NVPTXInstrInfo : InstrInfo {
}
diff --git a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
index a076fde8ee7676..f17799c1300153 100644
--- a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
+++ b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
@@ -172,6 +172,9 @@ class hasSM<int version>: Predicate<"Subtarget->getSmVersion() >= " # version>;
// Explicit records for arch-accelerated SM versions
def hasSM90a : Predicate<"Subtarget->getFullSmVersion() == 901">;
+def hasSM100a : Predicate<"Subtarget->getFullSmVersion() == 1001">;
+def hasSM101a : Predicate<"Subtarget->getFullSmVersion() == 1011">;
+def hasSM120a : Predicate<"Subtarget->getFullSmVersion() == 1201">;
// non-sync shfl instructions are not available on sm_70+ in PTX6.4+
def hasSHFL : Predicate<"!(Subtarget->getSmVersion() >= 70"
diff --git a/llvm/test/CodeGen/NVPTX/sm-version.ll b/llvm/test/CodeGen/NVPTX/sm-version.ll
index 0e37d6e4b0d87f..ce9a1b1b161dce 100644
--- a/llvm/test/CodeGen/NVPTX/sm-version.ll
+++ b/llvm/test/CodeGen/NVPTX/sm-version.ll
@@ -16,6 +16,12 @@
; RUN: llc < %s -mtriple=nvptx -mcpu=sm_86 | FileCheck %s --check-prefix=SM86
; RUN: llc < %s -mtriple=nvptx -mcpu=sm_90 | FileCheck %s --check-prefix=SM90
; RUN: llc < %s -mtriple=nvptx -mcpu=sm_90a | FileCheck %s --check-prefix=SM90a
+; RUN: llc < %s -mtriple=nvptx -mcpu=sm_100 | FileCheck %s --check-prefix=SM100
+; RUN: llc < %s -mtriple=nvptx -mcpu=sm_100a | FileCheck %s --check-prefix=SM100a
+; RUN: llc < %s -mtriple=nvptx -mcpu=sm_101 | FileCheck %s --check-prefix=SM101
+; RUN: llc < %s -mtriple=nvptx -mcpu=sm_101a | FileCheck %s --check-prefix=SM101a
+; RUN: llc < %s -mtriple=nvptx -mcpu=sm_120 | FileCheck %s --check-prefix=SM120
+; RUN: llc < %s -mtriple=nvptx -mcpu=sm_120a | FileCheck %s --check-prefix=SM120a
; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_20 | FileCheck %s --check-prefix=SM20
; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_21 | FileCheck %s --check-prefix=SM21
@@ -35,6 +41,12 @@
; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_86 | FileCheck %s --check-prefix=SM86
; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_90 | FileCheck %s --check-prefix=SM90
; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_90a | FileCheck %s --check-prefix=SM90a
+; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_100 | FileCheck %s --check-prefix=SM100
+; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_100a | FileCheck %s --check-prefix=SM100a
+; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_101 | FileCheck %s --check-prefix=SM101
+; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_101a | FileCheck %s --check-prefix=SM101a
+; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_120 | FileCheck %s --check-prefix=SM120
+; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_120a | FileCheck %s --check-prefix=SM120a
; SM20: .version 3.2
; SM21: .version 3.2
@@ -54,6 +66,12 @@
; SM86: .version 7.1
; SM90: .version 7.8
; SM90a: .version 8.0
+; SM100: .version 8.6
+; SM100a: .version 8.6
+; SM101: .version 8.6
+; SM101a: .version 8.6
+; SM120: .version 8.7
+; SM120a: .version 8.7
; SM20: .target sm_20
; SM21: .target sm_21
@@ -73,3 +91,9 @@
; SM86: .target sm_86
; SM90: .target sm_90
; SM90a: .target sm_90a
+; SM100: .target sm_100
+; SM100a: .target sm_100a
+; SM101: .target sm_101
+; SM101a: .target sm_101a
+; SM120: .target sm_120
+; SM120a: .target sm_120a
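As a reading aid for the test changes, the sketch below restates the processor-to-PTX pairing from the new `Proc` records and builds the two header lines that the added FileCheck patterns look for. It is an illustration only: `defaultPtxVersion` and `expectedHeader` are hypothetical helpers, and real `llc` output contains more than these two lines.

```cpp
#include <cassert>
#include <map>
#include <string>

// Default PTX version per processor, as listed in the Proc records touched
// or tested by this patch (86 means PTX 8.6, 87 means PTX 8.7).
inline const std::map<std::string, unsigned> &defaultPtxVersion() {
  static const std::map<std::string, unsigned> table = {
      {"sm_100", 86},  {"sm_100a", 86}, {"sm_101", 86},
      {"sm_101a", 86}, {"sm_120", 87},  {"sm_120a", 87}};
  return table;
}

// Hypothetical helper: formats the `.version` and `.target` lines that the
// CHECK prefixes in sm-version.ll match for a given -mcpu value.
// Throws std::out_of_range for processors not in the table above.
inline std::string expectedHeader(const std::string &cpu) {
  const unsigned ptx = defaultPtxVersion().at(cpu);
  return ".version " + std::to_string(ptx / 10) + "." +
         std::to_string(ptx % 10) + "\n.target " + cpu + "\n";
}
```

For example, `expectedHeader("sm_120a")` yields `.version 8.7` followed by `.target sm_120a`, which corresponds to the `SM120a` check lines above.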
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/50/builds/9352
Imported from GitHub PR openxla/xla#21822
Created `ShouldUsePtxExtension` helper for the extension suffix (this will also be used for sm120, etc).
CUDA 12.8 was recently released, which supports PTX 8.7, but that is not supported by the integrated LLVM (support added in llvm/llvm-project#124155), so leaving the association with PTX 8.6 - this doesn't raise warnings during compilation.
Copybara import of the project:
--
267cf74a084c933e532a622da2485befdc47f8ce by Sergey Kozub <skozub@nvidia.com>:
Add support for SM100a architecture (Blackwell)
Merging this change closes #21822
PiperOrigin-RevId: 720806648
…(Blackwell)
Imported from GitHub PR openxla/xla#22029
In addition to SM120a, also add SM101a mentioned in the PTX 8.7 spec (https://docs.nvidia.com/cuda/parallel-thread-execution/#release-notes), which is a slight variation of SM100a.
Bumping the max supported PTX version to 8.7, as the LLVM PR (llvm/llvm-project#124155) adding the support is now integrated to OpenXLA.
Copybara import of the project:
--
be59b7a51721637d880207e7adb69a18c3a92bea by Sergey Kozub <skozub@nvidia.com>:
[XLA:GPU] Add support for SM101a and SM120a architectures (Blackwell)
Merging this change closes #22029
PiperOrigin-RevId: 721088886
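The two downstream commits above describe the same constraint from both sides: the emitted PTX version has to be one that both the CUDA toolkit and the integrated LLVM accept. A minimal sketch of that rule follows, assuming the 86/87 encoding used in NVPTX.td. `selectPtxVersion` is a hypothetical name for illustration and is not XLA's actual helper or API.

```cpp
#include <cassert>

// Hypothetical helper (not XLA's actual code): pick the highest PTX version
// that both the CUDA toolkit and the integrated LLVM support.
// Versions use the NVPTX.td encoding, e.g. 86 == PTX 8.6, 87 == PTX 8.7.
constexpr unsigned selectPtxVersion(unsigned toolkitMax, unsigned llvmMax) {
  return toolkitMax < llvmMax ? toolkitMax : llvmMax;
}

// Hypothetical check routine covering the two situations described above.
inline void checkPtxSelection() {
  // Before this LLVM change was integrated: CUDA 12.8 offers PTX 8.7,
  // LLVM tops out at 8.6, so 8.6 is used.
  assert(selectPtxVersion(87, 86) == 86);
  // After integration: both sides support 8.7.
  assert(selectPtxVersion(87, 87) == 87);
}
```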