[NVPTX] Add pm_event intrinsics #141278
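@llvm/pr-subscribers-clang @llvm/pr-subscribers-backend-nvptx

Author: Durgadoss R (durga4github)

Changes

This patch adds both

Full diff: https://github.com/llvm/llvm-project/pull/141278.diff

5 Files Affected:
- llvm/docs/NVPTXUsage.rst
- llvm/include/llvm/IR/IntrinsicsNVVM.td
- llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
- llvm/test/CodeGen/NVPTX/pm-event-invalid.ll
- llvm/test/CodeGen/NVPTX/pm-event.ll
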
diff --git a/llvm/docs/NVPTXUsage.rst b/llvm/docs/NVPTXUsage.rst
index 8bb0f2ed17c32..9d108a98210b0 100644
--- a/llvm/docs/NVPTXUsage.rst
+++ b/llvm/docs/NVPTXUsage.rst
@@ -1868,6 +1868,38 @@ If the request failed, the behavior of these intrinsics is undefined.
For more information, refer `PTX ISA <https://docs.nvidia.com/cuda/parallel-thread-execution/?a#parallel-synchronization-and-communication-instructions-clusterlaunchcontrol-query-cancel>`__.
+Perf Monitor Event Intrinsics
+-----------------------------
+
+'``llvm.nvvm.pm.event.[idx|mask]``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+ declare void @llvm.nvvm.pm.event.idx(i32 immarg %idx_val)
+ declare void @llvm.nvvm.pm.event.mask(i16 immarg %mask_val)
+
+Overview:
+"""""""""
+
+The '``llvm.nvvm.pm.event.*``' intrinsics trigger one or more
+performance monitor events.
+
+The ``idx`` variant triggers a single performance monitor event
+indexed by the immediate operand ``%idx_val`` in the range
+[0, 16). When the ``%idx_val`` is not within the range, it may
+raise an error from the verifier.
+
+The ``mask`` variant triggers one or more of the performance
+monitor events. Each bit in the 16-bit immediate operand
+``%mask_val`` controls an event.
+
+For more information on the pmevent instructions, refer to the PTX ISA
+`<https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#miscellaneous-instructions-pmevent>`_.
+
Other Intrinsics
----------------
diff --git a/llvm/include/llvm/IR/IntrinsicsNVVM.td b/llvm/include/llvm/IR/IntrinsicsNVVM.td
index 91e7d188c8533..826b87818fa60 100644
--- a/llvm/include/llvm/IR/IntrinsicsNVVM.td
+++ b/llvm/include/llvm/IR/IntrinsicsNVVM.td
@@ -768,6 +768,18 @@ let TargetPrefix = "nvvm" in {
DefaultAttrsIntrinsic<[], [llvm_i32_ty],
[IntrConvergent, IntrNoMem, IntrHasSideEffects]>;
+ // Performance Monitor Events (pm events) intrinsics
+ // The imm-argument to the _idx variant must be
+ // within the range [0, 16).
+ def int_nvvm_pm_event_idx : NVVMBuiltin,
+ DefaultAttrsIntrinsic<[], [llvm_i32_ty],
+ [IntrConvergent, IntrNoMem, IntrHasSideEffects,
+ ImmArg<ArgIndex<0>>, Range<ArgIndex<0>, 0, 16>]>;
+ def int_nvvm_pm_event_mask : NVVMBuiltin,
+ DefaultAttrsIntrinsic<[], [llvm_i16_ty],
+ [IntrConvergent, IntrNoMem, IntrHasSideEffects,
+ ImmArg<ArgIndex<0>>]>;
+
//
// Min Max
//
diff --git a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
index 8fb5884fa2a20..2e0e52c2a5a1f 100644
--- a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+++ b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
@@ -7172,6 +7172,17 @@ defm INT_SET_MAXNREG_DEC : SET_MAXNREG<"dec", int_nvvm_setmaxnreg_dec_sync_align
} // isConvergent
+let hasSideEffects = 1 in {
+// Performance Monitor events
+def INT_PM_EVENT_IDX : NVPTXInst<(outs), (ins i32imm:$idx),
+ "pmevent $idx;",
+ [(int_nvvm_pm_event_idx timm:$idx)]>;
+def INT_PM_EVENT_MASK : NVPTXInst<(outs), (ins i16imm:$mask),
+ "pmevent.mask $mask;",
+ [(int_nvvm_pm_event_mask timm:$mask)]>,
+ Requires<[hasSM<20>, hasPTX<30>]>;
+} // hasSideEffects
+
//
// WGMMA fence instructions
//
diff --git a/llvm/test/CodeGen/NVPTX/pm-event-invalid.ll b/llvm/test/CodeGen/NVPTX/pm-event-invalid.ll
new file mode 100644
index 0000000000000..11b333422a49f
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/pm-event-invalid.ll
@@ -0,0 +1,10 @@
+; RUN: not llc < %s -mtriple=nvptx64 -mcpu=sm_20 -o /dev/null 2>&1 | FileCheck %s
+
+declare void @llvm.nvvm.pm.event.idx(i32 immarg %idx)
+
+define void @test_invalid_pm_event() {
+ ; CHECK: immarg value 16 out of range [0, 16)
+ call void @llvm.nvvm.pm.event.idx(i32 16)
+
+ ret void
+}
diff --git a/llvm/test/CodeGen/NVPTX/pm-event.ll b/llvm/test/CodeGen/NVPTX/pm-event.ll
new file mode 100644
index 0000000000000..83ed450c77b3d
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/pm-event.ll
@@ -0,0 +1,16 @@
+; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_20 | FileCheck %s
+; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_20 | %ptxas-verify %}
+
+declare void @llvm.nvvm.pm.event.idx(i32 %idx)
+declare void @llvm.nvvm.pm.event.mask(i16 %mask)
+
+; CHECK-LABEL: test_pm_event
+define void @test_pm_event() {
+ ; CHECK: pmevent 15;
+ call void @llvm.nvvm.pm.event.idx(i32 15)
+
+ ; CHECK: pmevent.mask 255;
+ call void @llvm.nvvm.pm.event.mask(i16 u0xff)
+
+ ret void
+}
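
As a reading aid for the diff above, here is a minimal sketch of how the two intrinsics might be called side by side. It is not part of the patch: the function name `@example_pm_events` is hypothetical, and the comment on the mask call assumes that bit *i* of the mask corresponds to event index *i*; the documentation in the patch states only that each bit controls an event.

```llvm
; Sketch only, not taken from the patch. @example_pm_events is a
; hypothetical name chosen for illustration.
declare void @llvm.nvvm.pm.event.idx(i32 immarg)
declare void @llvm.nvvm.pm.event.mask(i16 immarg)

define void @example_pm_events() {
  ; Lowest and highest indices inside the documented [0, 16) range.
  call void @llvm.nvvm.pm.event.idx(i32 0)
  call void @llvm.nvvm.pm.event.idx(i32 15)

  ; Mask with bits 0 and 3 set (0b1001 = 9). Under the assumption
  ; stated above, this would trigger events 0 and 3 together.
  call void @llvm.nvvm.pm.event.mask(i16 9)

  ret void
}
```

By analogy with the CHECK lines in `pm-event.ll`, compiling this with `llc -mtriple=nvptx64 -mcpu=sm_20` would presumably emit `pmevent 0;`, `pmevent 15;` and `pmevent.mask 9;`, although this sketch has not been run through `llc`. Passing `i32 16` to the `idx` variant should instead be rejected, as exercised by `pm-event-invalid.ll`.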
Force-pushed from fdbeddd to 0fc21a1.
AlexMaclean
left a comment
LGTM
Artem-B
left a comment
Builtin signature needs a fix, but LGTM otherwise.
This patch adds the pm_event.mask intrinsic and its clang-builtin.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
Force-pushed from 0fc21a1 to ba3a948.