Commit 231fa3f

Port Local Accessor and Global Offset passes to the new PM (#5987)
This patch ports the Local Accessor and Global Offset passes to the new pass manager. As most of the other passes in both the AMDGPU and NVPTX backends still run under the legacy PM, it provides a legacy struct that wraps the new-PM pass and lets the old interface be used. Fixes: #5310
1 parent 3c1d342 commit 231fa3f
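
The wrapping idea in the commit message can be illustrated without LLVM. The sketch below is a toy model, not the code from this commit: `ToyModule`, `ToyAnalysisManager`, `ToyPreservedAnalyses`, `ToyRenamePass` and `ToyRenameLegacy` are hypothetical stand-ins for the LLVM types and for the ported passes. It assumes the legacy wrapper owns the new-PM pass, calls its `run` with a throwaway analysis manager, and reports a change whenever not all analyses were preserved.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for llvm::Module, llvm::ModuleAnalysisManager and
// llvm::PreservedAnalyses. They only model what the sketch needs.
struct ToyModule {
  std::vector<std::string> Functions;
};
struct ToyAnalysisManager {};
struct ToyPreservedAnalyses {
  bool AllPreserved;
  static ToyPreservedAnalyses all() { return {true}; }
  static ToyPreservedAnalyses none() { return {false}; }
  bool areAllPreserved() const { return AllPreserved; }
};

// New-PM style pass: a plain class exposing run(Module, AnalysisManager).
// The transformation itself (appending a suffix to kernel names) is made up.
class ToyRenamePass {
public:
  ToyPreservedAnalyses run(ToyModule &M, ToyAnalysisManager &) {
    const std::string Prefix = "kernel_";
    const std::string Suffix = "_with_offset";
    bool Changed = false;
    for (std::string &Name : M.Functions) {
      bool IsKernel = Name.compare(0, Prefix.size(), Prefix) == 0;
      bool HasSuffix =
          Name.size() >= Suffix.size() &&
          Name.compare(Name.size() - Suffix.size(), Suffix.size(), Suffix) == 0;
      if (IsKernel && !HasSuffix) {
        Name += Suffix;
        Changed = true;
      }
    }
    return Changed ? ToyPreservedAnalyses::none() : ToyPreservedAnalyses::all();
  }
};

// Legacy-style wrapper: exposes the old runOnModule interface and forwards
// to the new-PM implementation it owns.
class ToyRenameLegacy {
public:
  bool runOnModule(ToyModule &M) {
    ToyAnalysisManager DummyMAM; // throwaway manager, no analyses registered
    ToyPreservedAnalyses PA = Impl.run(M, DummyMAM);
    return !PA.areAllPreserved(); // legacy interface returns "changed?"
  }

private:
  ToyRenamePass Impl;
};
```

With this shape, callers written against the old interface keep calling `runOnModule`, while the transformation logic lives only in the new-PM class.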

7 files changed: 670 additions and 579 deletions

llvm/include/llvm/SYCLLowerIR/GlobalOffset.h

Lines changed: 106 additions & 8 deletions
@@ -5,21 +5,119 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
-//
-// This pass operates on SYCL kernels being compiled to CUDA. It looks for uses
-// of the `llvm.nvvm.implicit.offset` intrinsic and replaces it with a offset
-// parameter which will be threaded through from the kernel entry point.
-//
-//===----------------------------------------------------------------------===//
 
 #ifndef LLVM_SYCL_GLOBALOFFSET_H
 #define LLVM_SYCL_GLOBALOFFSET_H
 
-#include "llvm/Pass.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/SYCLLowerIR/TargetHelpers.h"
 
 namespace llvm {
 
-ModulePass *createGlobalOffsetPass();
+class ModulePass;
+class PassRegistry;
+
+/// This pass operates on SYCL kernels that target AMDGPU or NVVM. It looks for
+/// uses of the `llvm.{amdgcn|nvvm}.implicit.offset` intrinsic and replaces it
+/// with an offset parameter which will be threaded through from the kernel
+/// entry point.
+class GlobalOffsetPass : public PassInfoMixin<GlobalOffsetPass> {
+private:
+  using KernelPayload = TargetHelpers::KernelPayload;
+  using ArchType = TargetHelpers::ArchType;
+
+public:
+  explicit GlobalOffsetPass() {}
+
+  PreservedAnalyses run(Module &M, ModuleAnalysisManager &);
+  static StringRef getPassName() { return "Add implicit SYCL global offset"; }
+
+private:
+  /// After the execution of this function, the module to which the kernel
+  /// `Func` belongs contains both the original function and its clone with the
+  /// signature extended with the implicit offset parameter and `_with_offset`
+  /// appended to the name.
+  /// An alloca of 3 zeros (corresponding to offsets in x, y and z) is added to
+  /// the original kernel, in order to keep the interface of the kernel's call
+  /// graph unified, regardless of whether the global offset is used.
+  ///
+  /// \param Func Kernel to be processed.
+  void processKernelEntryPoint(Function *Func);
+
+  /// This function adds an implicit parameter to the function containing a
+  /// call instruction to the implicit offset intrinsic or another function
+  /// (which eventually calls the intrinsic). If the call instruction is to
+  /// the implicit offset intrinsic, then the intrinsic is replaced with the
+  /// parameter that was added.
+  ///
+  /// Once the function, say `F`, containing a call to `Callee` has the
+  /// implicit parameter added, callers of `F` are processed by recursively
+  /// calling this function, passing `F` to `CalleeWithImplicitParam`.
+  ///
+  /// Since the cloning of entry points may alter the users of a function, the
+  /// cloning must be done as early as possible, to ensure that no users are
+  /// added to previous callees in the call-tree.
+  ///
+  /// \param Callee is the function (to which this transformation has already
+  /// been applied), or the implicit offset intrinsic.
+  ///
+  /// \param CalleeWithImplicitParam indicates whether `Callee` is the
+  /// implicit intrinsic (when `nullptr`) or another function (not
+  /// `nullptr`) - this is used to know whether calls to it need to have the
+  /// implicit parameter added to it or replaced with the implicit parameter.
+  void addImplicitParameterToCallers(Module &M, Value *Callee,
+                                     Function *CalleeWithImplicitParam);
+
+  /// For a given function `Func`, extend its signature to contain an implicit
+  /// offset argument.
+  ///
+  /// \param Func A function to add offset to.
+  ///
+  /// \param ImplicitArgumentType Architecture dependent type of the implicit
+  /// argument holding the global offset.
+  ///
+  /// \param KeepOriginal If set to true, rather than splicing the old `Func`,
+  /// keep it intact and create a clone of it with `_with_offset` appended to
+  /// the name.
+  ///
+  /// \returns A pair of new function with the offset argument added and a
+  /// pointer to the implicit argument (either a func argument or a bitcast
+  /// turning it to the correct type).
+  std::pair<Function *, Value *>
+  addOffsetArgumentToFunction(Module &M, Function *Func,
+                              Type *ImplicitArgumentType = nullptr,
+                              bool KeepOriginal = false);
+
+  /// Create a mapping of kernel entry points to their metadata nodes. While
+  /// iterating over kernels, make sure that a given kernel entry point has no
+  /// LLVM uses.
+  ///
+  /// \param KernelPayloads A collection of kernel functions present in a
+  /// module `M`.
+  ///
+  /// \returns A map of kernel functions to corresponding metadata nodes.
+  DenseMap<Function *, MDNode *>
+  generateKernelMDNodeMap(Module &M,
+                          SmallVectorImpl<KernelPayload> &KernelPayloads);
+
+private:
+  /// Keep track of which functions have been processed to avoid processing
+  /// twice.
+  llvm::DenseMap<Function *, Value *> ProcessedFunctions;
+  /// Keep a map of all entry point functions with metadata.
+  llvm::DenseMap<Function *, MDNode *> EntryPointMetadata;
+  /// A type of implicit argument added to the kernel signature.
+  llvm::Type *KernelImplicitArgumentType = nullptr;
+  /// A type used for the alloca holding the values of global offsets.
+  llvm::Type *ImplicitOffsetPtrType = nullptr;
+
+  ArchType AT;
+  unsigned TargetAS = 0;
+};
+
+ModulePass *createGlobalOffsetPassLegacy();
+void initializeGlobalOffsetLegacyPass(PassRegistry &);
 
 } // end namespace llvm
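
The recursive propagation described in the `addImplicitParameterToCallers` comment can be sketched on a toy call graph. This is a simplified model under stated assumptions, not the pass itself: `ToyCallGraph`, `markCallers` and `functionsNeedingOffset` are hypothetical names, functions are plain strings, and "adding the implicit parameter" is reduced to recording the function in a set. That set plays the role the header assigns to `ProcessedFunctions`, so a function is handled once even if it is reachable along several paths.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy call graph: each function name maps to the names it calls.
using ToyCallGraph = std::map<std::string, std::vector<std::string>>;

// Marks every direct caller of Callee as needing the implicit offset
// parameter, then recurses so that callers of those callers are marked too.
void markCallers(const ToyCallGraph &CG, const std::string &Callee,
                 std::set<std::string> &Processed) {
  for (const auto &Entry : CG) {
    const std::string &Caller = Entry.first;
    const std::vector<std::string> &Callees = Entry.second;
    if (std::find(Callees.begin(), Callees.end(), Callee) == Callees.end())
      continue; // Caller does not call Callee directly.
    if (!Processed.insert(Caller).second)
      continue; // Already processed: avoid doing the work twice.
    markCallers(CG, Caller, Processed);
  }
}

// Returns all functions that directly or transitively call the intrinsic.
std::set<std::string> functionsNeedingOffset(const ToyCallGraph &CG,
                                             const std::string &Intrinsic) {
  std::set<std::string> Processed;
  markCallers(CG, Intrinsic, Processed);
  return Processed;
}
```

In this model, a function that never reaches the intrinsic keeps its original signature, which matches the comment's point that only functions on a path to the intrinsic gain the parameter.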

llvm/include/llvm/SYCLLowerIR/LocalAccessorToSharedMemory.h

Lines changed: 50 additions & 10 deletions
@@ -5,24 +5,64 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
-//
-// This pass operates on SYCL kernels being compiled to CUDA. It modifies
-// kernel entry points which take pointers to shared memory and modifies them
-// to take offsets into shared memory (represented by a symbol in the shared
-// address space). The SYCL runtime is expected to provide offsets rather than
-// pointers to these functions.
-//
-//===----------------------------------------------------------------------===//
 
 #ifndef LLVM_SYCL_LOCALACCESSORTOSHAREDMEMORY_H
 #define LLVM_SYCL_LOCALACCESSORTOSHAREDMEMORY_H
 
 #include "llvm/IR/Module.h"
-#include "llvm/Pass.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/SYCLLowerIR/TargetHelpers.h"
 
 namespace llvm {
 
-ModulePass *createLocalAccessorToSharedMemoryPass();
+class ModulePass;
+class PassRegistry;
+
+/// This pass operates on SYCL kernels. It modifies kernel entry points which
+/// take pointers to shared memory and alters them to take offsets into shared
+/// memory (represented by a symbol in the shared address space). The SYCL
+/// runtime is expected to provide offsets rather than pointers to these
+/// functions.
+class LocalAccessorToSharedMemoryPass
+    : public PassInfoMixin<LocalAccessorToSharedMemoryPass> {
+private:
+  using KernelPayload = TargetHelpers::KernelPayload;
+  using ArchType = TargetHelpers::ArchType;
+
+public:
+  explicit LocalAccessorToSharedMemoryPass() {}
+
+  PreservedAnalyses run(Module &M, ModuleAnalysisManager &);
+  static StringRef getPassName() {
+    return "SYCL Local Accessor to Shared Memory";
+  }
+
+private:
+  /// This function replaces pointers to shared memory with offsets to a global
+  /// symbol in shared memory.
+  /// It alters the signature of the kernel (pointer vs offset value) as well
+  /// as the access (dereferencing the argument pointer vs GEP to the global
+  /// symbol).
+  ///
+  /// \param F The kernel to be processed.
+  ///
+  /// \returns A new function with global symbol accesses.
+  Function *processKernel(Module &M, Function *F);
+
+  /// Update kernel metadata to reflect the change in the signature.
+  ///
+  /// \param NewToOldKernels A map of original kernels to the modified ones.
+  void postProcessKernels(
+      SmallVectorImpl<std::pair<Function *, KernelPayload>> &NewToOldKernels);
+
+private:
+  /// The values of NVVM's ADDRESS_SPACE_SHARED and AMD's LOCAL_ADDRESS happen
+  /// to be 3.
+  const unsigned SharedASValue = 3;
+};
+
+ModulePass *createLocalAccessorToSharedMemoryPassLegacy();
+void initializeLocalAccessorToSharedMemoryLegacyPass(PassRegistry &);
 
 } // end namespace llvm
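
The signature change described in the `processKernel` comment (pointer argument versus offset into one shared symbol) can be pictured with a host-side toy. This is only an analogy under stated assumptions: `ToySharedMem`, `toyKernelWithPointer`, `toyKernelWithOffset` and `toyReadShared` are hypothetical names, a static byte array stands in for the symbol in the shared address space, and the offset is assumed to be in bytes. No address spaces or GEP instructions are modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the single symbol in the shared address space.
static unsigned char ToySharedMem[256];

// Before the transformation: the kernel receives a pointer to its local
// buffer and dereferences the argument directly.
void toyKernelWithPointer(int *LocalAcc, int Value) { *LocalAcc = Value; }

// After the transformation: the kernel receives an offset and computes the
// address relative to the shared symbol (the toy analogue of a GEP).
void toyKernelWithOffset(std::size_t ByteOffset, int Value) {
  assert(ByteOffset + sizeof(int) <= sizeof(ToySharedMem));
  std::memcpy(ToySharedMem + ByteOffset, &Value, sizeof(int));
}

// Helper for inspecting the toy shared memory at a byte offset.
int toyReadShared(std::size_t ByteOffset) {
  assert(ByteOffset + sizeof(int) <= sizeof(ToySharedMem));
  int Value = 0;
  std::memcpy(&Value, ToySharedMem + ByteOffset, sizeof(int));
  return Value;
}
```

The caller's side changes accordingly: instead of handing the kernel an address, it hands over a position inside the shared region, which is what the pass comment means by the runtime providing offsets rather than pointers.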
