Skip to content

Commit

Permalink
Implementation for github triSYCL issue #126 and enterprise issue #4
Browse files Browse the repository at this point in the history
…and llvm-mirror#7 :

triSYCL issue #126 : triSYCL/triSYCL#126
Improve SYCL-args-flattening pass to work in quite more cases.

enterprise issue #4 : https://gitenterprise.xilinx.com/rkeryell/triSYCL/issues/4
Add missing SPIR calling convention to device functions.

enterprise issue llvm-mirror#7 : https://gitenterprise.xilinx.com/rkeryell/triSYCL/issues/7
Rename function and basic block names to avoid choking xocc with some
C++ lambda mangled names.

Squashed commit of the following:

commit 516bd0f3aba3554eedf6e76a4f338877db664b86
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Mon Feb 5 16:03:35 2018 +0000

    Minor fix for SYCLKernel.cpp

    1. Reorder the include headers to follow the dictionary order.
    2. Simplified the syntax.

commit bd7f02901143915765367df79cdf14558393d450
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Mon Feb 5 11:38:59 2018 +0000

    Minor fix for SYCLKernel.cpp

    Since we change the data structure from vector to SmallPtrSet. Change
    push_back to insert.
    Fix for ->

commit 03342bbf1740dbf36f4028d51f7fb751c0f03510
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Mon Feb 5 11:36:00 2018 +0000

    Minor fix for lib/SYCL/inSPIRation.cpp

    Since there would be a warning that the syntax might be ambiguous if we remove { here, put it back.

commit f1ee3c6240dc48d065433b672ab8dd55314a57c2
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Fri Feb 2 18:13:15 2018 +0000

    Fix for SYCL.h, SYCLArgsFlattening.cpp, and SYCLKernel.cpp

    Change data structure from vector to SmallPtrSet.
    Change function name, argument name and add comments for more clearer
    explanation of the code.
    fix Spurious space and typo

commit db40fc502961349ade4ccbb13541fc6f1d2049a9
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Fri Feb 2 18:11:11 2018 +0000

    Minor fix for inSPIRation.cpp and LoopIdiomRecognize.cpp

    Fix Spurious space and typo

commit d5fb8c8aef18f851e2aac580615a44c9ff59c612
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Thu Jan 11 20:56:09 2018 +0000

    Modify SYCL Argument flattening pass

    1. Modify SYCL Argument flattening pass for better complexity to find out functions having kernel as an ancestor.
    2. Add comment for SYCL Argument flattening pass.
    3. Clean up code for SYCLKernel.cpp

commit cd4672e8def78c729b4fe98d3e00b6ad1177b1f1
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Thu Jan 11 20:46:20 2018 +0000

    Add new functions to support detecting functions that have kernel as an ancestor

commit 27f9afb2f14d22e696bda48c8c1e536d04ec9755
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Thu Jan 11 12:07:58 2018 +0000

    Remove code that force not executing loop idiom pass for some functions in host side

    For host, skip loop idiom pass here, and execute it after SYCL
    arguments flattening pass.

commit ff75f05bce8963a580c660847d2e77aa1a83f0fb
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Tue Jan 9 12:04:55 2018 +0000

    Modify function to test if function is called by kernel/ Minor fix for files

commit 43f10dfbc66a0229e69923d5c377cea6ddc81889
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Tue Jan 9 11:43:32 2018 +0000

    Move code in SYCLCleanUpFuncName.cpp inside inSPIRation.cpp

commit 06763f74f0bf91959b19d0f315231dd44c0b5057
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Mon Jan 8 21:09:02 2018 +0000

    Fix SYCLArgsFlattening.cpp to deal with nested call

commit ca9dcbfd2348d0cd800b4dfbf37dced14cd35f6b
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Mon Jan 8 21:04:37 2018 +0000

    Modify SYCL pass name and add new function in SYCLKernel.cpp

    Modify pass SYCLModifySPIRFuncName.cpp to SYCLCleanUpFuncName.cpp
    Change SYCL function called by kernel names to have sycl_func_ prefix.
    Add new function detecting call site is kernel function in
    SYCLKernel.cpp
    Other minor fixes for comments or indentation.

commit 02e3781dfee75dac43ec93414c6452b94fa5e595
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Mon Jan 8 15:52:26 2018 +0000

    Minor fix for LoopIdiomRecognize.cpp

commit 2f1e1f5c0a2d8518103200c560eca4f3d9d9e109
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Mon Jan 8 15:50:55 2018 +0000

    Simplified inSPIRation.cpp

commit 727faab31b26c001f01c0b06074fb88a752c3ede
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Mon Jan 8 15:49:08 2018 +0000

    Remove copy-paste files. Keep them under SYCL directory only.

commit 2ed9262512163c389b3e70bc4eb8697b29ac1a86
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Tue Dec 19 14:43:31 2017 +0000

    Rename basic block name to avoid $ produced

    Solution for issue https://gitenterprise.xilinx.com/rkeryell/triSYCL/issues/7

commit 508c4c3edc1344e5a16633061733dcac808f5d91
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Tue Dec 5 19:44:59 2017 +0000

    Modify name and calling convention for SPIR functions

    1. SYCLModifySPIRFuncName.cpp pass is added for modifying function
    names for functions called in SYCL kernel.

    2. In inSPIRation.cpp, add code to modify functions called in SYCL
    kernel to have SPIR_FUNC calling convention.

commit 77c2d772c0bc60b8b855d23183fa30dba8fc5dab
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Tue Dec 5 14:25:24 2017 +0000

    Modify SYCLArgsFlattening.cpp

    This pass is modified for solving problem in example
    yu810226/triSYCL@70fb460#diff-bbbcbcbc399a038ddcd05dbbdb685226

    While the kernel is big enough, device compiler generated a function
    inside instantiate_kernel function in pre_kernel.ll file leading to
    argument flattening pass not working. This modification is aim to
    solve this.

commit 63f6a379b2632c5ae5d4393d44ddf31c1d734512
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Thu Nov 30 15:03:43 2017 +0000

    Modify LoopIdiomRecognize.cpp to skip loop idiom optimization for kernel and functions called in the kernel

    This branch is aim to solve problem trigger in this example:
    https://gitenterprise.xilinx.com/linyay/triSYCL/commit/2a0dbdb485293c32d1ce380a68d35ea897269b06#diff-4beb17f5a3dbac38d7cb72c001c1b27d

    Since argument promotion can only be done with GEP instructions that
    only used by loads. And if there are llvm.memcpy instruction, there
    will always a bitcast instruction before to cast the pointer, so force
    kernel and functions called in the kernel in per_kernel_caller.ll to
    skip loop idiom opt pass.

commit 6c9df37bd74b97f528866a22599d98cb837db140
Author: Lin Ya Yu <yu810226@gmail.com>
Date:   Mon Nov 13 11:50:32 2017 +0000

    Add EnableLoopIdiom to PassManagerBuilder to control loop-idiom pass execution in PassManagerBuilder
  • Loading branch information
yu810226 authored and keryell committed Feb 6, 2018
1 parent 5422f3a commit 54fd058
Show file tree
Hide file tree
Showing 6 changed files with 192 additions and 18 deletions.
16 changes: 16 additions & 0 deletions include/llvm/SYCL.h
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,9 @@
#include <cstddef>
#include <string>

#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/Analysis/CallGraph.h"
#include "llvm/Analysis/CallGraphSCCPass.h"
#include "llvm/IR/Function.h"

namespace llvm {
Expand All @@ -24,6 +27,19 @@ namespace sycl {
/// Test if a function is a SYCL kernel
bool isKernel(const Function &F);

/// Test if functions having the kernel as an ancestor
bool isTransitivelyCalledFromKernel(Function &F,
SmallPtrSet<Function *, 32> &FunctionsCalledByKernel);

/// Add the functions that are transitively called from the kernel in the set
void recordFunctionsCalledByKernel(CallGraphSCC &SCC, CallGraph &CG,
SmallPtrSet<Function *, 32> &FunctionsCalledByKernel);

/// Update the FunctionsCalledByKernel set when new CallGraphNode created in
/// CallGraph
void updateFunctionsCalledByKernel (CallGraphNode &NewNode,
SmallPtrSet<Function *, 32> &FunctionsCalledByKernel);

/// Register a kernel with its full name and returns its ID
///
/// If the kernel is already registered, do not register it again.
Expand Down
1 change: 1 addition & 0 deletions include/llvm/Transforms/IPO/PassManagerBuilder.h
Original file line number Diff line number Diff line change
Expand Up @@ -157,6 +157,7 @@ class PassManagerBuilder {
bool PrepareForThinLTO;
bool PerformThinLTO;
bool DivergentTarget;
bool EnableLoopIdiom;

/// Enable profile instrumentation pass.
bool EnablePGOInstrGen;
Expand Down
90 changes: 78 additions & 12 deletions lib/SYCL/SYCLArgsFlattening.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@

using namespace llvm;

#define DEBUG_TYPE "SYCL"
#define DEBUG_TYPE "SYCL-args-flattening"

STATISTIC(NumArgumentsPromoted, "Number of pointer arguments promoted");
STATISTIC(NumAggregatesPromoted, "Number of aggregate arguments promoted");
Expand All @@ -106,6 +106,9 @@ STATISTIC(NumArgumentsDead, "Number of dead pointer args eliminated");
/// A vector used to hold the indices of a single GEP instruction
using IndicesVector = std::vector<uint64_t>;

/// \todo Describe & initialize somewhere
SmallPtrSet<Function*, 32> FunctionsCalledByKernel;

/// DoPromotion - This method actually performs the promotion of the specified
/// arguments, and returns the new function. At this point, we know that it's
/// safe to do so.
Expand Down Expand Up @@ -572,9 +575,11 @@ static void markIndicesSafe(const IndicesVector &ToMark,
/// elements of the aggregate in order to avoid exploding the number of
/// arguments passed in.
static bool isSafeToPromoteArgument(Argument *Arg, bool isByValOrInAlloca,
AAResults &AAR, unsigned MaxElements) {
AAResults &AAR, unsigned MaxElements,
CallGraph &CG) {
using GEPIndicesSet = std::set<IndicesVector>;

Function *F = Arg->getParent();
// Quick exit for unused arguments
if (Arg->use_empty())
return true;
Expand Down Expand Up @@ -607,6 +612,23 @@ static bool isSafeToPromoteArgument(Argument *Arg, bool isByValOrInAlloca,
if (isByValOrInAlloca || allCallersPassInValidPointerForArgument(Arg))
SafeToUnconditionallyLoad.insert(IndicesVector(1, 0));

// isRelatedToKernel is for marking functions called in kernel.
bool isRelatedToKernel = false;

// If the function has ancestor kernel, force to do argument promotion
if (sycl::isTransitivelyCalledFromKernel(*F, FunctionsCalledByKernel)) {
DEBUG(dbgs() << "SYCL: " << F->getName() << "has ancestor kernel.\n");
isRelatedToKernel = true;
}

// If the function is called in kernel, force to make any load with first
// index 0 is valid.
if (isRelatedToKernel) {
SafeToUnconditionallyLoad.insert(IndicesVector(1, 0));
DEBUG(dbgs() << "SYCL: " << F->getName()
<< " force to make any load with first index 0 is valid.\n");
}

// First, iterate the entry block and mark loads of (geps of) arguments as
// safe.
BasicBlock &EntryBlock = Arg->getParent()->front();
Expand All @@ -624,10 +646,15 @@ static bool isSafeToPromoteArgument(Argument *Arg, bool isByValOrInAlloca,
II != IE; ++II)
if (ConstantInt *CI = dyn_cast<ConstantInt>(*II))
Indices.push_back(CI->getSExtValue());
else
else {
// We found a non-constant GEP index for this argument? Bail out
// right away, can't promote this argument at all.
return false;
DEBUG(dbgs() << "SYCL: " << Arg->getName()
<< " in " << F->getName()
<< " used in non-constant GEP index.\n");
return false;
}

// Indices checked out, mark them as safe
markIndicesSafe(Indices, SafeToUnconditionallyLoad);
Expand Down Expand Up @@ -661,17 +688,24 @@ static bool isSafeToPromoteArgument(Argument *Arg, bool isByValOrInAlloca,
// TODO: This runs the above loop over and over again for dead GEPs
// Couldn't we just do increment the UI iterator earlier and erase the
// use?
DEBUG(dbgs() << "SYCL: Dead GEP" << *GEP << "\n");
return isSafeToPromoteArgument(Arg, isByValOrInAlloca, AAR,
MaxElements);
MaxElements, CG);
}

// Ensure that all of the indices are constants.
for (User::op_iterator i = GEP->idx_begin(), e = GEP->idx_end(); i != e;
++i)
if (ConstantInt *C = dyn_cast<ConstantInt>(*i))
++i) {
DEBUG(dbgs() << "SYCL: " << Arg->getName() << " used in GEP: "
<< *GEP << "\n");
if (ConstantInt *C = dyn_cast<ConstantInt>(*i)) {
DEBUG(dbgs() << C->getSExtValue() << " constant extend value.\n");
Operands.push_back(C->getSExtValue());
else
} else {
DEBUG(dbgs() << "Not a constant operand GEP.\n");
return false; // Not a constant operand GEP!
}
}

// Ensure that the only users of the GEP are load instructions.
for (User *GEPU : GEP->users())
Expand All @@ -682,16 +716,24 @@ static bool isSafeToPromoteArgument(Argument *Arg, bool isByValOrInAlloca,
Loads.push_back(LI);
} else {
// Other uses than load?
DEBUG(dbgs() << "SYCL: " << Arg->getName() << " used in GEP: "
<< *GEP << " in " << F->getName() << "\n"
<< "User: " << *GEPU << "\n");
return false;
}
} else {
DEBUG(dbgs() << "SYCL: " << Arg->getName() << " used in " << F->getName()
<< " is not load or GEP.\n" << "User: " << *UR << "\n");
return false; // Not a load or a GEP.
}

// Now, see if it is safe to promote this load / loads of this GEP. Loading
// is safe if Operands, or a prefix of Operands, is marked as safe.
if (!prefixIn(Operands, SafeToUnconditionallyLoad))
if (!prefixIn(Operands, SafeToUnconditionallyLoad)) {
DEBUG(dbgs() << "SYCL: " << Arg->getName() << " used in "
<< F->getName() << " is not PrefixIn.\n");
return false;
}

// See if we are already promoting a load with these indices. If not, check
// to make sure that we aren't promoting too many elements. If so, nothing
Expand Down Expand Up @@ -719,6 +761,11 @@ static bool isSafeToPromoteArgument(Argument *Arg, bool isByValOrInAlloca,
// check to see if the pointer is guaranteed to not be modified from entry of
// the function to each of the load instructions.

// However, if the function has kernel as an ancestor, the argument is
// guaranteed to not be modified from the start of the block to the load
// instruction itself.
// isRelatedToKernel is added here to determine if we are in this situation.

// Because there could be several/many load instructions, remember which
// blocks we know to be transparent to the load.
df_iterator_default_set<BasicBlock *, 16> TranspBlocks;
Expand All @@ -729,16 +776,24 @@ static bool isSafeToPromoteArgument(Argument *Arg, bool isByValOrInAlloca,
BasicBlock *BB = Load->getParent();

MemoryLocation Loc = MemoryLocation::get(Load);
if (AAR.canInstructionRangeModRef(BB->front(), *Load, Loc, ModRefInfo::Mod))
if (AAR.canInstructionRangeModRef(BB->front(), *Load, Loc, ModRefInfo::Mod)
&& !isRelatedToKernel) {
DEBUG(dbgs() << "SYCL: " << Arg->getName() << " used in " << F->getName()
<< " but " << *Load << " is invalidated.\n");
return false; // Pointer is invalidated!
}

// Now check every path from the entry block to the load for transparency.
// To do this, we perform a depth first search on the inverse CFG from the
// loading block.
for (BasicBlock *P : predecessors(BB)) {
for (BasicBlock *TranspBB : inverse_depth_first_ext(P, TranspBlocks))
if (AAR.canBasicBlockModify(*TranspBB, Loc))
if (AAR.canBasicBlockModify(*TranspBB, Loc) && !isRelatedToKernel) {
DEBUG(dbgs() << "SYCL: " << Arg->getName() << " used in "
<< F->getName()
+ << " every path from the entry block to the load is not transparency.\n");
return false;
}
}
}

Expand Down Expand Up @@ -892,6 +947,13 @@ promoteArguments(Function *F, function_ref<AAResults &(Function &F)> AARGetter,
bool isSafeToPromote =
PtrArg->hasByValAttr() &&
(isDenselyPacked(AgTy, DL) || !canPaddingBeAccessed(PtrArg));

// If the function has ancestor kernel, force to do argument promotion
if (sycl::isTransitivelyCalledFromKernel(*F, FunctionsCalledByKernel)) {
DEBUG(dbgs() << "SYCL: " << F->getName() << "has ancestor kernel.\n");
isSafeToPromote = true;
}

if (isSafeToPromote) {
if (StructType *STy = dyn_cast<StructType>(AgTy)) {
if (MaxElements > 0 && STy->getNumElements() > MaxElements) {
Expand Down Expand Up @@ -940,7 +1002,7 @@ promoteArguments(Function *F, function_ref<AAResults &(Function &F)> AARGetter,

// Otherwise, see if we can promote the pointer to its value.
if (isSafeToPromoteArgument(PtrArg, PtrArg->hasByValOrInAllocaAttr(), AAR,
MaxElements))
MaxElements, CG))
ArgsToPromote.insert(PtrArg);
}

Expand Down Expand Up @@ -1040,7 +1102,6 @@ INITIALIZE_PASS_DEPENDENCY(CallGraphWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
INITIALIZE_PASS_END(SYCLArgsFlattening, "SYCL-args-flattening",
"Promote 'by reference' arguments to scalars", false, false)

/// Create a struct to call the pass initialization at load time
struct InitSYCLArgsFlattening {
InitSYCLArgsFlattening() {
Expand All @@ -1062,6 +1123,9 @@ bool SYCLArgsFlattening::runOnSCC(CallGraphSCC &SCC) {
// changes.
CallGraph &CG = getAnalysis<CallGraphWrapperPass>().getCallGraph();

// Record the functions that are transitively called from kernel
sycl::recordFunctionsCalledByKernel(SCC, CG, FunctionsCalledByKernel);

LegacyAARGetter AARGetter(*this);

bool Changed = false, LocalChange;
Expand Down Expand Up @@ -1097,6 +1161,8 @@ bool SYCLArgsFlattening::runOnSCC(CallGraphSCC &SCC) {

// And updat ethe SCC we're iterating as well.
SCC.ReplaceNode(OldNode, NewNode);
// Update new node function for FunctionsCalledByKernel set
sycl::updateFunctionsCalledByKernel(*NewNode, FunctionsCalledByKernel);
}
}
// Remember that we changed something.
Expand Down
49 changes: 48 additions & 1 deletion lib/SYCL/SYCLKernel.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7,18 +7,26 @@
//
//===----------------------------------------------------------------------===//
//
// Detect SYCL kernels and rename kernel too shorten unique names
// Detect SYCL kernels and rename kernel to shorten unique names
//
// Detect if functions have kernel as an ancestor
// ===------------------------------------------------------------------- -===//

#include <algorithm>
#include <cstdlib>
#include <map>
#include <memory>
#include <sstream>
#include <string>

#include "llvm/ADT/SCCIterator.h"
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Analysis/CallGraph.h"
#include "llvm/Analysis/CallGraphSCCPass.h"
// Wait for LLVM 4.0...
// #include "llvm/Demangle/Demangle.h"
#include "llvm/IR/CallSite.h"
#include "llvm/IR/Function.h"
#include "llvm/SYCL.h"
#include "llvm/Support/Debug.h"
Expand Down Expand Up @@ -71,6 +79,45 @@ bool isKernel(const Function &F) {
return KernelFound;
}

/// Test if functions having the kernel as an ancestor
bool isTransitivelyCalledFromKernel(Function &F,
SmallPtrSet<Function *, 32> &FunctionsCalledByKernel) {
for (auto &U : F.uses()) {
CallSite CS{U.getUser()};
if (auto I = CS.getInstruction()) {
auto parent = I->getParent()->getParent();
return FunctionsCalledByKernel.count(parent);
}
}

return false;
}

/// Add the functions that are transitively called from the kernel in the set
void recordFunctionsCalledByKernel(CallGraphSCC &SCC, CallGraph &CG,
SmallPtrSet<Function *, 32> &FunctionsCalledByKernel) {
// Find the CallGraphNode that the kernel function belongs to.
// Then, DFS algorithm starts from the kernel function CallGraphNode to
// discover all the functions that have kernel as an ancestor and add them to
// the FunctionsCalledByKernel set.
// \note for (CallGraphNode *I : SCC) {} will run in bottom-up order
for (auto SCCI = scc_begin(&CG); !SCCI.isAtEnd(); ++SCCI) {
auto const &nextSCC = *SCCI;
for (auto I : nextSCC)
if(auto *F = I->getFunction())
if (isKernel(*F) || isTransitivelyCalledFromKernel(*F, FunctionsCalledByKernel))
FunctionsCalledByKernel.insert(F);
}
}

/// Update the FunctionsCalledByKernel set when new CallGraphNode created in
/// CallGraph
void updateFunctionsCalledByKernel (CallGraphNode &NewNode,
SmallPtrSet<Function *, 32> &FunctionsCalledByKernel) {
auto *F = NewNode.getFunction();
if (isTransitivelyCalledFromKernel(*F, FunctionsCalledByKernel))
FunctionsCalledByKernel.insert(F);
}

/// Mapping from the full kernel mangle named to a unique integer ID
std::map<std::string, std::size_t> SimplerKernelNames;
Expand Down
Loading

0 comments on commit 54fd058

Please sign in to comment.