Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SimplifyLibCalls: Use default globals address space when building new global strings. #118729

Merged
merged 4 commits into from
Dec 5, 2024

Conversation

resistor
Copy link
Collaborator

@resistor resistor commented Dec 5, 2024

Writing a test for this transitively exposed a number of places in BuildLibCalls where
we were failing to propagate address spaces properly, which are additionally fixed.

… global strings.

Writing a test for this  transitively exposed a number of places in BuildLibCalls where
we were failing to propagate address spaces properly, which are additionally fixed.
@resistor resistor marked this pull request as ready for review December 5, 2024 02:11
@llvmbot
Copy link
Member

llvmbot commented Dec 5, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Owen Anderson (resistor)

Changes

Writing a test for this transitively exposed a number of places in BuildLibCalls where
we were failing to propagate address spaces properly, which are additionally fixed.


Patch is 34.17 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118729.diff

4 Files Affected:

  • (modified) llvm/include/llvm/Transforms/Utils/BuildLibCalls.h (+10-7)
  • (modified) llvm/lib/Transforms/Utils/BuildLibCalls.cpp (+66-47)
  • (modified) llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp (+55-33)
  • (added) llvm/test/Transforms/InstCombine/printf-addrspace.ll (+42)
diff --git a/llvm/include/llvm/Transforms/Utils/BuildLibCalls.h b/llvm/include/llvm/Transforms/Utils/BuildLibCalls.h
index a8fb38e7260043..3766c755baf4ff 100644
--- a/llvm/include/llvm/Transforms/Utils/BuildLibCalls.h
+++ b/llvm/include/llvm/Transforms/Utils/BuildLibCalls.h
@@ -247,7 +247,7 @@ namespace llvm {
 
   /// Emit a call to the malloc function.
   Value *emitMalloc(Value *Num, IRBuilderBase &B, const DataLayout &DL,
-                    const TargetLibraryInfo *TLI);
+                    const TargetLibraryInfo *TLI, unsigned AddrSpace);
 
   /// Emit a call to the calloc function.
   Value *emitCalloc(Value *Num, Value *Size, IRBuilderBase &B,
@@ -256,24 +256,27 @@ namespace llvm {
   /// Emit a call to the hot/cold operator new function.
   Value *emitHotColdNew(Value *Num, IRBuilderBase &B,
                         const TargetLibraryInfo *TLI, LibFunc NewFunc,
-                        uint8_t HotCold);
+                        uint8_t HotCold, unsigned AddrSpace);
   Value *emitHotColdNewNoThrow(Value *Num, Value *NoThrow, IRBuilderBase &B,
                                const TargetLibraryInfo *TLI, LibFunc NewFunc,
-                               uint8_t HotCold);
+                               uint8_t HotCold, unsigned AddrSpace);
   Value *emitHotColdNewAligned(Value *Num, Value *Align, IRBuilderBase &B,
                                const TargetLibraryInfo *TLI, LibFunc NewFunc,
-                               uint8_t HotCold);
+                               uint8_t HotCold, unsigned AddrSpace);
   Value *emitHotColdNewAlignedNoThrow(Value *Num, Value *Align, Value *NoThrow,
                                       IRBuilderBase &B,
                                       const TargetLibraryInfo *TLI,
-                                      LibFunc NewFunc, uint8_t HotCold);
+                                      LibFunc NewFunc, uint8_t HotCold,
+                                      unsigned AddrSpace);
   Value *emitHotColdSizeReturningNew(Value *Num, IRBuilderBase &B,
                                      const TargetLibraryInfo *TLI,
-                                     LibFunc NewFunc, uint8_t HotCold);
+                                     LibFunc NewFunc, uint8_t HotCold,
+                                     unsigned AddrSpace);
   Value *emitHotColdSizeReturningNewAligned(Value *Num, Value *Align,
                                             IRBuilderBase &B,
                                             const TargetLibraryInfo *TLI,
-                                            LibFunc NewFunc, uint8_t HotCold);
+                                            LibFunc NewFunc, uint8_t HotCold,
+                                            unsigned AddrSpace);
 }
 
 #endif
diff --git a/llvm/lib/Transforms/Utils/BuildLibCalls.cpp b/llvm/lib/Transforms/Utils/BuildLibCalls.cpp
index e4f4052e5e4815..c7aec524eedfd6 100644
--- a/llvm/lib/Transforms/Utils/BuildLibCalls.cpp
+++ b/llvm/lib/Transforms/Utils/BuildLibCalls.cpp
@@ -1545,20 +1545,20 @@ static Value *emitLibCall(LibFunc TheLibFunc, Type *ReturnType,
 
 Value *llvm::emitStrLen(Value *Ptr, IRBuilderBase &B, const DataLayout &DL,
                         const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Ptr->getType();
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_strlen, SizeTTy, CharPtrTy, Ptr, B, TLI);
 }
 
 Value *llvm::emitStrDup(Value *Ptr, IRBuilderBase &B,
                         const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Ptr->getType();
   return emitLibCall(LibFunc_strdup, CharPtrTy, CharPtrTy, Ptr, B, TLI);
 }
 
 Value *llvm::emitStrChr(Value *Ptr, char C, IRBuilderBase &B,
                         const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Ptr->getType();
   Type *IntTy = getIntTy(B, TLI);
   return emitLibCall(LibFunc_strchr, CharPtrTy, {CharPtrTy, IntTy},
                      {Ptr, ConstantInt::get(IntTy, C)}, B, TLI);
@@ -1566,7 +1566,8 @@ Value *llvm::emitStrChr(Value *Ptr, char C, IRBuilderBase &B,
 
 Value *llvm::emitStrNCmp(Value *Ptr1, Value *Ptr2, Value *Len, IRBuilderBase &B,
                          const DataLayout &DL, const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Ptr1->getType();
+  assert(CharPtrTy == Ptr2->getType());
   Type *IntTy = getIntTy(B, TLI);
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(
@@ -1578,20 +1579,23 @@ Value *llvm::emitStrNCmp(Value *Ptr1, Value *Ptr2, Value *Len, IRBuilderBase &B,
 Value *llvm::emitStrCpy(Value *Dst, Value *Src, IRBuilderBase &B,
                         const TargetLibraryInfo *TLI) {
   Type *CharPtrTy = Dst->getType();
+  assert(CharPtrTy == Src->getType());
   return emitLibCall(LibFunc_strcpy, CharPtrTy, {CharPtrTy, CharPtrTy},
                      {Dst, Src}, B, TLI);
 }
 
 Value *llvm::emitStpCpy(Value *Dst, Value *Src, IRBuilderBase &B,
                         const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Dst->getType();
+  assert(CharPtrTy == Src->getType());
   return emitLibCall(LibFunc_stpcpy, CharPtrTy, {CharPtrTy, CharPtrTy},
                      {Dst, Src}, B, TLI);
 }
 
 Value *llvm::emitStrNCpy(Value *Dst, Value *Src, Value *Len, IRBuilderBase &B,
                          const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Dst->getType();
+  assert(CharPtrTy == Src->getType());
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_strncpy, CharPtrTy, {CharPtrTy, CharPtrTy, SizeTTy},
                      {Dst, Src, Len}, B, TLI);
@@ -1599,7 +1603,8 @@ Value *llvm::emitStrNCpy(Value *Dst, Value *Src, Value *Len, IRBuilderBase &B,
 
 Value *llvm::emitStpNCpy(Value *Dst, Value *Src, Value *Len, IRBuilderBase &B,
                          const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Dst->getType();
+  assert(CharPtrTy == Src->getType());
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_stpncpy, CharPtrTy, {CharPtrTy, CharPtrTy, SizeTTy},
                      {Dst, Src, Len}, B, TLI);
@@ -1615,7 +1620,8 @@ Value *llvm::emitMemCpyChk(Value *Dst, Value *Src, Value *Len, Value *ObjSize,
   AttributeList AS;
   AS = AttributeList::get(M->getContext(), AttributeList::FunctionIndex,
                           Attribute::NoUnwind);
-  Type *VoidPtrTy = B.getPtrTy();
+  Type *VoidPtrTy = Dst->getType();
+  assert(VoidPtrTy == Src->getType());
   Type *SizeTTy = getSizeTTy(B, TLI);
   FunctionCallee MemCpy = getOrInsertLibFunc(M, *TLI, LibFunc_memcpy_chk,
       AttributeList::get(M->getContext(), AS), VoidPtrTy,
@@ -1629,7 +1635,8 @@ Value *llvm::emitMemCpyChk(Value *Dst, Value *Src, Value *Len, Value *ObjSize,
 
 Value *llvm::emitMemPCpy(Value *Dst, Value *Src, Value *Len, IRBuilderBase &B,
                          const DataLayout &DL, const TargetLibraryInfo *TLI) {
-  Type *VoidPtrTy = B.getPtrTy();
+  Type *VoidPtrTy = Dst->getType();
+  assert(VoidPtrTy == Src->getType());
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_mempcpy, VoidPtrTy,
                      {VoidPtrTy, VoidPtrTy, SizeTTy},
@@ -1638,7 +1645,7 @@ Value *llvm::emitMemPCpy(Value *Dst, Value *Src, Value *Len, IRBuilderBase &B,
 
 Value *llvm::emitMemChr(Value *Ptr, Value *Val, Value *Len, IRBuilderBase &B,
                         const DataLayout &DL, const TargetLibraryInfo *TLI) {
-  Type *VoidPtrTy = B.getPtrTy();
+  Type *VoidPtrTy = Ptr->getType();
   Type *IntTy = getIntTy(B, TLI);
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_memchr, VoidPtrTy,
@@ -1648,7 +1655,7 @@ Value *llvm::emitMemChr(Value *Ptr, Value *Val, Value *Len, IRBuilderBase &B,
 
 Value *llvm::emitMemRChr(Value *Ptr, Value *Val, Value *Len, IRBuilderBase &B,
                         const DataLayout &DL, const TargetLibraryInfo *TLI) {
-  Type *VoidPtrTy = B.getPtrTy();
+  Type *VoidPtrTy = Ptr->getType();
   Type *IntTy = getIntTy(B, TLI);
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_memrchr, VoidPtrTy,
@@ -1658,7 +1665,8 @@ Value *llvm::emitMemRChr(Value *Ptr, Value *Val, Value *Len, IRBuilderBase &B,
 
 Value *llvm::emitMemCmp(Value *Ptr1, Value *Ptr2, Value *Len, IRBuilderBase &B,
                         const DataLayout &DL, const TargetLibraryInfo *TLI) {
-  Type *VoidPtrTy = B.getPtrTy();
+  Type *VoidPtrTy = Ptr1->getType();
+  assert(VoidPtrTy == Ptr2->getType());
   Type *IntTy = getIntTy(B, TLI);
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_memcmp, IntTy,
@@ -1668,7 +1676,8 @@ Value *llvm::emitMemCmp(Value *Ptr1, Value *Ptr2, Value *Len, IRBuilderBase &B,
 
 Value *llvm::emitBCmp(Value *Ptr1, Value *Ptr2, Value *Len, IRBuilderBase &B,
                       const DataLayout &DL, const TargetLibraryInfo *TLI) {
-  Type *VoidPtrTy = B.getPtrTy();
+  Type *VoidPtrTy = Ptr1->getType();
+  assert(VoidPtrTy == Ptr2->getType());
   Type *IntTy = getIntTy(B, TLI);
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_bcmp, IntTy,
@@ -1678,7 +1687,8 @@ Value *llvm::emitBCmp(Value *Ptr1, Value *Ptr2, Value *Len, IRBuilderBase &B,
 
 Value *llvm::emitMemCCpy(Value *Ptr1, Value *Ptr2, Value *Val, Value *Len,
                          IRBuilderBase &B, const TargetLibraryInfo *TLI) {
-  Type *VoidPtrTy = B.getPtrTy();
+  Type *VoidPtrTy = Ptr1->getType();
+  assert(VoidPtrTy == Ptr2->getType());
   Type *IntTy = getIntTy(B, TLI);
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_memccpy, VoidPtrTy,
@@ -1689,7 +1699,8 @@ Value *llvm::emitMemCCpy(Value *Ptr1, Value *Ptr2, Value *Val, Value *Len,
 Value *llvm::emitSNPrintf(Value *Dest, Value *Size, Value *Fmt,
                           ArrayRef<Value *> VariadicArgs, IRBuilderBase &B,
                           const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Dest->getType();
+  assert(CharPtrTy == Fmt->getType());
   Type *IntTy = getIntTy(B, TLI);
   Type *SizeTTy = getSizeTTy(B, TLI);
   SmallVector<Value *, 8> Args{Dest, Size, Fmt};
@@ -1702,7 +1713,7 @@ Value *llvm::emitSNPrintf(Value *Dest, Value *Size, Value *Fmt,
 Value *llvm::emitSPrintf(Value *Dest, Value *Fmt,
                          ArrayRef<Value *> VariadicArgs, IRBuilderBase &B,
                          const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Dest->getType();
   Type *IntTy = getIntTy(B, TLI);
   SmallVector<Value *, 8> Args{Dest, Fmt};
   llvm::append_range(Args, VariadicArgs);
@@ -1713,7 +1724,8 @@ Value *llvm::emitSPrintf(Value *Dest, Value *Fmt,
 
 Value *llvm::emitStrCat(Value *Dest, Value *Src, IRBuilderBase &B,
                         const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Dest->getType();
+  assert(CharPtrTy == Src->getType());
   return emitLibCall(LibFunc_strcat, CharPtrTy,
                      {CharPtrTy, CharPtrTy},
                      {Dest, Src}, B, TLI);
@@ -1721,7 +1733,8 @@ Value *llvm::emitStrCat(Value *Dest, Value *Src, IRBuilderBase &B,
 
 Value *llvm::emitStrLCpy(Value *Dest, Value *Src, Value *Size, IRBuilderBase &B,
                          const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Dest->getType();
+  assert(CharPtrTy == Src->getType());
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_strlcpy, SizeTTy,
                      {CharPtrTy, CharPtrTy, SizeTTy},
@@ -1730,7 +1743,8 @@ Value *llvm::emitStrLCpy(Value *Dest, Value *Src, Value *Size, IRBuilderBase &B,
 
 Value *llvm::emitStrLCat(Value *Dest, Value *Src, Value *Size, IRBuilderBase &B,
                          const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Dest->getType();
+  assert(CharPtrTy == Src->getType());
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_strlcat, SizeTTy,
                      {CharPtrTy, CharPtrTy, SizeTTy},
@@ -1739,7 +1753,8 @@ Value *llvm::emitStrLCat(Value *Dest, Value *Src, Value *Size, IRBuilderBase &B,
 
 Value *llvm::emitStrNCat(Value *Dest, Value *Src, Value *Size, IRBuilderBase &B,
                          const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Dest->getType();
+  assert(CharPtrTy == Src->getType());
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(LibFunc_strncat, CharPtrTy,
                      {CharPtrTy, CharPtrTy, SizeTTy},
@@ -1748,7 +1763,8 @@ Value *llvm::emitStrNCat(Value *Dest, Value *Src, Value *Size, IRBuilderBase &B,
 
 Value *llvm::emitVSNPrintf(Value *Dest, Value *Size, Value *Fmt, Value *VAList,
                            IRBuilderBase &B, const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Dest->getType();
+  assert(CharPtrTy == Fmt->getType());
   Type *IntTy = getIntTy(B, TLI);
   Type *SizeTTy = getSizeTTy(B, TLI);
   return emitLibCall(
@@ -1759,7 +1775,8 @@ Value *llvm::emitVSNPrintf(Value *Dest, Value *Size, Value *Fmt, Value *VAList,
 
 Value *llvm::emitVSPrintf(Value *Dest, Value *Fmt, Value *VAList,
                           IRBuilderBase &B, const TargetLibraryInfo *TLI) {
-  Type *CharPtrTy = B.getPtrTy();
+  Type *CharPtrTy = Dest->getType();
+  assert(CharPtrTy == Fmt->getType());
   Type *IntTy = getIntTy(B, TLI);
   return emitLibCall(LibFunc_vsprintf, IntTy,
                      {CharPtrTy, CharPtrTy, VAList->getType()},
@@ -1912,8 +1929,8 @@ Value *llvm::emitPutS(Value *Str, IRBuilderBase &B,
 
   Type *IntTy = getIntTy(B, TLI);
   StringRef PutsName = TLI->getName(LibFunc_puts);
-  FunctionCallee PutS = getOrInsertLibFunc(M, *TLI, LibFunc_puts, IntTy,
-                                           B.getPtrTy());
+  FunctionCallee PutS =
+      getOrInsertLibFunc(M, *TLI, LibFunc_puts, IntTy, Str->getType());
   inferNonMandatoryLibFuncAttrs(M, PutsName, *TLI);
   CallInst *CI = B.CreateCall(PutS, Str, PutsName);
   if (const Function *F =
@@ -1951,7 +1968,7 @@ Value *llvm::emitFPutS(Value *Str, Value *File, IRBuilderBase &B,
   Type *IntTy = getIntTy(B, TLI);
   StringRef FPutsName = TLI->getName(LibFunc_fputs);
   FunctionCallee F = getOrInsertLibFunc(M, *TLI, LibFunc_fputs, IntTy,
-                                        B.getPtrTy(), File->getType());
+                                        Str->getType(), File->getType());
   if (File->getType()->isPointerTy())
     inferNonMandatoryLibFuncAttrs(M, FPutsName, *TLI);
   CallInst *CI = B.CreateCall(F, {Str, File}, FPutsName);
@@ -1970,9 +1987,9 @@ Value *llvm::emitFWrite(Value *Ptr, Value *Size, Value *File, IRBuilderBase &B,
 
   Type *SizeTTy = getSizeTTy(B, TLI);
   StringRef FWriteName = TLI->getName(LibFunc_fwrite);
-  FunctionCallee F = getOrInsertLibFunc(M, *TLI, LibFunc_fwrite,
-                                        SizeTTy, B.getPtrTy(), SizeTTy,
-                                        SizeTTy, File->getType());
+  FunctionCallee F =
+      getOrInsertLibFunc(M, *TLI, LibFunc_fwrite, SizeTTy, Ptr->getType(),
+                         SizeTTy, SizeTTy, File->getType());
 
   if (File->getType()->isPointerTy())
     inferNonMandatoryLibFuncAttrs(M, FWriteName, *TLI);
@@ -1987,7 +2004,7 @@ Value *llvm::emitFWrite(Value *Ptr, Value *Size, Value *File, IRBuilderBase &B,
 }
 
 Value *llvm::emitMalloc(Value *Num, IRBuilderBase &B, const DataLayout &DL,
-                        const TargetLibraryInfo *TLI) {
+                        const TargetLibraryInfo *TLI, unsigned AddrSpace) {
   Module *M = B.GetInsertBlock()->getModule();
   if (!isLibFuncEmittable(M, TLI, LibFunc_malloc))
     return nullptr;
@@ -1995,7 +2012,7 @@ Value *llvm::emitMalloc(Value *Num, IRBuilderBase &B, const DataLayout &DL,
   StringRef MallocName = TLI->getName(LibFunc_malloc);
   Type *SizeTTy = getSizeTTy(B, TLI);
   FunctionCallee Malloc = getOrInsertLibFunc(M, *TLI, LibFunc_malloc,
-                                             B.getPtrTy(), SizeTTy);
+                                             B.getPtrTy(AddrSpace), SizeTTy);
   inferNonMandatoryLibFuncAttrs(M, MallocName, *TLI);
   CallInst *CI = B.CreateCall(Malloc, Num, MallocName);
 
@@ -2029,7 +2046,7 @@ Value *llvm::emitCalloc(Value *Num, Value *Size, IRBuilderBase &B,
 Value *llvm::emitHotColdSizeReturningNew(Value *Num, IRBuilderBase &B,
                                          const TargetLibraryInfo *TLI,
                                          LibFunc SizeFeedbackNewFunc,
-                                         uint8_t HotCold) {
+                                         uint8_t HotCold, unsigned AddrSpace) {
   Module *M = B.GetInsertBlock()->getModule();
   if (!isLibFuncEmittable(M, TLI, SizeFeedbackNewFunc))
     return nullptr;
@@ -2038,7 +2055,7 @@ Value *llvm::emitHotColdSizeReturningNew(Value *Num, IRBuilderBase &B,
 
   // __sized_ptr_t struct return type { void*, size_t }
   StructType *SizedPtrT =
-      StructType::get(M->getContext(), {B.getPtrTy(), Num->getType()});
+      StructType::get(M->getContext(), {B.getPtrTy(AddrSpace), Num->getType()});
   FunctionCallee Func =
       M->getOrInsertFunction(Name, SizedPtrT, Num->getType(), B.getInt8Ty());
   inferNonMandatoryLibFuncAttrs(M, Name, *TLI);
@@ -2050,11 +2067,9 @@ Value *llvm::emitHotColdSizeReturningNew(Value *Num, IRBuilderBase &B,
   return CI;
 }
 
-Value *llvm::emitHotColdSizeReturningNewAligned(Value *Num, Value *Align,
-                                                IRBuilderBase &B,
-                                                const TargetLibraryInfo *TLI,
-                                                LibFunc SizeFeedbackNewFunc,
-                                                uint8_t HotCold) {
+Value *llvm::emitHotColdSizeReturningNewAligned(
+    Value *Num, Value *Align, IRBuilderBase &B, const TargetLibraryInfo *TLI,
+    LibFunc SizeFeedbackNewFunc, uint8_t HotCold, unsigned AddrSpace) {
   Module *M = B.GetInsertBlock()->getModule();
   if (!isLibFuncEmittable(M, TLI, SizeFeedbackNewFunc))
     return nullptr;
@@ -2063,7 +2078,7 @@ Value *llvm::emitHotColdSizeReturningNewAligned(Value *Num, Value *Align,
 
   // __sized_ptr_t struct return type { void*, size_t }
   StructType *SizedPtrT =
-      StructType::get(M->getContext(), {B.getPtrTy(), Num->getType()});
+      StructType::get(M->getContext(), {B.getPtrTy(AddrSpace), Num->getType()});
   FunctionCallee Func = M->getOrInsertFunction(Name, SizedPtrT, Num->getType(),
                                                Align->getType(), B.getInt8Ty());
   inferNonMandatoryLibFuncAttrs(M, Name, *TLI);
@@ -2078,13 +2093,13 @@ Value *llvm::emitHotColdSizeReturningNewAligned(Value *Num, Value *Align,
 
 Value *llvm::emitHotColdNew(Value *Num, IRBuilderBase &B,
                             const TargetLibraryInfo *TLI, LibFunc NewFunc,
-                            uint8_t HotCold) {
+                            uint8_t HotCold, unsigned AddrSpace) {
   Module *M = B.GetInsertBlock()->getModule();
   if (!isLibFuncEmittable(M, TLI, NewFunc))
     return nullptr;
 
   StringRef Name = TLI->getName(NewFunc);
-  FunctionCallee Func = M->getOrInsertFunction(Name, B.getPtrTy(),
+  FunctionCallee Func = M->getOrInsertFunction(Name, B.getPtrTy(AddrSpace),
                                                Num->getType(), B.getInt8Ty());
   inferNonMandatoryLibFuncAttrs(M, Name, *TLI);
   CallInst *CI = B.CreateCall(Func, {Num, B.getInt8(HotCold)}, Name);
@@ -2098,14 +2113,15 @@ Value *llvm::emitHotColdNew(Value *Num, IRBuilderBase &B,
 
 Value *llvm::emitHotColdNewNoThrow(Value *Num, Value *NoThrow, IRBuilderBase &B,
                                    const TargetLibraryInfo *TLI,
-                                   LibFunc NewFunc, uint8_t HotCold) {
+                                   LibFunc NewFunc, uint8_t HotCold,
+                                   unsigned AddrSpace) {
   Module *M = B.GetInsertBlock()->getModule();
   if (!isLibFuncEmittable(M, TLI, NewFunc))
     return nullptr;
 
   StringRef Name = TLI->getName(NewFunc);
   FunctionCallee Func =
-      M->getOrInsertFunction(Name, B.getPtrTy(), Num->getType(),
+      M->getOrInsertFunction(Name, B.getPtrTy(AddrSpace), Num->getType(),
                              NoThrow->getType(), B.getInt8Ty());
   inferNonMandatoryLibFuncAttrs(M, Name, *TLI);
   CallInst *CI = B.CreateCall(Func, {Num, NoThrow...
[truncated]

@nikic
Copy link
Contributor

nikic commented Dec 5, 2024

Looks like simplify-libcalls-new.ll asserts.

Copy link

github-actions bot commented Dec 5, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@resistor resistor force-pushed the simplifylibcalls-globals branch from fd05e86 to 458f1dd Compare December 5, 2024 10:26
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

llvm/test/Transforms/InstCombine/printf-addrspace.ll Outdated Show resolved Hide resolved
@resistor resistor merged commit cfa582e into llvm:main Dec 5, 2024
5 of 7 checks passed
@percent_s = addrspace(200) constant [4 x i8] c"%s\0A\00"

declare i32 @printf(ptr addrspace(200) , ...)
declare i32 @puts(ptr addrspace(200))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned this isn't working the way we want.

puts takes a C char*, and the C library's char* is in some specific address-space. TargetLibraryInfo should inherently know the correct address-space for puts... and on usual targets, the inherent address-space should be zero. So TargetLibraryInfo should refuse to recognize this declaration, and the call shouldn't be transformed. The concept here is similar to size_t: the same way there's only one size_t on any given target, there's also only one char*.

Similarly, we shouldn't need to pass a type into emitMalloc and friends: malloc returns a void*, and a C void* only has one possible associated address-space.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CHERI purecap targets don't use address space 0 as the default address space for C pointers. (This may seem like an unusual choice, but there are reasons for it that are somewhat beyond the scope of one PR). My goal is to make sure this works. Ensuring propagation of address spaces is one way of achieving that: it essentially puts the burden of constructing the IR correctly on the frontend, and the middle-end will simply maintain it.

An alternative approach would be to add something like getVoidPtrTy to TargetLibraryInfo and have everything use it. This may be net slightly less code, but also means that there are now two sources of truth.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with not having TargetLibraryInfo::getVoidPtrTy() is that you accidentally inherit information from other places. If the source of that inheritance is other C library functions, you're fine, I guess, but that's extremely hard to audit. For example, you call emitPutS() with a pointer with address-space getDefaultGlobalsAddressSpace(). Is that right? We don't know, and nothing will ever check it.

I don't think having a separate source of truth for getVoidPtrTy() is really a big deal compared to all the other assumptions that TargetLibraryInfo makes. We can make clang codegen check that the TargetLibraryInfo is correct if you're concerned it'll fall out of sync somehow.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a quick look into making this change, and it breaks a few in-tree tests (which all look like the originated from CHERI targets...). What I'd like to do is, in a separate change, add the target triple bits needed to make it possible to detect CHERI from the target triple, then adapt the tests to use that properly, then make the switch to getVoidPtrType() with an override when CHERI is indicated by the triple.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #118927

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the target triple bits needed to make it possible to detect CHERI from the target triple, then adapt the tests to use that properly, then make the switch to getVoidPtrType() with an override when CHERI is indicated by the triple

Sure, this seems fine.

Copy link
Collaborator

@jrtc27 jrtc27 Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also shows up in atomic libcall handling. I don't really care whether it's called the "libcalls address space", "C ABI address space", "default address space" or some other thing like that, but it's a long-standing issue that LLVM only has the fine-grained definition-based address spaces (alloca, global, program) but no use-based / generic address spaces beyond 0, which CHERI backends do not use for various technical reasons (whether that's the right choice is a different matter, but this should be possible to do). I've always felt there should be some sort of -D<address space> (if using "default" above) in the data layout which is the address space of a pointer without any further context, and is what you can then map C's T * to. A -L<address space> for libcalls might be fine, though I'd be concerned it's still overly narrow.

@resistor Please tag @arichardson and myself on future CHERI-related PRs. Between us we have a lot of experience with CHERI LLVM and I want to ensure you're making upstream changes we agree with rather than (non-maliciously, I'm sure) going behind our backs and not involving the people who actually wrote a lot of the current CHERI LLVM code and maintain the compiler, and have had various conversations over the years on how to resolve outstanding issues and how to upstream what we have. It was a surprise to me at least that these PRs had been filed and merged.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #88182 for some related discussion, in particular #88182 (comment) onwards. It's important for anyone trying to upstream CHERI-related things to be aware of past discussion on this topic, and unless you want to search the internet for everything related to CHERI you only get that by talking to the people who have been involved in those kinds of discussions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So TargetLibraryInfo should refuse to recognize this declaration, and the call shouldn't be transformed.

Well, that's not what came out of https://reviews.llvm.org/D95142, but that was in part due to not having any way to get the right address space for a libcall. What was universally agreed though was that picking a random one of the fine-grained address spaces was not correct (the initial version of the patch even used the globals address space just like this one here).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that was in part due to not having any way to get the right address space for a libcall

Right... like I mentioned before, I think we do want to tie libcalls to a specific address-space. It's way too easy to make mistakes otherwise in an environment with multiple address-spaces.

What was universally agreed though was that picking a random one of the fine-grained address spaces was not correct (the initial version of the patch even used the globals address space just like this one here).

Yes, none of the existing address-spaces in the datalayout matches the concept we want here.

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 9, 2024

I believe this commit should be reverted. There's no reason why the globals address space should be preferred over any other. There are some refactors in here that may be worth keeping, but they shouldn't be tangled up with functional changes.

broxigarchen pushed a commit to broxigarchen/llvm-project that referenced this pull request Dec 10, 2024
… global strings. (llvm#118729)

Writing a test for this transitively exposed a number of places in
BuildLibCalls where
we were failing to propagate address spaces properly, which are
additionally fixed.
resistor added a commit to resistor/llvm-project that referenced this pull request Dec 11, 2024
chrsmcgrr pushed a commit to RooflineAI/llvm-project that referenced this pull request Dec 12, 2024
… global strings. (llvm#118729)

Writing a test for this transitively exposed a number of places in
BuildLibCalls where
we were failing to propagate address spaces properly, which are
additionally fixed.
TIFitis pushed a commit to TIFitis/llvm-project that referenced this pull request Dec 18, 2024
… global strings. (llvm#118729)

Writing a test for this transitively exposed a number of places in
BuildLibCalls where
we were failing to propagate address spaces properly, which are
additionally fixed.
resistor added a commit that referenced this pull request Dec 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants