[IR][DSE] Support non-malloc functions in malloc+memset->calloc fold #138299

Merged (1 commit) on Jun 4, 2025

@clubby789 (Contributor) commented May 2, 2025

Add an alloc-variant-zeroed function attribute that names the zeroing variant of an allocator, so that the allocation+memset fold can also apply to allocators other than malloc. This addresses rust-lang/rust#104847, where LLVM does not know how to perform this transformation for non-C languages.
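
For readers unfamiliar with the fold, the source-level shape it corresponds to looks roughly like the following. This is only a sketch: the helper names `zeroed_via_memset` and `zeroed_via_calloc` are hypothetical, and the actual transformation operates on LLVM IR inside DSE rather than on source code.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: allocate, then zero the whole allocation.
// This allocate-then-memset shape is what the existing fold looks for
// when the allocator is malloc.
void *zeroed_via_memset(std::size_t size) {
  void *p = std::malloc(size);
  if (p)
    std::memset(p, 0, size);
  return p;
}

// Hypothetical helper: the single call that the pair above is
// equivalent to, which is what the fold emits for malloc.
void *zeroed_via_calloc(std::size_t size) {
  return std::calloc(1, size);
}
```

The attribute proposed here lets a frontend describe the same relationship for allocators that LLVM does not recognize as library functions.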

@nikic (Contributor) left a comment

The way to do this is probably to add an extra attribute along the lines of "alloc-variant-zeroed"="calloc"?

@clubby789 force-pushed the more-alloc-zeroed branch from 6e08081 to 3ca30dc on May 2, 2025 18:00
@clubby789 marked this pull request as ready for review on May 2, 2025 18:01
@clubby789 force-pushed the more-alloc-zeroed branch from 3ca30dc to f94c92c on May 2, 2025 18:02
@llvmbot (Member) commented May 2, 2025

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-llvm-transforms

Author: None (clubby789)

Changes

Add an alloc-variant-zeroed function attribute which can be used to inform folding allocation+memset. This addresses rust-lang/rust#104847, where LLVM does not know how to perform this transformation for non-C languages.


Full diff: https://github.com/llvm/llvm-project/pull/138299.diff

3 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+4)
  • (modified) llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp (+41-8)
  • (modified) llvm/test/Transforms/DeadStoreElimination/noop-stores.ll (+13)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 568843a4486e5..78718f102e19c 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -1954,6 +1954,10 @@ For example:
     The first three options are mutually exclusive, and the remaining options
     describe more details of how the function behaves. The remaining options
     are invalid for "free"-type functions.
+``"alloc-variant-zeroed"="FUNCTION"``
+    This attribute indicates that another function is equivalent to an allocator function,
+    but returns zeroed memory. The function must have "zeroed" allocation behavior,
+    the same ``alloc-family``, and take exactly the same arguments.
 ``allocsize(<EltSizeParam>[, <NumEltsParam>])``
     This attribute indicates that the annotated function will always return at
     least a given number of bytes (or null). Its arguments are zero-indexed
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index e318ec94db4c3..9ba13450c1fdb 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -2028,9 +2028,19 @@ struct DSEState {
     if (!InnerCallee)
       return false;
     LibFunc Func;
+    std::optional<StringRef> ZeroedVariantName = std::nullopt;
     if (!TLI.getLibFunc(*InnerCallee, Func) || !TLI.has(Func) ||
-        Func != LibFunc_malloc)
-      return false;
+        Func != LibFunc_malloc) {
+      if (!Malloc->hasFnAttr("alloc-variant-zeroed") ||
+          Malloc->getFnAttr("alloc-variant-zeroed")
+              .getValueAsString()
+              .empty()) {
+        return false;
+      }
+      ZeroedVariantName =
+          Malloc->getFnAttr("alloc-variant-zeroed").getValueAsString();
+    }
+
     // Gracefully handle malloc with unexpected memory attributes.
     auto *MallocDef = dyn_cast_or_null<MemoryDef>(MSSA.getMemoryAccess(Malloc));
     if (!MallocDef)
@@ -2057,15 +2067,38 @@ struct DSEState {
 
     if (Malloc->getOperand(0) != MemSet->getLength())
       return false;
-    if (!shouldCreateCalloc(Malloc, MemSet) ||
-        !DT.dominates(Malloc, MemSet) ||
+    if (!shouldCreateCalloc(Malloc, MemSet) || !DT.dominates(Malloc, MemSet) ||
         !memoryIsNotModifiedBetween(Malloc, MemSet, BatchAA, DL, &DT))
       return false;
     IRBuilder<> IRB(Malloc);
-    Type *SizeTTy = Malloc->getArgOperand(0)->getType();
-    auto *Calloc =
-        emitCalloc(ConstantInt::get(SizeTTy, 1), Malloc->getArgOperand(0), IRB,
-                   TLI, Malloc->getType()->getPointerAddressSpace());
+    assert(Func == LibFunc_malloc || ZeroedVariantName.has_value());
+    Value *Calloc = nullptr;
+    if (ZeroedVariantName.has_value()) {
+      auto *ZeroedVariant =
+          Malloc->getModule()->getFunction(*ZeroedVariantName);
+      if (!ZeroedVariant)
+        return false;
+      auto Attributes = ZeroedVariant->getAttributes();
+      auto MallocFamily = getAllocationFamily(Malloc, &TLI);
+      if (MallocFamily &&
+          *MallocFamily !=
+              Attributes.getFnAttr("alloc-family").getValueAsString())
+        return false;
+      if (!Attributes.hasFnAttr(Attribute::AllocKind) ||
+          (Attributes.getAllocKind() & AllocFnKind::Zeroed) ==
+              AllocFnKind::Unknown)
+        return false;
+
+      SmallVector<Value *, 3> Args;
+      for (unsigned I = 0; I < Malloc->arg_size(); I++)
+        Args.push_back(Malloc->getArgOperand(I));
+      Calloc = IRB.CreateCall(ZeroedVariant, Args, *ZeroedVariantName);
+    } else {
+      Type *SizeTTy = Malloc->getArgOperand(0)->getType();
+      Calloc =
+          emitCalloc(ConstantInt::get(SizeTTy, 1), Malloc->getArgOperand(0),
+                     IRB, TLI, Malloc->getType()->getPointerAddressSpace());
+    }
     if (!Calloc)
       return false;
 
diff --git a/llvm/test/Transforms/DeadStoreElimination/noop-stores.ll b/llvm/test/Transforms/DeadStoreElimination/noop-stores.ll
index 9fc20d76da5eb..e54d93015d587 100644
--- a/llvm/test/Transforms/DeadStoreElimination/noop-stores.ll
+++ b/llvm/test/Transforms/DeadStoreElimination/noop-stores.ll
@@ -374,6 +374,19 @@ define ptr @notmalloc_memset(i64 %size, ptr %notmalloc) {
   ret ptr %call1
 }
 
+; This should create a customalloc_zeroed
+define ptr @customalloc_memset(i64 %size, i64 %align) {
+; CHECK-LABEL: @customalloc_memset
+; CHECK-NEXT:  [[CALL:%.*]] = call ptr @customalloc_zeroed(i64 [[SIZE:%.*]], i64 [[ALIGN:%.*]])
+; CHECK-NEXT:  ret ptr [[CALL]]
+  %call = call ptr @customalloc(i64 %size, i64 %align)
+  call void @llvm.memset.p0.i64(ptr %call, i8 0, i64 %size, i1 false)
+  ret ptr %call
+}
+
+declare ptr @customalloc(i64, i64) allockind("alloc") "alloc-family"="customalloc" "alloc-variant-zeroed"="customalloc_zeroed"
+declare ptr @customalloc_zeroed(i64, i64) allockind("alloc,zeroed") "alloc-family"="customalloc"
+
 ; This should not create recursive call to calloc.
 define ptr @calloc(i64 %nmemb, i64 %size) inaccessiblememonly {
 ; CHECK-LABEL: @calloc(
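
The LangRef text in this diff lists three requirements for the named variant: "zeroed" allocation behavior, the same alloc-family, and the same arguments. The sketch below models those checks outside of LLVM. Everything in it (`FnDecl`, `Module`, `pickZeroedVariant`) is a hypothetical stand-in, not an LLVM API, and the parameter comparison is an assumption about how "exactly the same arguments" could be enforced; it is not taken from the diff.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a function declaration.
struct FnDecl {
  std::string family;               // models "alloc-family"
  bool zeroed;                      // models allockind("...,zeroed")
  std::vector<std::string> params;  // parameter types, e.g. {"i64", "i64"}
  std::string zeroedVariant;        // models "alloc-variant-zeroed"; empty if absent
};

// Hypothetical stand-in for a module: declarations looked up by name.
using Module = std::map<std::string, FnDecl>;

// Returns the name of the function to call in place of `allocName` when the
// allocation is immediately zeroed, or nullopt when no valid variant exists.
std::optional<std::string> pickZeroedVariant(const Module &m,
                                             const std::string &allocName) {
  auto allocIt = m.find(allocName);
  if (allocIt == m.end() || allocIt->second.zeroedVariant.empty())
    return std::nullopt;  // unknown function, or attribute missing/empty
  const FnDecl &alloc = allocIt->second;

  auto variantIt = m.find(alloc.zeroedVariant);
  if (variantIt == m.end())
    return std::nullopt;  // the named variant is not declared in this module
  const FnDecl &variant = variantIt->second;

  // The three documented requirements: zeroed, same family, same arguments.
  if (!variant.zeroed || variant.family != alloc.family ||
      variant.params != alloc.params)
    return std::nullopt;
  return alloc.zeroedVariant;
}
```

With declarations mirroring the `customalloc` / `customalloc_zeroed` pair from the test above, the lookup succeeds; changing the variant's family or parameter list makes it fail.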

@clubby789 force-pushed the more-alloc-zeroed branch from f94c92c to c346cf7 on May 2, 2025 21:24
@pinskia commented May 2, 2025

> And it's probably not worthwhile to try to extend it in a way to make it work.

I think it is worth it because there are some programs out there that use xmalloc and it might be useful to optimize that to xcalloc.
I mentioned this specific case in the GCC bug about the new attribute (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120072).

@clubby789 (Contributor, Author) commented

> I think it is worth it because there are some programs out there that use xmalloc and it might be useful to optimize that to xcalloc.

Sorry, I'm not sure I understand this. I think the original question is about extending the system to be smart enough to transform alloc(size) to calloc(1, size). Whereas xmalloc->xcalloc can be easily done by a frontend, and just requires adding "alloc-variant-zeroed"="xcalloc" to xmalloc.

@pinskia commented May 3, 2025

> > I think it is worth it because there are some programs out there that use xmalloc and it might be useful to optimize that to xcalloc.
>
> Sorry, I'm not sure I understand this. I think the original question is about extending the system to be smart enough to transform alloc(size) to calloc(1, size). Whereas xmalloc->xcalloc can be easily done by a frontend, and just requires adding "alloc-variant-zeroed"="xcalloc" to xmalloc.

xcalloc still takes the same arguments as calloc, and likewise xmalloc takes the same arguments as malloc.
So it is similar in nature to the malloc->calloc transformation.
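
To make the mismatch concrete, here is a sketch of what such wrappers typically look like. The definitions are hypothetical (real projects define their own xmalloc/xcalloc), and they are placed in a `sketch` namespace to avoid clashing with any existing symbols.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

namespace sketch {

// Hypothetical xmalloc: same parameter list as malloc, aborts on failure.
void *xmalloc(std::size_t size) {
  void *p = std::malloc(size);
  if (!p && size != 0) {
    std::fputs("out of memory\n", stderr);
    std::abort();
  }
  return p;
}

// Hypothetical xcalloc: same parameter list as calloc (count, then size).
void *xcalloc(std::size_t count, std::size_t size) {
  void *p = std::calloc(count, size);
  if (!p && count != 0 && size != 0) {
    std::fputs("out of memory\n", stderr);
    std::abort();
  }
  return p;
}

}  // namespace sketch
```

Because `xmalloc(size)` takes one argument and `xcalloc(count, size)` takes two, the call cannot be rewritten by forwarding the original arguments unchanged. A fold would have to produce `xcalloc(1, size)`, which is the argument remapping that an attribute requiring identical argument lists does not describe.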

@clubby789 (Contributor, Author) commented

My mistake, I misread the definition. Would it make more sense to have a different attribute for calloc-like functions that have their arguments remapped (alloc-calloc-variant, perhaps)? Or a more general system that allows specific mappings to be described?

@nikic (Contributor) commented May 3, 2025

As far as I know, Clang currently doesn't support declaring allocators in C using attributes (it only supports noalias via __attribute__((malloc)), but not alloc-family/allockind/allocptr/allocalign). So I'm not sure it makes sense to try to support something like xmalloc->xcalloc conversion on the LLVM side, when Clang doesn't even expose the basic functionality yet.

@nikic requested review from durin42, fhahn, aeubanks and dtcxzyw on May 3, 2025 07:53
@nikic changed the title from "Support non-malloc functions in malloc+memset->calloc fold" to "[IR][DSE] Support non-malloc functions in malloc+memset->calloc fold" on May 3, 2025
@clubby789 force-pushed the more-alloc-zeroed branch from c346cf7 to 89d8649 on May 3, 2025 13:17
@clubby789 force-pushed the more-alloc-zeroed branch from 89d8649 to 910c21e on May 3, 2025 13:33
@durin42 (Contributor) commented May 9, 2025

This seems fine to me - it's a little bit of a bummer there's no way to just find the matching zeroed variant of a function without having to add the attribute, but I can't think of anything.

@clubby789 force-pushed the more-alloc-zeroed branch 2 times, most recently from 26fca54 to eba5e9e on May 12, 2025 17:06

@clubby789 (Contributor, Author) commented

Hi, are any more changes desired here?

@nikic (Contributor) left a comment

This needs a rebase, but generally looks good to me.

@clubby789 force-pushed the more-alloc-zeroed branch from eba5e9e to e1729a7 on June 3, 2025 13:40

github-actions bot commented Jun 3, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@nikic (Contributor) left a comment

LGTM

@clubby789 force-pushed the more-alloc-zeroed branch from e1729a7 to c7e7901 on June 3, 2025 13:51
@nikic merged commit c7c79d2 into llvm:main on Jun 4, 2025
12 checks passed

@clubby789 (Contributor, Author) commented

The LibFunc variable is left uninitialized when the callee is not a recognized library function, but it is still compared in the assert. I'll open a PR shortly to address the MSan failures.
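
As an illustration of this class of bug, the sketch below shows a lookup that writes its out-parameter only on success, and one way to keep later code from reading an indeterminate value. All names (`LibFuncKind`, `lookupLibFunc`, `classify`, `isMallocOrHasVariant`) are hypothetical stand-ins, and this is not necessarily how the follow-up PR resolved the issue.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the LibFunc enumeration.
enum class LibFuncKind { Malloc, Calloc };

// Hypothetical lookup in the out-parameter style: `out` is written only
// when the name is recognized, and left untouched otherwise.
bool lookupLibFunc(const std::string &name, LibFuncKind &out) {
  if (name == "malloc") {
    out = LibFuncKind::Malloc;
    return true;
  }
  if (name == "calloc") {
    out = LibFuncKind::Calloc;
    return true;
  }
  return false;
}

// Wrapping the result in std::optional makes "not recognized" explicit,
// so no caller can compare a value that was never written.
std::optional<LibFuncKind> classify(const std::string &name) {
  LibFuncKind kind = LibFuncKind::Malloc;  // placeholder, overwritten on success
  if (!lookupLibFunc(name, kind))
    return std::nullopt;
  return kind;
}

// Models the asserted condition: either the callee is malloc, or a zeroed
// variant was named. The enum is read only when the lookup succeeded.
bool isMallocOrHasVariant(const std::string &name, bool hasZeroedVariant) {
  std::optional<LibFuncKind> kind = classify(name);
  return (kind.has_value() && *kind == LibFuncKind::Malloc) || hasZeroedVariant;
}
```

The same effect can be had by initializing the variable at its declaration or by checking the attribute-derived condition first so the comparison short-circuits.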

6 participants