[AMDGPU] Avoid resource propagation for recursion through multiple functions #111004

Conversation
@llvm/pr-subscribers-backend-amdgpu

Author: Janek van Oirschot (JanekvO)

Changes: Avoid constructing recursive MCExpr definitions when multiple functions cause a recursion.

Fixes #110863

Full diff: https://github.com/llvm/llvm-project/pull/111004.diff

9 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
index da0397fa20bd1b..dd970c78e66e91 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
@@ -109,10 +109,13 @@ void MCResourceInfo::assignResourceInfoExpr(
for (const Function *Callee : Callees) {
if (!Seen.insert(Callee).second)
continue;
+ if (!F.doesNotRecurse() && !Callee->doesNotRecurse())
+ continue;
MCSymbol *CalleeValSym = getSymbol(Callee->getName(), RIK, OutContext);
ArgExprs.push_back(MCSymbolRefExpr::create(CalleeValSym, OutContext));
}
- SymVal = AMDGPUMCExpr::create(Kind, ArgExprs, OutContext);
+ if (ArgExprs.size() > 1)
+ SymVal = AMDGPUMCExpr::create(Kind, ArgExprs, OutContext);
}
MCSymbol *Sym = getSymbol(MF.getName(), RIK, OutContext);
Sym->setVariableValue(SymVal);
@@ -164,6 +167,8 @@ void MCResourceInfo::gatherResourceInfo(
for (const Function *Callee : FRI.Callees) {
if (!Seen.insert(Callee).second)
continue;
+ if (!MF.getFunction().doesNotRecurse() && !Callee->doesNotRecurse())
+ continue;
if (!Callee->isDeclaration()) {
MCSymbol *calleeValSym =
getSymbol(Callee->getName(), RIK_PrivateSegSize, OutContext);
diff --git a/llvm/test/CodeGen/AMDGPU/agpr-register-count.ll b/llvm/test/CodeGen/AMDGPU/agpr-register-count.ll
index 0e16ea10c019ac..189b2d80827896 100644
--- a/llvm/test/CodeGen/AMDGPU/agpr-register-count.ll
+++ b/llvm/test/CodeGen/AMDGPU/agpr-register-count.ll
@@ -178,4 +178,4 @@ bb:
; GCN-NEXT: .set amdgpu.max_num_agpr, 32
; GCN-NEXT: .set amdgpu.max_num_sgpr, 34
-attributes #0 = { nounwind noinline "amdgpu-flat-work-group-size"="1,512" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { nounwind noinline norecurse "amdgpu-flat-work-group-size"="1,512" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/AMDGPU/amdpal-metadata-agpr-register-count.ll b/llvm/test/CodeGen/AMDGPU/amdpal-metadata-agpr-register-count.ll
index 8f4cb364751d88..a01bfc6c8730da 100644
--- a/llvm/test/CodeGen/AMDGPU/amdpal-metadata-agpr-register-count.ll
+++ b/llvm/test/CodeGen/AMDGPU/amdpal-metadata-agpr-register-count.ll
@@ -77,4 +77,4 @@ bb:
; GFX908: agpr_count: 0x20
; GFX908: vgpr_count: 0x20
-attributes #0 = { nounwind noinline "amdgpu-flat-work-group-size"="1,512" }
+attributes #0 = { nounwind noinline norecurse "amdgpu-flat-work-group-size"="1,512" }
diff --git a/llvm/test/CodeGen/AMDGPU/function-resource-usage.ll b/llvm/test/CodeGen/AMDGPU/function-resource-usage.ll
index d3a6b4e01ebfb8..3c57689781378f 100644
--- a/llvm/test/CodeGen/AMDGPU/function-resource-usage.ll
+++ b/llvm/test/CodeGen/AMDGPU/function-resource-usage.ll
@@ -481,6 +481,79 @@ define amdgpu_kernel void @usage_direct_recursion(i32 %n) #0 {
ret void
}
+; GCN-LABEL: {{^}}multi_stage_recurse2:
+; GCN: .set multi_stage_recurse2.num_vgpr, 41
+; GCN: .set multi_stage_recurse2.num_agpr, 0
+; GCN: .set multi_stage_recurse2.numbered_sgpr, 34
+; GCN: .set multi_stage_recurse2.private_seg_size, 16
+; GCN: .set multi_stage_recurse2.uses_vcc, 1
+; GCN: .set multi_stage_recurse2.uses_flat_scratch, 0
+; GCN: .set multi_stage_recurse2.has_dyn_sized_stack, 0
+; GCN: .set multi_stage_recurse2.has_recursion, 1
+; GCN: .set multi_stage_recurse2.has_indirect_call, 0
+; GCN: TotalNumSgprs: 38
+; GCN: NumVgprs: 41
+; GCN: ScratchSize: 16
+; GCN-LABEL: {{^}}multi_stage_recurse1:
+; GCN: .set multi_stage_recurse1.num_vgpr, 41
+; GCN: .set multi_stage_recurse1.num_agpr, 0
+; GCN: .set multi_stage_recurse1.numbered_sgpr, 34
+; GCN: .set multi_stage_recurse1.private_seg_size, 16
+; GCN: .set multi_stage_recurse1.uses_vcc, 1
+; GCN: .set multi_stage_recurse1.uses_flat_scratch, 0
+; GCN: .set multi_stage_recurse1.has_dyn_sized_stack, 0
+; GCN: .set multi_stage_recurse1.has_recursion, 1
+; GCN: .set multi_stage_recurse1.has_indirect_call, 0
+; GCN: TotalNumSgprs: 38
+; GCN: NumVgprs: 41
+; GCN: ScratchSize: 16
+define void @multi_stage_recurse1(i32 %val) #2 {
+ call void @multi_stage_recurse2(i32 %val)
+ ret void
+}
+define void @multi_stage_recurse2(i32 %val) #2 {
+ call void @multi_stage_recurse1(i32 %val)
+ ret void
+}
+
+; GCN-LABEL: {{^}}usage_multi_stage_recurse:
+; GCN: .set usage_multi_stage_recurse.num_vgpr, max(32, multi_stage_recurse1.num_vgpr)
+; GCN: .set usage_multi_stage_recurse.num_agpr, max(0, multi_stage_recurse1.num_agpr)
+; GCN: .set usage_multi_stage_recurse.numbered_sgpr, max(33, multi_stage_recurse1.numbered_sgpr)
+; GCN: .set usage_multi_stage_recurse.private_seg_size, 0+(max(multi_stage_recurse1.private_seg_size))
+; GCN: .set usage_multi_stage_recurse.uses_vcc, or(1, multi_stage_recurse1.uses_vcc)
+; GCN: .set usage_multi_stage_recurse.uses_flat_scratch, or(1, multi_stage_recurse1.uses_flat_scratch)
+; GCN: .set usage_multi_stage_recurse.has_dyn_sized_stack, or(0, multi_stage_recurse1.has_dyn_sized_stack)
+; GCN: .set usage_multi_stage_recurse.has_recursion, or(1, multi_stage_recurse1.has_recursion)
+; GCN: .set usage_multi_stage_recurse.has_indirect_call, or(0, multi_stage_recurse1.has_indirect_call)
+; GCN: TotalNumSgprs: 40
+; GCN: NumVgprs: 41
+; GCN: ScratchSize: 16
+define amdgpu_kernel void @usage_multi_stage_recurse(i32 %n) #0 {
+ call void @multi_stage_recurse1(i32 %n)
+ ret void
+}
+
+; GCN-LABEL: {{^}}multi_call_with_multi_stage_recurse:
+; GCN: .set multi_call_with_multi_stage_recurse.num_vgpr, max(41, use_stack0.num_vgpr, use_stack1.num_vgpr, multi_stage_recurse1.num_vgpr)
+; GCN: .set multi_call_with_multi_stage_recurse.num_agpr, max(0, use_stack0.num_agpr, use_stack1.num_agpr, multi_stage_recurse1.num_agpr)
+; GCN: .set multi_call_with_multi_stage_recurse.numbered_sgpr, max(43, use_stack0.numbered_sgpr, use_stack1.numbered_sgpr, multi_stage_recurse1.numbered_sgpr)
+; GCN: .set multi_call_with_multi_stage_recurse.private_seg_size, 0+(max(use_stack0.private_seg_size, use_stack1.private_seg_size, multi_stage_recurse1.private_seg_size))
+; GCN: .set multi_call_with_multi_stage_recurse.uses_vcc, or(1, use_stack0.uses_vcc, use_stack1.uses_vcc, multi_stage_recurse1.uses_vcc)
+; GCN: .set multi_call_with_multi_stage_recurse.uses_flat_scratch, or(1, use_stack0.uses_flat_scratch, use_stack1.uses_flat_scratch, multi_stage_recurse1.uses_flat_scratch)
+; GCN: .set multi_call_with_multi_stage_recurse.has_dyn_sized_stack, or(0, use_stack0.has_dyn_sized_stack, use_stack1.has_dyn_sized_stack, multi_stage_recurse1.has_dyn_sized_stack)
+; GCN: .set multi_call_with_multi_stage_recurse.has_recursion, or(1, use_stack0.has_recursion, use_stack1.has_recursion, multi_stage_recurse1.has_recursion)
+; GCN: .set multi_call_with_multi_stage_recurse.has_indirect_call, or(0, use_stack0.has_indirect_call, use_stack1.has_indirect_call, multi_stage_recurse1.has_indirect_call)
+; GCN: TotalNumSgprs: 49
+; GCN: NumVgprs: 41
+; GCN: ScratchSize: 2052
+define amdgpu_kernel void @multi_call_with_multi_stage_recurse(i32 %n) #0 {
+ call void @use_stack0()
+ call void @use_stack1()
+ call void @multi_stage_recurse1(i32 %n)
+ ret void
+}
+
; Make sure there's no assert when a sgpr96 is used.
; GCN-LABEL: {{^}}count_use_sgpr96_external_call
; GCN: .set count_use_sgpr96_external_call.num_vgpr, max(32, amdgpu.max_num_vgpr)
diff --git a/llvm/test/CodeGen/AMDGPU/hsa-metadata-agpr-register-count.ll b/llvm/test/CodeGen/AMDGPU/hsa-metadata-agpr-register-count.ll
index 380a8e911e4995..fdc3f3957b828b 100644
--- a/llvm/test/CodeGen/AMDGPU/hsa-metadata-agpr-register-count.ll
+++ b/llvm/test/CodeGen/AMDGPU/hsa-metadata-agpr-register-count.ll
@@ -98,4 +98,4 @@ bb:
ret void
}
-attributes #0 = { nounwind noinline "amdgpu-flat-work-group-size"="1,512" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { nounwind noinline norecurse "amdgpu-flat-work-group-size"="1,512" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/AMDGPU/ipra.ll b/llvm/test/CodeGen/AMDGPU/ipra.ll
index 957f404c8cdbed..85d23323ec6dec 100644
--- a/llvm/test/CodeGen/AMDGPU/ipra.ll
+++ b/llvm/test/CodeGen/AMDGPU/ipra.ll
@@ -131,6 +131,6 @@ bb:
declare dso_local void @eggs()
-attributes #0 = { nounwind }
-attributes #1 = { nounwind noinline "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { nounwind norecurse }
+attributes #1 = { nounwind noinline norecurse "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
attributes #2 = { norecurse }
diff --git a/llvm/test/CodeGen/AMDGPU/recursion.ll b/llvm/test/CodeGen/AMDGPU/recursion.ll
index c0d228e1254e64..479f46faf79d3e 100644
--- a/llvm/test/CodeGen/AMDGPU/recursion.ll
+++ b/llvm/test/CodeGen/AMDGPU/recursion.ll
@@ -41,11 +41,11 @@ define void @tail_recursive_with_stack() {
; For an arbitrary recursive call, report a large number for unknown stack
; usage for code object v4 and older
; CHECK-LABEL: {{^}}calls_recursive:
-; CHECK: .set calls_recursive.private_seg_size, 0+(max(16384, recursive.private_seg_size))
+; CHECK: .set calls_recursive.private_seg_size, 0+(max(16384))
;
; V5-LABEL: {{^}}calls_recursive:
-; V5: .set calls_recursive.private_seg_size, 0+(max(recursive.private_seg_size))
-; V5: .set calls_recursive.has_dyn_sized_stack, or(0, recursive.has_dyn_sized_stack)
+; V5: .set calls_recursive.private_seg_size, 0
+; V5: .set calls_recursive.has_dyn_sized_stack, 0
define amdgpu_kernel void @calls_recursive() {
call void @recursive()
ret void
@@ -65,22 +65,22 @@ define amdgpu_kernel void @kernel_indirectly_calls_tail_recursive() {
; in the kernel.
; CHECK-LABEL: {{^}}kernel_calls_tail_recursive:
-; CHECK: .set kernel_calls_tail_recursive.private_seg_size, 0+(max(16384, tail_recursive.private_seg_size))
+; CHECK: .set kernel_calls_tail_recursive.private_seg_size, 0+(max(16384))
;
; V5-LABEL: {{^}}kernel_calls_tail_recursive:
-; V5: .set kernel_calls_tail_recursive.private_seg_size, 0+(max(tail_recursive.private_seg_size))
-; V5: .set kernel_calls_tail_recursive.has_recursion, or(1, tail_recursive.has_recursion)
+; V5: .set kernel_calls_tail_recursive.private_seg_size, 0
+; V5: .set kernel_calls_tail_recursive.has_recursion, 1
define amdgpu_kernel void @kernel_calls_tail_recursive() {
call void @tail_recursive()
ret void
}
; CHECK-LABEL: {{^}}kernel_calls_tail_recursive_with_stack:
-; CHECK: .set kernel_calls_tail_recursive_with_stack.private_seg_size, 0+(max(16384, tail_recursive_with_stack.private_seg_size))
+; CHECK: .set kernel_calls_tail_recursive_with_stack.private_seg_size, 0+(max(16384))
;
; V5-LABEL: {{^}}kernel_calls_tail_recursive_with_stack:
-; V5: .set kernel_calls_tail_recursive_with_stack.private_seg_size, 0+(max(tail_recursive_with_stack.private_seg_size))
-; V5: .set kernel_calls_tail_recursive_with_stack.has_dyn_sized_stack, or(0, tail_recursive_with_stack.has_dyn_sized_stack)
+; V5: .set kernel_calls_tail_recursive_with_stack.private_seg_size, 0
+; V5: .set kernel_calls_tail_recursive_with_stack.has_dyn_sized_stack, 0
define amdgpu_kernel void @kernel_calls_tail_recursive_with_stack() {
call void @tail_recursive_with_stack()
ret void
diff --git a/llvm/test/CodeGen/AMDGPU/recursive-mcexpr.ll b/llvm/test/CodeGen/AMDGPU/recursive-mcexpr.ll
new file mode 100644
index 00000000000000..630a0923e31287
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/recursive-mcexpr.ll
@@ -0,0 +1,85 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -verify-machineinstrs -mcpu=gfx90a < %s | FileCheck %s
+
+; CHECK-LABEL: {{^}}qux
+; CHECK: .set qux.num_vgpr, 41
+; CHECK: .set qux.num_agpr, 0
+; CHECK: .set qux.numbered_sgpr, 34
+; CHECK: .set qux.private_seg_size, 16
+; CHECK: .set qux.uses_vcc, 1
+; CHECK: .set qux.uses_flat_scratch, 0
+; CHECK: .set qux.has_dyn_sized_stack, 0
+; CHECK: .set qux.has_recursion, 1
+; CHECK: .set qux.has_indirect_call, 0
+
+; CHECK-LABEL: {{^}}baz
+; CHECK: .set baz.num_vgpr, 42
+; CHECK: .set baz.num_agpr, 0
+; CHECK: .set baz.numbered_sgpr, 34
+; CHECK: .set baz.private_seg_size, 16
+; CHECK: .set baz.uses_vcc, 1
+; CHECK: .set baz.uses_flat_scratch, 0
+; CHECK: .set baz.has_dyn_sized_stack, 0
+; CHECK: .set baz.has_recursion, 1
+; CHECK: .set baz.has_indirect_call, 0
+
+; CHECK-LABEL: {{^}}bar
+; CHECK: .set bar.num_vgpr, 42
+; CHECK: .set bar.num_agpr, 0
+; CHECK: .set bar.numbered_sgpr, 34
+; CHECK: .set bar.private_seg_size, 16
+; CHECK: .set bar.uses_vcc, 1
+; CHECK: .set bar.uses_flat_scratch, 0
+; CHECK: .set bar.has_dyn_sized_stack, 0
+; CHECK: .set bar.has_recursion, 1
+; CHECK: .set bar.has_indirect_call, 0
+
+; CHECK-LABEL: {{^}}foo
+; CHECK: .set foo.num_vgpr, 42
+; CHECK: .set foo.num_agpr, 0
+; CHECK: .set foo.numbered_sgpr, 34
+; CHECK: .set foo.private_seg_size, 16
+; CHECK: .set foo.uses_vcc, 1
+; CHECK: .set foo.uses_flat_scratch, 0
+; CHECK: .set foo.has_dyn_sized_stack, 0
+; CHECK: .set foo.has_recursion, 1
+; CHECK: .set foo.has_indirect_call, 0
+
+define void @foo() {
+entry:
+ call void @bar()
+ ret void
+}
+
+define void @bar() {
+entry:
+ call void @baz()
+ ret void
+}
+
+define void @baz() {
+entry:
+ call void @qux()
+ ret void
+}
+
+define void @qux() {
+entry:
+ call void @foo()
+ ret void
+}
+
+; CHECK-LABEL: {{^}}usefoo
+; CHECK: .set usefoo.num_vgpr, 32
+; CHECK: .set usefoo.num_agpr, 0
+; CHECK: .set usefoo.numbered_sgpr, 33
+; CHECK: .set usefoo.private_seg_size, 0
+; CHECK: .set usefoo.uses_vcc, 1
+; CHECK: .set usefoo.uses_flat_scratch, 1
+; CHECK: .set usefoo.has_dyn_sized_stack, 0
+; CHECK: .set usefoo.has_recursion, 1
+; CHECK: .set usefoo.has_indirect_call, 0
+define amdgpu_kernel void @usefoo() {
+ call void @foo()
+ ret void
+}
+
diff --git a/llvm/test/CodeGen/AMDGPU/sibling-call.ll b/llvm/test/CodeGen/AMDGPU/sibling-call.ll
index 5536a09538e6ee..4639bf4a678d49 100644
--- a/llvm/test/CodeGen/AMDGPU/sibling-call.ll
+++ b/llvm/test/CodeGen/AMDGPU/sibling-call.ll
@@ -468,5 +468,5 @@ entry:
ret <2 x i64> %ret
}
-attributes #0 = { nounwind }
-attributes #1 = { nounwind noinline "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { nounwind norecurse }
+attributes #1 = { nounwind noinline norecurse "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
…ses of the symbol we're trying to define
Took a different approach since the dependence on the norecurse attribute wasn't sitting too well with me (extra burden on new tests, still the possibility of crashing on norecurse with recursing functions). Do let me know if I'm missing some low-hanging fruit in terms of avoiding recursive def/use with MCSymbols.
return findSymbolInExpr(Sym, BExpr->getLHS(), Exprs) ||
       findSymbolInExpr(Sym, BExpr->getRHS(), Exprs);
You have the worklist, but aren't using it to avoid the recursive call? Couldn't this also add duplicate entries?
Correct me if my interpretation of the comment is wrong, but AFAIK it cannot recurse locally within findSymbolInExpr, as Exprs is only populated with symbols that are assigned an MCExpr (which is what we're in the middle of doing for MCSymbol Sym).
Usually if you have to deal with recursive expressions, you track a visited set while walking through the expression, and give up when you encounter already visited entries. You're looking at all the visited expressions each time you visit an expression.
As an example, here's a similar situation for ConstantExprs:
uint8_t getConstantAccess(const Constant *C,
I assume the getConstantAccess example walks over an already completed expression. My approach here is more preventative in terms of recursion, so no recursion should exist at all. I've added a Visited set, but more so to ensure the invariant of no recursion is upheld, and added more comments that hopefully explain that this walk is to avoid recursion rather than detect and remove recursion.
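For reference, a minimal sketch of the kind of visited-set walk under discussion. This is only an approximation of the patch's helper, not the committed code; in particular the AMDGPUMCExpr::getArgs accessor and the exact structure are assumptions:

// Sketch: return true if Sym appears anywhere in Expr, following variable
// symbols into their defining expressions. Visited ensures a shared
// subexpression is only inspected once (and guards against walking a cycle).
static bool findSymbolInExpr(MCSymbol *Sym, const MCExpr *Expr,
                             SmallPtrSetImpl<const MCExpr *> &Visited) {
  if (!Visited.insert(Expr).second)
    return false; // Already inspected this subexpression.
  switch (Expr->getKind()) {
  case MCExpr::SymbolRef: {
    const MCSymbol &S = cast<MCSymbolRefExpr>(Expr)->getSymbol();
    if (&S == Sym)
      return true;
    // Follow symbols that are themselves defined by an expression.
    return S.isVariable() &&
           findSymbolInExpr(Sym, S.getVariableValue(/*SetUsed=*/false),
                            Visited);
  }
  case MCExpr::Unary:
    return findSymbolInExpr(Sym, cast<MCUnaryExpr>(Expr)->getSubExpr(),
                            Visited);
  case MCExpr::Binary: {
    const auto *BE = cast<MCBinaryExpr>(Expr);
    return findSymbolInExpr(Sym, BE->getLHS(), Visited) ||
           findSymbolInExpr(Sym, BE->getRHS(), Visited);
  }
  case MCExpr::Target: {
    // AMDGPU's max()/or() expressions: check every argument.
    for (const MCExpr *Arg : cast<AMDGPUMCExpr>(Expr)->getArgs())
      if (findSymbolInExpr(Sym, Arg, Visited))
        return true;
    return false;
  }
  case MCExpr::Constant:
    return false;
  }
  llvm_unreachable("covered switch");
}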
if (!foundRecursiveSymbolDef(
        Sym, CalleeValSym->getVariableValue(/*isUsed=*/false))) {
  ArgExprs.push_back(MCSymbolRefExpr::create(CalleeValSym, OutContext));
}
Can you track a visited set instead of explicitly checking for recursion
I may not understand what's suggested with a visited set, but I've attempted one using the Seen set above; that can't cover the case of multi-function recursion since Seen only tracks single-level callees. If instead a visited set is used while walking over the expression, it may return true for callers that aren't on the recursive path, whereas I want to find cases where the currently assigned function's symbol is somewhere in the uses of its callees.
// The (partially complete) expression should have no recursion in it. After
// all, we're trying to avoid recursion using this codepath.
Document return value
    ArgExprs.push_back(MCSymbolRefExpr::create(CalleeValSym, OutContext));
  }
} else {
  ArgExprs.push_back(MCSymbolRefExpr::create(CalleeValSym, OutContext));
There are 2 paths to this push_back, so merge the push_back into one condition
    ArgExprs.push_back(
        MCSymbolRefExpr::create(calleeValSym, OutContext));
  }
} else {
Same, can turn this into one conditional push_back?
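A sketch of the single-condition form being suggested here; this is a hypothetical restructuring of the snippets above, not the committed code:

// Reference the callee's resource symbol unless doing so would make Sym's
// definition cyclic. The two former push_back paths collapse into one guard:
// an undefined callee symbol can always be referenced, a defined one only if
// its (partial) value does not already use Sym.
bool CanReferenceCallee =
    !CalleeValSym->isVariable() ||
    !foundRecursiveSymbolDef(
        Sym, CalleeValSym->getVariableValue(/*isUsed=*/false));
if (CanReferenceCallee)
  ArgExprs.push_back(MCSymbolRefExpr::create(CalleeValSym, OutContext));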
    SmallPtrSetImpl<const MCExpr *> &Visited) {
  // Assert if any of the expressions is already visited (i.e., there is
  // existing recursion).
  if (!Visited.insert(Expr).second)
What's the advantage of this over assert? Is it because llvm_unreachable can be enabled separately?
Doesn't require dealing with the annoying variable-only-used-in-asserts-builds problem.
I guess the compiler will optimize out the unused variable if llvm_unreachable is a no-op. So it's

if (!Visited.insert(Expr).second)
  llvm_unreachable("already visited expression");

vs.

[[maybe_unused]] bool Inserted = Visited.insert(Expr).second;
assert(Inserted && "already visited expression");

Roughly equivalent, so I guess it's up to taste.
✅ With the latest revision this PR passed the C/C++ code formatter.

Fix clang-format error
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/123/builds/7371

LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/8082

LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/6942
…tiple functions

I was wrong last patch. I viewed the visited set purely as a possible recursion deterrent where functions calling a callee multiple times are handled elsewhere. This wouldn't consider cases where a function is called multiple times by different callers still part of the same call graph. New test shows the aforementioned case. Reapplies llvm#111004

…nctions (llvm#111004)

Avoid constructing recursive MCExpr definitions when multiple functions cause a recursion. Fixes llvm#110863

…tiple functions" (llvm#112013)

Reverts llvm#111004

…tiple functions (#112251)

I was wrong last patch. I viewed the `Visited` set purely as a possible recursion deterrent where functions calling a callee multiple times are handled elsewhere. This doesn't consider cases where a function is called multiple times by different callers still part of the same call graph. New test shows the aforementioned case. Reapplies #111004, fixes #115562.
…ass (llvm#102913)

Converts AMDGPUResourceUsageAnalysis pass from Module to MachineFunction pass. Moves function resource info propagation to the MC layer (through helpers in AMDGPUMCResourceInfo) by generating MCExprs for every function resource which the emitters have been prepped for. Fixes llvm#64863

[AMDGPU] Fix stack size metadata for functions with direct and indirect calls (llvm#110828)

When a function has an external call, it should still use the stack sizes of direct, known calls to calculate its own stack size.

[AMDGPU] Fix resource usage information for unnamed functions (llvm#115320)

Resource usage information would try to overwrite unnamed functions if there are multiple within the same compilation unit. This aims to either use the `MCSymbol` assigned to the unnamed function (i.e., `CurrentFnSym`), or rematerialize the `MCSymbol` for the unnamed function.

Reapply [AMDGPU] Avoid resource propagation for recursion through multiple functions (llvm#112251)

I was wrong last patch. I viewed the `Visited` set purely as a possible recursion deterrent where functions calling a callee multiple times are handled elsewhere. This doesn't consider cases where a function is called multiple times by different callers still part of the same call graph. New test shows the aforementioned case. Reapplies llvm#111004, fixes llvm#115562.

[AMDGPU] Newly added test modified for recent SGPR use change (llvm#116427)

Mistimed rebase for llvm#112251 which added new tests which did not consider the changes introduced in llvm#112403 yet

Change-Id: I4dfe6a1b679137e080a6d2b44016347ea704b014
Avoid constructing recursive MCExpr definitions when multiple functions cause a recursion.
Fixes #110863
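For illustration, a minimal sketch of the cyclic definition this patch avoids, written against the generic MC API (the symbol names and the helper function are hypothetical):

#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCSymbol.h"

using namespace llvm;

// If foo and bar call each other, naive resource propagation would define
// each function's resource symbol in terms of the other, i.e. the moral
// equivalent of emitting:
//   .set foo.num_vgpr, max(32, bar.num_vgpr)
//   .set bar.num_vgpr, max(32, foo.num_vgpr)
// Evaluating either symbol then never terminates.
void buildCyclicDefs(MCContext &Ctx) {
  MCSymbol *FooVgpr = Ctx.getOrCreateSymbol("foo.num_vgpr");
  MCSymbol *BarVgpr = Ctx.getOrCreateSymbol("bar.num_vgpr");
  FooVgpr->setVariableValue(MCSymbolRefExpr::create(BarVgpr, Ctx));
  BarVgpr->setVariableValue(MCSymbolRefExpr::create(FooVgpr, Ctx));
}

The patch breaks such cycles by skipping the reference to a callee's symbol whenever that callee's partially built definition already uses the symbol currently being defined.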