[AArch64] Do not split bfloat HFA args between regs and stack #128909

Merged (2 commits) on Feb 27, 2025

Conversation

ostannard
Collaborator

In AAPCS64, __fp16 and __bf16 share the same machine type, so they should be treated the same way for argument passing. In particular, arrays of them need to be treated as homogeneous aggregates, and not split between registers and the stack.

@llvmbot
Member

llvmbot commented Feb 26, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Oliver Stannard (ostannard)

Changes

In AAPCS64, __fp16 and __bf16 share the same machine type, so they should be treated the same way for argument passing. In particular, arrays of them need to be treated as homogeneous aggregates, and not split between registers and the stack.


Full diff: https://github.com/llvm/llvm-project/pull/128909.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64CallingConvention.cpp (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/argument-blocks.ll (+7)
diff --git a/llvm/lib/Target/AArch64/AArch64CallingConvention.cpp b/llvm/lib/Target/AArch64/AArch64CallingConvention.cpp
index 991d710c979b9..787a1a83613c9 100644
--- a/llvm/lib/Target/AArch64/AArch64CallingConvention.cpp
+++ b/llvm/lib/Target/AArch64/AArch64CallingConvention.cpp
@@ -142,7 +142,7 @@ static bool CC_AArch64_Custom_Block(unsigned &ValNo, MVT &ValVT, MVT &LocVT,
   ArrayRef<MCPhysReg> RegList;
   if (LocVT.SimpleTy == MVT::i64 || (IsDarwinILP32 && LocVT.SimpleTy == MVT::i32))
     RegList = XRegList;
-  else if (LocVT.SimpleTy == MVT::f16)
+  else if (LocVT.SimpleTy == MVT::f16 || LocVT.SimpleTy == MVT::bf16)
     RegList = HRegList;
   else if (LocVT.SimpleTy == MVT::f32 || LocVT.is32BitVector())
     RegList = SRegList;
diff --git a/llvm/test/CodeGen/AArch64/argument-blocks.ll b/llvm/test/CodeGen/AArch64/argument-blocks.ll
index b5374ca8ced53..8cef28a38970d 100644
--- a/llvm/test/CodeGen/AArch64/argument-blocks.ll
+++ b/llvm/test/CodeGen/AArch64/argument-blocks.ll
@@ -195,3 +195,10 @@ define half @test_f16_blocked([7 x double], [2 x half] %in) {
   %val = extractvalue [2 x half] %in, 0
   ret half %val
 }
+
+define bfloat @test_bf16_blocked([7 x double], [2 x bfloat] %in) {
+; CHECK-LABEL: test_bf16_blocked:
+; CHECK: ldr h0, [sp]
+  %val = extractvalue [2 x bfloat] %in, 0
+  ret bfloat %val
+}
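
The `ldr h0, [sp]` expected by the new test follows from the AAPCS64 rule that a homogeneous floating-point aggregate either gets one SIMD/FP register per member or is passed entirely on the stack. Below is a toy C++ model of that rule, not LLVM code: `FpArgState`, `Placement`, and `placeHfa` are hypothetical names introduced for illustration, and the model ignores integer registers and any stack alignment beyond rounding sizes up to 8 bytes.

```cpp
#include <cassert>
#include <cstddef>

// Toy model (assumption, not LLVM code): tracks only the AAPCS64
// "next SIMD/FP register number" and the stack argument bytes used.
struct FpArgState {
  unsigned nsrn = 0;          // next free FP argument register index, 0..8
  std::size_t stackBytes = 0; // bytes of the stack argument area used so far
};

enum class Placement { Registers, Stack };

constexpr unsigned kNumFpArgRegs = 8; // v0-v7

// Places a homogeneous floating-point aggregate with `members` elements of
// `elemBytes` bytes each. The aggregate is never split: it either takes
// `members` consecutive registers or goes to the stack as a whole.
Placement placeHfa(FpArgState &state, unsigned members, std::size_t elemBytes) {
  assert(members >= 1 && members <= 4 && "an HFA has one to four members");
  if (state.nsrn + members <= kNumFpArgRegs) {
    state.nsrn += members; // one register per member
    return Placement::Registers;
  }
  // Not enough registers left: mark all FP argument registers as used so
  // later FP arguments cannot back-fill, then round the size up to 8 bytes.
  state.nsrn = kNumFpArgRegs;
  std::size_t size = members * elemBytes;
  state.stackBytes += (size + 7) / 8 * 8;
  return Placement::Stack;
}
```

In this model, once seven doubles occupy d0 to d6, calling `placeHfa(state, 2, 2)` for the `[2 x bfloat]` argument returns `Placement::Stack`, which is consistent with the test loading element 0 from `[sp]`.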

Collaborator

@davemgreen davemgreen left a comment

LGTM

@ostannard ostannard merged commit fd534e5 into llvm:main Feb 27, 2025
9 of 11 checks passed
joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request Mar 3, 2025
…28909)