Skip to content

[HLSL] Treat classes and structs as packed by default #137391

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bogner
Copy link
Contributor

@bogner bogner commented Apr 25, 2025

Fixes #121010.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" HLSL HLSL Language Support labels Apr 25, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 25, 2025

@llvm/pr-subscribers-clang

Author: Justin Bogner (bogner)

Changes

Fixes #121010.


Patch is 133.63 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/137391.diff

25 Files Affected:

  • (modified) clang/lib/Sema/SemaDecl.cpp (+4)
  • (modified) clang/lib/Sema/SemaTemplate.cpp (+4)
  • (modified) clang/test/CodeGenHLSL/ArrayAssignable.hlsl (+3-3)
  • (modified) clang/test/CodeGenHLSL/ArrayTemporary.hlsl (+3-3)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/AggregateSplatCast.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/ArrayElementwiseCast.hlsl (+4-5)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl (+232-252)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl (+11-17)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/StructElementwiseCast.hlsl (+17-17)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/VectorElementwiseCast.hlsl (+8-8)
  • (modified) clang/test/CodeGenHLSL/BoolVector.hlsl (+9-9)
  • (modified) clang/test/CodeGenHLSL/builtins/AppendStructuredBuffer-elementtype.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/builtins/ConsumeStructuredBuffer-elementtype.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/builtins/RasterizerOrderedStructuredBuffer-elementtype.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/builtins/StructuredBuffers-subscripts.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/hlsl_resource_t.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/cbuffer.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/cbuffer_and_namespaces.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/default_cbuffer_with_layout.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/sret_output.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/this-assignment-overload.hlsl (+13-13)
  • (modified) clang/test/CodeGenHLSL/this-assignment.hlsl (+8-8)
  • (modified) clang/test/CodeGenHLSL/this-reference.hlsl (+3-3)
  • (added) clang/test/SemaHLSL/PackedStruct.hlsl (+13)
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index fe61b92e087d7..411ece25b4344 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -17567,6 +17567,8 @@ Sema::ActOnTag(Scope *S, unsigned TagSpec, TagUseKind TUK, SourceLocation KWLoc,
       // parsing of the struct).
       if (TUK == TagUseKind::Definition &&
           (!SkipBody || !SkipBody->ShouldSkip)) {
+        if (LangOpts.HLSL)
+          RD->addAttr(PackedAttr::CreateImplicit(Context));
         AddAlignmentAttributesForRecord(RD);
         AddMsStructLayoutForRecord(RD);
       }
@@ -18257,6 +18259,8 @@ Sema::ActOnTag(Scope *S, unsigned TagSpec, TagUseKind TUK, SourceLocation KWLoc,
     // the #pragma tokens are effectively skipped over during the
     // parsing of the struct).
     if (TUK == TagUseKind::Definition && (!SkipBody || !SkipBody->ShouldSkip)) {
+      if (LangOpts.HLSL)
+        RD->addAttr(PackedAttr::CreateImplicit(Context));
       AddAlignmentAttributesForRecord(RD);
       AddMsStructLayoutForRecord(RD);
     }
diff --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp
index 894f072d84989..e5c3fc536a281 100644
--- a/clang/lib/Sema/SemaTemplate.cpp
+++ b/clang/lib/Sema/SemaTemplate.cpp
@@ -2116,6 +2116,8 @@ DeclResult Sema::CheckClassTemplate(
   // Add alignment attributes if necessary; these attributes are checked when
   // the ASTContext lays out the structure.
   if (TUK == TagUseKind::Definition && (!SkipBody || !SkipBody->ShouldSkip)) {
+    if (LangOpts.HLSL)
+      NewClass->addAttr(PackedAttr::CreateImplicit(Context));
     AddAlignmentAttributesForRecord(NewClass);
     AddMsStructLayoutForRecord(NewClass);
   }
@@ -8655,6 +8657,8 @@ DeclResult Sema::ActOnClassTemplateSpecialization(
   // Add alignment attributes if necessary; these attributes are checked when
   // the ASTContext lays out the structure.
   if (TUK == TagUseKind::Definition && (!SkipBody || !SkipBody->ShouldSkip)) {
+    if (LangOpts.HLSL)
+      Specialization->addAttr(PackedAttr::CreateImplicit(Context));
     AddAlignmentAttributesForRecord(Specialization);
     AddMsStructLayoutForRecord(Specialization);
   }
diff --git a/clang/test/CodeGenHLSL/ArrayAssignable.hlsl b/clang/test/CodeGenHLSL/ArrayAssignable.hlsl
index 6374f91230546..c3204570d6ef3 100644
--- a/clang/test/CodeGenHLSL/ArrayAssignable.hlsl
+++ b/clang/test/CodeGenHLSL/ArrayAssignable.hlsl
@@ -10,7 +10,7 @@ struct S {
 // CHECK: @c1 = external addrspace(2) global [2 x float], align 4
 // CHECK: @c2 = external addrspace(2) global [2 x <4 x i32>], align 16
 // CHECK: @c3 = external addrspace(2) global [2 x [2 x i32]], align 4
-// CHECK: @c4 = external addrspace(2) global [1 x target("dx.Layout", %S, 8, 0, 4)], align 4
+// CHECK: @c4 = external addrspace(2) global [1 x target("dx.Layout", %S, 8, 0, 4)], align 1
 
 cbuffer CBArrays : register(b0) {
   float c1[2];
@@ -169,8 +169,8 @@ void arr_assign10() {
 }
 
 // CHECK-LABEL: define void {{.*}}arr_assign11
-// CHECK: [[C:%.*]] = alloca [1 x %struct.S], align 4
-// CHECK: call void @llvm.memcpy.p0.p2.i32(ptr align 4 [[C]], ptr addrspace(2) align 4 @c4, i32 8, i1 false)
+// CHECK: [[C:%.*]] = alloca [1 x %struct.S], align 1
+// CHECK: call void @llvm.memcpy.p0.p2.i32(ptr align 1 [[C]], ptr addrspace(2) align 1 @c4, i32 8, i1 false)
 // CHECK-NEXT: ret void
 void arr_assign11() {
   S s = {1, 2.0};
diff --git a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
index 91a283554459d..29ea896045bb1 100644
--- a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
+++ b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
@@ -24,9 +24,9 @@ void fn2(Obj O[4]) { }
 // CHECK-LABEL: define void {{.*}}call2{{.*}}
 // CHECK: [[Arr:%.*]] = alloca [4 x %struct.Obj]
 // CHECK: [[Tmp:%.*]] = alloca [4 x %struct.Obj]
-// CHECK: call void @llvm.memset.p0.i32(ptr align 4 [[Arr]], i8 0, i32 32, i1 false)
-// CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp]], ptr align 4 [[Arr]], i32 32, i1 false)
-// CHECK: call void {{.*}}fn2{{.*}}(ptr noundef byval([4 x %struct.Obj]) align 4 [[Tmp]])
+// CHECK: call void @llvm.memset.p0.i32(ptr align 1 [[Arr]], i8 0, i32 32, i1 false)
+// CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 1 [[Tmp]], ptr align 1 [[Arr]], i32 32, i1 false)
+// CHECK: call void {{.*}}fn2{{.*}}(ptr noundef byval([4 x %struct.Obj]) align 1 [[Tmp]])
 void call2() {
   Obj Arr[4] = {{0, 0}, {0, 0}, {0, 0}, {0, 0}};
   fn2(Arr);
diff --git a/clang/test/CodeGenHLSL/BasicFeatures/AggregateSplatCast.hlsl b/clang/test/CodeGenHLSL/BasicFeatures/AggregateSplatCast.hlsl
index 42b6abec1b3d8..512fcd435191a 100644
--- a/clang/test/CodeGenHLSL/BasicFeatures/AggregateSplatCast.hlsl
+++ b/clang/test/CodeGenHLSL/BasicFeatures/AggregateSplatCast.hlsl
@@ -55,7 +55,7 @@ struct S {
 // struct splats
 // CHECK-LABEL: define void {{.*}}call3
 // CHECK: [[A:%.*]] = alloca <1 x i32>, align 4
-// CHECK: [[s:%.*]] = alloca %struct.S, align 4
+// CHECK: [[s:%.*]] = alloca %struct.S, align 1
 // CHECK-NEXT: store <1 x i32> splat (i32 1), ptr [[A]], align 4
 // CHECK-NEXT: [[L:%.*]] = load <1 x i32>, ptr [[A]], align 4
 // CHECK-NEXT: [[VL:%.*]] = extractelement <1 x i32> [[L]], i32 0
@@ -72,7 +72,7 @@ export void call3() {
 // struct splat from vector of length 1
 // CHECK-LABEL: define void {{.*}}call5
 // CHECK: [[A:%.*]] = alloca <1 x i32>, align 4
-// CHECK-NEXT: [[s:%.*]] = alloca %struct.S, align 4
+// CHECK-NEXT: [[s:%.*]] = alloca %struct.S, align 1
 // CHECK-NEXT: store <1 x i32> splat (i32 1), ptr [[A]], align 4
 // CHECK-NEXT: [[L:%.*]] = load <1 x i32>, ptr [[A]], align 4
 // CHECK-NEXT: [[VL:%.*]] = extractelement <1 x i32> [[L]], i32 0
diff --git a/clang/test/CodeGenHLSL/BasicFeatures/ArrayElementwiseCast.hlsl b/clang/test/CodeGenHLSL/BasicFeatures/ArrayElementwiseCast.hlsl
index 18f82bff3b308..ac02ddf5765ed 100644
--- a/clang/test/CodeGenHLSL/BasicFeatures/ArrayElementwiseCast.hlsl
+++ b/clang/test/CodeGenHLSL/BasicFeatures/ArrayElementwiseCast.hlsl
@@ -125,12 +125,12 @@ struct S {
 
 // flatten and truncate from a struct
 // CHECK-LABEL: define void {{.*}}call7
-// CHECK: [[s:%.*]] = alloca %struct.S, align 4
+// CHECK: [[s:%.*]] = alloca %struct.S, align 1
 // CHECK-NEXT: [[A:%.*]] = alloca [1 x i32], align 4
-// CHECK-NEXT: [[Tmp:%.*]] = alloca %struct.S, align 4
-// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[s]], ptr align 4 {{.*}}, i32 8, i1 false)
+// CHECK-NEXT: [[Tmp:%.*]] = alloca %struct.S, align 1
+// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 1 [[s]], ptr align 1 {{.*}}, i32 8, i1 false)
 // CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[A]], ptr align 4 {{.*}}, i32 4, i1 false)
-// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp]], ptr align 4 [[s]], i32 8, i1 false)
+// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 1 [[Tmp]], ptr align 1 [[s]], i32 8, i1 false)
 // CHECK-NEXT: [[G1:%.*]] = getelementptr inbounds [1 x i32], ptr [[A]], i32 0, i32 0
 // CHECK-NEXT: [[G2:%.*]] = getelementptr inbounds %struct.S, ptr [[Tmp]], i32 0, i32 0
 // CHECK-NEXT: [[G3:%.*]] = getelementptr inbounds %struct.S, ptr [[Tmp]], i32 0, i32 1
@@ -141,4 +141,3 @@ export void call7() {
   int A[1] = {1};
   A = (int[1])s;
 }
-
diff --git a/clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl b/clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl
index a0590162c7087..d04583e4fc51a 100644
--- a/clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl
+++ b/clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl
@@ -47,9 +47,9 @@ struct SlicyBits {
 
 // Case 1: Extraneous braces get ignored in literal instantiation.
 // CHECK-LABEL: define void @_Z5case1v(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 1 [[AGG_RESULT:%.*]]) #[[ATTR0:[0-9]+]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
-// CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[AGG_RESULT]], ptr align 4 @__const._Z5case1v.TF1, i32 8, i1 false)
+// CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i32(ptr align 1 [[AGG_RESULT]], ptr align 1 @__const._Z5case1v.TF1, i32 8, i1 false)
 // CHECK-NEXT:    ret void
 //
 TwoFloats case1() {
@@ -59,9 +59,9 @@ TwoFloats case1() {
 
 // Case 2: Valid C/C++ initializer is handled appropriately.
 // CHECK-LABEL: define void @_Z5case2v(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 1 [[AGG_RESULT:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
-// CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[AGG_RESULT]], ptr align 4 @__const._Z5case2v.TF2, i32 8, i1 false)
+// CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i32(ptr align 1 [[AGG_RESULT]], ptr align 1 @__const._Z5case2v.TF2, i32 8, i1 false)
 // CHECK-NEXT:    ret void
 //
 TwoFloats case2() {
@@ -71,16 +71,16 @@ TwoFloats case2() {
 
 // Case 3: Simple initialization with conversion of an argument.
 // CHECK-LABEL: define void @_Z5case3i(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]], i32 noundef [[VAL:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 1 [[AGG_RESULT:%.*]], i32 noundef [[VAL:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    [[VAL_ADDR:%.*]] = alloca i32, align 4
 // CHECK-NEXT:    store i32 [[VAL]], ptr [[VAL_ADDR]], align 4
 // CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 0
 // CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
 // CHECK-NEXT:    [[CONV:%.*]] = sitofp i32 [[TMP0]] to float
-// CHECK-NEXT:    store float [[CONV]], ptr [[X]], align 4
+// CHECK-NEXT:    store float [[CONV]], ptr [[X]], align 1
 // CHECK-NEXT:    [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 1
-// CHECK-NEXT:    store float 2.000000e+00, ptr [[Y]], align 4
+// CHECK-NEXT:    store float 2.000000e+00, ptr [[Y]], align 1
 // CHECK-NEXT:    ret void
 //
 TwoFloats case3(int Val) {
@@ -91,7 +91,7 @@ TwoFloats case3(int Val) {
 // Case 4: Initialization from a scalarized vector into a structure with element
 // conversions.
 // CHECK-LABEL: define void @_Z5case4Dv2_i(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 1 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    [[TWOVALS_ADDR:%.*]] = alloca <2 x i32>, align 8
 // CHECK-NEXT:    store <2 x i32> [[TWOVALS]], ptr [[TWOVALS_ADDR]], align 8
@@ -99,12 +99,12 @@ TwoFloats case3(int Val) {
 // CHECK-NEXT:    [[TMP0:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
 // CHECK-NEXT:    [[VECEXT:%.*]] = extractelement <2 x i32> [[TMP0]], i64 0
 // CHECK-NEXT:    [[CONV:%.*]] = sitofp i32 [[VECEXT]] to float
-// CHECK-NEXT:    store float [[CONV]], ptr [[X]], align 4
+// CHECK-NEXT:    store float [[CONV]], ptr [[X]], align 1
 // CHECK-NEXT:    [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 1
 // CHECK-NEXT:    [[TMP1:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
 // CHECK-NEXT:    [[VECEXT1:%.*]] = extractelement <2 x i32> [[TMP1]], i64 1
 // CHECK-NEXT:    [[CONV2:%.*]] = sitofp i32 [[VECEXT1]] to float
-// CHECK-NEXT:    store float [[CONV2]], ptr [[Y]], align 4
+// CHECK-NEXT:    store float [[CONV2]], ptr [[Y]], align 1
 // CHECK-NEXT:    ret void
 //
 TwoFloats case4(int2 TwoVals) {
@@ -114,18 +114,18 @@ TwoFloats case4(int2 TwoVals) {
 
 // Case 5: Initialization from a scalarized vector of matching type.
 // CHECK-LABEL: define void @_Z5case5Dv2_i(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 4 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 1 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    [[TWOVALS_ADDR:%.*]] = alloca <2 x i32>, align 8
 // CHECK-NEXT:    store <2 x i32> [[TWOVALS]], ptr [[TWOVALS_ADDR]], align 8
 // CHECK-NEXT:    [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 0
 // CHECK-NEXT:    [[TMP0:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
 // CHECK-NEXT:    [[VECEXT:%.*]] = extractelement <2 x i32> [[TMP0]], i64 0
-// CHECK-NEXT:    store i32 [[VECEXT]], ptr [[Z]], align 4
+// CHECK-NEXT:    store i32 [[VECEXT]], ptr [[Z]], align 1
 // CHECK-NEXT:    [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 1
 // CHECK-NEXT:    [[TMP1:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
 // CHECK-NEXT:    [[VECEXT1:%.*]] = extractelement <2 x i32> [[TMP1]], i64 1
-// CHECK-NEXT:    store i32 [[VECEXT1]], ptr [[W]], align 4
+// CHECK-NEXT:    store i32 [[VECEXT1]], ptr [[W]], align 1
 // CHECK-NEXT:    ret void
 //
 TwoInts case5(int2 TwoVals) {
@@ -136,18 +136,18 @@ TwoInts case5(int2 TwoVals) {
 // Case 6: Initialization from a scalarized structure of different type with
 // different element types.
 // CHECK-LABEL: define void @_Z5case69TwoFloats(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 4 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 4 [[TF4:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 1 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 1 [[TF4:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 0
 // CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF4]], i32 0, i32 0
-// CHECK-NEXT:    [[TMP0:%.*]] = load float, ptr [[X]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load float, ptr [[X]], align 1
 // CHECK-NEXT:    [[CONV:%.*]] = fptosi float [[TMP0]] to i32
-// CHECK-NEXT:    store i32 [[CONV]], ptr [[Z]], align 4
+// CHECK-NEXT:    store i32 [[CONV]], ptr [[Z]], align 1
 // CHECK-NEXT:    [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 1
 // CHECK-NEXT:    [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF4]], i32 0, i32 1
-// CHECK-NEXT:    [[TMP1:%.*]] = load float, ptr [[Y]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = load float, ptr [[Y]], align 1
 // CHECK-NEXT:    [[CONV1:%.*]] = fptosi float [[TMP1]] to i32
-// CHECK-NEXT:    store i32 [[CONV1]], ptr [[W]], align 4
+// CHECK-NEXT:    store i32 [[CONV1]], ptr [[W]], align 1
 // CHECK-NEXT:    ret void
 //
 TwoInts case6(TwoFloats TF4) {
@@ -158,59 +158,59 @@ TwoInts case6(TwoFloats TF4) {
 // Case 7: Initialization of a complex structure, with bogus braces and element
 // conversions from a collection of scalar values, and structures.
 // CHECK-LABEL: define void @_Z5case77TwoIntsS_i9TwoFloatsS0_S0_S0_(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_DOGGO:%.*]]) align 16 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOINTS:%.*]]) align 4 [[TI1:%.*]], ptr noundef byval([[STRUCT_TWOINTS]]) align 4 [[TI2:%.*]], i32 noundef [[VAL:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 4 [[TF1:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF2:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF3:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF4:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_DOGGO:%.*]]) align 1 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOINTS:%.*]]) align 1 [[TI1:%.*]], ptr noundef byval([[STRUCT_TWOINTS]]) align 1 [[TI2:%.*]], i32 noundef [[VAL:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 1 [[TF1:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 1 [[TF2:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 1 [[TF3:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 1 [[TF4:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    [[VAL_ADDR:%.*]] = alloca i32, align 4
 // CHECK-NEXT:    store i32 [[VAL]], ptr [[VAL_ADDR]], align 4
 // CHECK-NEXT:    [[LEGSTATE:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 0
 // CHECK-NEXT:    [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI1]], i32 0, i32 0
-// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[Z]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[Z]], align 1
 // CHECK-NEXT:    [[VECINIT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i32 0
 // CHECK-NEXT:    [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI1]], i32 0, i32 1
-// CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[W]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[W]], align 1
 // CHECK-NEXT:    [[VECINIT1:%.*]] = insertelement <4 x i32> [[VECINIT]], i32 [[TMP1]], i32 1
 // CHECK-NEXT:    [[Z2:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI2]], i32 0, i32 0
-// CHECK-NEXT:    [[TMP2:%.*]] = load i32, ptr [[Z2]], align 4
+// CHECK-NEXT:    [[TMP2:%.*]] = load i32, ptr [[Z2]], align 1
 // CHECK-NEXT:    [[VECINIT3:%.*]] = insertelement <4 x i32> [[VECINIT1]], i32 [[TMP2]], i32 2
 // CHECK-NEXT:    [[W4:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI2]], i32 0, i32 1
-// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[W4]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[W4]], align 1
 // CHECK-NEXT:    [[VECINIT5:%.*]] = insertelement <4 x i32> [[VECINIT3]], i32 [[TMP3]], i32 3
-// CHECK-NEXT:    store <4 x i32> [[VECINIT5]], ptr [[LEGSTATE]], align 16
+// CHECK-NEXT:    store <4 x i32> [[VECINIT5]], ptr [[LEGSTATE]], align 1
 // CHECK-NEXT:    [[TAILSTATE:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 1
 // CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
-// CHECK-NEXT:    store i32 [[TMP4]], ptr [[TAILSTATE]], align 16
+// CHECK-NEXT:    store i32 [[TMP4]], ptr [[TAILSTATE]], align 1
 // CHECK-NEXT:    [[HAIRCOUNT:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 2
 // CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
 // CHECK-NEXT:    [[CONV:%.*]] = sitofp i32 [[TMP5]] to float
-// CHECK-NEXT:    store float [[CONV]], ptr [[HAIRCOUNT]], align 4
+// CHECK-NEXT:    store float [[CONV]], ptr [[HAIRCOUNT]], align 1
 // CHECK-NEXT:    [[EARDIRECTION:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 3
 // CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF1]], i32 0, i32 0
-// CHECK-NEXT:    [[TMP6:%.*]] = load float, ptr [[X]], align 4
+// CHECK-NEXT:    [[TMP6:%.*]] = load float, ptr [[X]], align 1
 // CHECK-NEXT:    [[VECINIT6:%.*]] = insertelement <4 x float> poison, float [[TMP6]], i32 0
 // CHECK-NEXT:    [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF1]], i32 0, i32 1
-// CHECK-NEXT:    [[TMP7:%.*]] = load float, ptr [[Y]], align 4
+// CHECK-NEXT:    [[TMP7:%.*]] = load float, ptr [[Y]], align 1
 // CHECK-NEXT:    [[VECINIT7:%.*]] = insertelement <4 x float> [[VECINIT6]], float [[TMP7]], i32 1
 // CHECK-NEXT:    [[X8:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF2]], i32 0, i32 0
-// CHECK-NEXT:    [[TMP8:%.*]] = load float, ptr [[X8]], align 4
+// CHECK-NEXT:    [[TMP8:%.*]] = load float, ptr [[X8]], align 1
 // CHECK-NEXT:    [[VECINIT9:%.*]] = insertelement <4 x float> [[VECINIT7]], float [[TMP8]], i32 2
 // CHECK-NEXT:    [[Y10:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF2]], i32 0, i32 1
-// CHECK-NEXT:    [[TMP9:%.*]] = load float, ptr [[Y10]], align 4
+// CHECK-NEXT:    [[TMP9:%.*]] = load float, ptr [[Y10]], align 1
 // CHECK-NEXT:    [[VECINIT11:%.*]] = insertelement <4 x float> [[VECINIT9]], float [[TMP9]], i32 3
-// CHECK-NEXT:    store <4 x float> [[VECINIT11]], ptr [[EARDIRECTION]], align 16
+// CHECK-NEXT:    store <4 x float> [[VECINIT11]], pt...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Apr 25, 2025

@llvm/pr-subscribers-hlsl

Author: Justin Bogner (bogner)

Changes

Fixes #121010.


Patch is 133.63 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/137391.diff

25 Files Affected:

  • (modified) clang/lib/Sema/SemaDecl.cpp (+4)
  • (modified) clang/lib/Sema/SemaTemplate.cpp (+4)
  • (modified) clang/test/CodeGenHLSL/ArrayAssignable.hlsl (+3-3)
  • (modified) clang/test/CodeGenHLSL/ArrayTemporary.hlsl (+3-3)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/AggregateSplatCast.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/ArrayElementwiseCast.hlsl (+4-5)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl (+232-252)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl (+11-17)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/StructElementwiseCast.hlsl (+17-17)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/VectorElementwiseCast.hlsl (+8-8)
  • (modified) clang/test/CodeGenHLSL/BoolVector.hlsl (+9-9)
  • (modified) clang/test/CodeGenHLSL/builtins/AppendStructuredBuffer-elementtype.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/builtins/ConsumeStructuredBuffer-elementtype.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/builtins/RasterizerOrderedStructuredBuffer-elementtype.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/builtins/StructuredBuffers-subscripts.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/hlsl_resource_t.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/cbuffer.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/cbuffer_and_namespaces.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/default_cbuffer_with_layout.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/sret_output.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/this-assignment-overload.hlsl (+13-13)
  • (modified) clang/test/CodeGenHLSL/this-assignment.hlsl (+8-8)
  • (modified) clang/test/CodeGenHLSL/this-reference.hlsl (+3-3)
  • (added) clang/test/SemaHLSL/PackedStruct.hlsl (+13)
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index fe61b92e087d7..411ece25b4344 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -17567,6 +17567,8 @@ Sema::ActOnTag(Scope *S, unsigned TagSpec, TagUseKind TUK, SourceLocation KWLoc,
       // parsing of the struct).
       if (TUK == TagUseKind::Definition &&
           (!SkipBody || !SkipBody->ShouldSkip)) {
+        if (LangOpts.HLSL)
+          RD->addAttr(PackedAttr::CreateImplicit(Context));
         AddAlignmentAttributesForRecord(RD);
         AddMsStructLayoutForRecord(RD);
       }
@@ -18257,6 +18259,8 @@ Sema::ActOnTag(Scope *S, unsigned TagSpec, TagUseKind TUK, SourceLocation KWLoc,
     // the #pragma tokens are effectively skipped over during the
     // parsing of the struct).
     if (TUK == TagUseKind::Definition && (!SkipBody || !SkipBody->ShouldSkip)) {
+      if (LangOpts.HLSL)
+        RD->addAttr(PackedAttr::CreateImplicit(Context));
       AddAlignmentAttributesForRecord(RD);
       AddMsStructLayoutForRecord(RD);
     }
diff --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp
index 894f072d84989..e5c3fc536a281 100644
--- a/clang/lib/Sema/SemaTemplate.cpp
+++ b/clang/lib/Sema/SemaTemplate.cpp
@@ -2116,6 +2116,8 @@ DeclResult Sema::CheckClassTemplate(
   // Add alignment attributes if necessary; these attributes are checked when
   // the ASTContext lays out the structure.
   if (TUK == TagUseKind::Definition && (!SkipBody || !SkipBody->ShouldSkip)) {
+    if (LangOpts.HLSL)
+      NewClass->addAttr(PackedAttr::CreateImplicit(Context));
     AddAlignmentAttributesForRecord(NewClass);
     AddMsStructLayoutForRecord(NewClass);
   }
@@ -8655,6 +8657,8 @@ DeclResult Sema::ActOnClassTemplateSpecialization(
   // Add alignment attributes if necessary; these attributes are checked when
   // the ASTContext lays out the structure.
   if (TUK == TagUseKind::Definition && (!SkipBody || !SkipBody->ShouldSkip)) {
+    if (LangOpts.HLSL)
+      Specialization->addAttr(PackedAttr::CreateImplicit(Context));
     AddAlignmentAttributesForRecord(Specialization);
     AddMsStructLayoutForRecord(Specialization);
   }
diff --git a/clang/test/CodeGenHLSL/ArrayAssignable.hlsl b/clang/test/CodeGenHLSL/ArrayAssignable.hlsl
index 6374f91230546..c3204570d6ef3 100644
--- a/clang/test/CodeGenHLSL/ArrayAssignable.hlsl
+++ b/clang/test/CodeGenHLSL/ArrayAssignable.hlsl
@@ -10,7 +10,7 @@ struct S {
 // CHECK: @c1 = external addrspace(2) global [2 x float], align 4
 // CHECK: @c2 = external addrspace(2) global [2 x <4 x i32>], align 16
 // CHECK: @c3 = external addrspace(2) global [2 x [2 x i32]], align 4
-// CHECK: @c4 = external addrspace(2) global [1 x target("dx.Layout", %S, 8, 0, 4)], align 4
+// CHECK: @c4 = external addrspace(2) global [1 x target("dx.Layout", %S, 8, 0, 4)], align 1
 
 cbuffer CBArrays : register(b0) {
   float c1[2];
@@ -169,8 +169,8 @@ void arr_assign10() {
 }
 
 // CHECK-LABEL: define void {{.*}}arr_assign11
-// CHECK: [[C:%.*]] = alloca [1 x %struct.S], align 4
-// CHECK: call void @llvm.memcpy.p0.p2.i32(ptr align 4 [[C]], ptr addrspace(2) align 4 @c4, i32 8, i1 false)
+// CHECK: [[C:%.*]] = alloca [1 x %struct.S], align 1
+// CHECK: call void @llvm.memcpy.p0.p2.i32(ptr align 1 [[C]], ptr addrspace(2) align 1 @c4, i32 8, i1 false)
 // CHECK-NEXT: ret void
 void arr_assign11() {
   S s = {1, 2.0};
diff --git a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
index 91a283554459d..29ea896045bb1 100644
--- a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
+++ b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
@@ -24,9 +24,9 @@ void fn2(Obj O[4]) { }
 // CHECK-LABEL: define void {{.*}}call2{{.*}}
 // CHECK: [[Arr:%.*]] = alloca [4 x %struct.Obj]
 // CHECK: [[Tmp:%.*]] = alloca [4 x %struct.Obj]
-// CHECK: call void @llvm.memset.p0.i32(ptr align 4 [[Arr]], i8 0, i32 32, i1 false)
-// CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp]], ptr align 4 [[Arr]], i32 32, i1 false)
-// CHECK: call void {{.*}}fn2{{.*}}(ptr noundef byval([4 x %struct.Obj]) align 4 [[Tmp]])
+// CHECK: call void @llvm.memset.p0.i32(ptr align 1 [[Arr]], i8 0, i32 32, i1 false)
+// CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 1 [[Tmp]], ptr align 1 [[Arr]], i32 32, i1 false)
+// CHECK: call void {{.*}}fn2{{.*}}(ptr noundef byval([4 x %struct.Obj]) align 1 [[Tmp]])
 void call2() {
   Obj Arr[4] = {{0, 0}, {0, 0}, {0, 0}, {0, 0}};
   fn2(Arr);
diff --git a/clang/test/CodeGenHLSL/BasicFeatures/AggregateSplatCast.hlsl b/clang/test/CodeGenHLSL/BasicFeatures/AggregateSplatCast.hlsl
index 42b6abec1b3d8..512fcd435191a 100644
--- a/clang/test/CodeGenHLSL/BasicFeatures/AggregateSplatCast.hlsl
+++ b/clang/test/CodeGenHLSL/BasicFeatures/AggregateSplatCast.hlsl
@@ -55,7 +55,7 @@ struct S {
 // struct splats
 // CHECK-LABEL: define void {{.*}}call3
 // CHECK: [[A:%.*]] = alloca <1 x i32>, align 4
-// CHECK: [[s:%.*]] = alloca %struct.S, align 4
+// CHECK: [[s:%.*]] = alloca %struct.S, align 1
 // CHECK-NEXT: store <1 x i32> splat (i32 1), ptr [[A]], align 4
 // CHECK-NEXT: [[L:%.*]] = load <1 x i32>, ptr [[A]], align 4
 // CHECK-NEXT: [[VL:%.*]] = extractelement <1 x i32> [[L]], i32 0
@@ -72,7 +72,7 @@ export void call3() {
 // struct splat from vector of length 1
 // CHECK-LABEL: define void {{.*}}call5
 // CHECK: [[A:%.*]] = alloca <1 x i32>, align 4
-// CHECK-NEXT: [[s:%.*]] = alloca %struct.S, align 4
+// CHECK-NEXT: [[s:%.*]] = alloca %struct.S, align 1
 // CHECK-NEXT: store <1 x i32> splat (i32 1), ptr [[A]], align 4
 // CHECK-NEXT: [[L:%.*]] = load <1 x i32>, ptr [[A]], align 4
 // CHECK-NEXT: [[VL:%.*]] = extractelement <1 x i32> [[L]], i32 0
diff --git a/clang/test/CodeGenHLSL/BasicFeatures/ArrayElementwiseCast.hlsl b/clang/test/CodeGenHLSL/BasicFeatures/ArrayElementwiseCast.hlsl
index 18f82bff3b308..ac02ddf5765ed 100644
--- a/clang/test/CodeGenHLSL/BasicFeatures/ArrayElementwiseCast.hlsl
+++ b/clang/test/CodeGenHLSL/BasicFeatures/ArrayElementwiseCast.hlsl
@@ -125,12 +125,12 @@ struct S {
 
 // flatten and truncate from a struct
 // CHECK-LABEL: define void {{.*}}call7
-// CHECK: [[s:%.*]] = alloca %struct.S, align 4
+// CHECK: [[s:%.*]] = alloca %struct.S, align 1
 // CHECK-NEXT: [[A:%.*]] = alloca [1 x i32], align 4
-// CHECK-NEXT: [[Tmp:%.*]] = alloca %struct.S, align 4
-// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[s]], ptr align 4 {{.*}}, i32 8, i1 false)
+// CHECK-NEXT: [[Tmp:%.*]] = alloca %struct.S, align 1
+// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 1 [[s]], ptr align 1 {{.*}}, i32 8, i1 false)
 // CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[A]], ptr align 4 {{.*}}, i32 4, i1 false)
-// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp]], ptr align 4 [[s]], i32 8, i1 false)
+// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 1 [[Tmp]], ptr align 1 [[s]], i32 8, i1 false)
 // CHECK-NEXT: [[G1:%.*]] = getelementptr inbounds [1 x i32], ptr [[A]], i32 0, i32 0
 // CHECK-NEXT: [[G2:%.*]] = getelementptr inbounds %struct.S, ptr [[Tmp]], i32 0, i32 0
 // CHECK-NEXT: [[G3:%.*]] = getelementptr inbounds %struct.S, ptr [[Tmp]], i32 0, i32 1
@@ -141,4 +141,3 @@ export void call7() {
   int A[1] = {1};
   A = (int[1])s;
 }
-
diff --git a/clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl b/clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl
index a0590162c7087..d04583e4fc51a 100644
--- a/clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl
+++ b/clang/test/CodeGenHLSL/BasicFeatures/InitLists.hlsl
@@ -47,9 +47,9 @@ struct SlicyBits {
 
 // Case 1: Extraneous braces get ignored in literal instantiation.
 // CHECK-LABEL: define void @_Z5case1v(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 1 [[AGG_RESULT:%.*]]) #[[ATTR0:[0-9]+]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
-// CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[AGG_RESULT]], ptr align 4 @__const._Z5case1v.TF1, i32 8, i1 false)
+// CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i32(ptr align 1 [[AGG_RESULT]], ptr align 1 @__const._Z5case1v.TF1, i32 8, i1 false)
 // CHECK-NEXT:    ret void
 //
 TwoFloats case1() {
@@ -59,9 +59,9 @@ TwoFloats case1() {
 
 // Case 2: Valid C/C++ initializer is handled appropriately.
 // CHECK-LABEL: define void @_Z5case2v(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 1 [[AGG_RESULT:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
-// CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[AGG_RESULT]], ptr align 4 @__const._Z5case2v.TF2, i32 8, i1 false)
+// CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i32(ptr align 1 [[AGG_RESULT]], ptr align 1 @__const._Z5case2v.TF2, i32 8, i1 false)
 // CHECK-NEXT:    ret void
 //
 TwoFloats case2() {
@@ -71,16 +71,16 @@ TwoFloats case2() {
 
 // Case 3: Simple initialization with conversion of an argument.
 // CHECK-LABEL: define void @_Z5case3i(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]], i32 noundef [[VAL:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 1 [[AGG_RESULT:%.*]], i32 noundef [[VAL:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    [[VAL_ADDR:%.*]] = alloca i32, align 4
 // CHECK-NEXT:    store i32 [[VAL]], ptr [[VAL_ADDR]], align 4
 // CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 0
 // CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
 // CHECK-NEXT:    [[CONV:%.*]] = sitofp i32 [[TMP0]] to float
-// CHECK-NEXT:    store float [[CONV]], ptr [[X]], align 4
+// CHECK-NEXT:    store float [[CONV]], ptr [[X]], align 1
 // CHECK-NEXT:    [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 1
-// CHECK-NEXT:    store float 2.000000e+00, ptr [[Y]], align 4
+// CHECK-NEXT:    store float 2.000000e+00, ptr [[Y]], align 1
 // CHECK-NEXT:    ret void
 //
 TwoFloats case3(int Val) {
@@ -91,7 +91,7 @@ TwoFloats case3(int Val) {
 // Case 4: Initialization from a scalarized vector into a structure with element
 // conversions.
 // CHECK-LABEL: define void @_Z5case4Dv2_i(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 4 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOFLOATS:%.*]]) align 1 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    [[TWOVALS_ADDR:%.*]] = alloca <2 x i32>, align 8
 // CHECK-NEXT:    store <2 x i32> [[TWOVALS]], ptr [[TWOVALS_ADDR]], align 8
@@ -99,12 +99,12 @@ TwoFloats case3(int Val) {
 // CHECK-NEXT:    [[TMP0:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
 // CHECK-NEXT:    [[VECEXT:%.*]] = extractelement <2 x i32> [[TMP0]], i64 0
 // CHECK-NEXT:    [[CONV:%.*]] = sitofp i32 [[VECEXT]] to float
-// CHECK-NEXT:    store float [[CONV]], ptr [[X]], align 4
+// CHECK-NEXT:    store float [[CONV]], ptr [[X]], align 1
 // CHECK-NEXT:    [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[AGG_RESULT]], i32 0, i32 1
 // CHECK-NEXT:    [[TMP1:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
 // CHECK-NEXT:    [[VECEXT1:%.*]] = extractelement <2 x i32> [[TMP1]], i64 1
 // CHECK-NEXT:    [[CONV2:%.*]] = sitofp i32 [[VECEXT1]] to float
-// CHECK-NEXT:    store float [[CONV2]], ptr [[Y]], align 4
+// CHECK-NEXT:    store float [[CONV2]], ptr [[Y]], align 1
 // CHECK-NEXT:    ret void
 //
 TwoFloats case4(int2 TwoVals) {
@@ -114,18 +114,18 @@ TwoFloats case4(int2 TwoVals) {
 
 // Case 5: Initialization from a scalarized vector of matching type.
 // CHECK-LABEL: define void @_Z5case5Dv2_i(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 4 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 1 [[AGG_RESULT:%.*]], <2 x i32> noundef [[TWOVALS:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    [[TWOVALS_ADDR:%.*]] = alloca <2 x i32>, align 8
 // CHECK-NEXT:    store <2 x i32> [[TWOVALS]], ptr [[TWOVALS_ADDR]], align 8
 // CHECK-NEXT:    [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 0
 // CHECK-NEXT:    [[TMP0:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
 // CHECK-NEXT:    [[VECEXT:%.*]] = extractelement <2 x i32> [[TMP0]], i64 0
-// CHECK-NEXT:    store i32 [[VECEXT]], ptr [[Z]], align 4
+// CHECK-NEXT:    store i32 [[VECEXT]], ptr [[Z]], align 1
 // CHECK-NEXT:    [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 1
 // CHECK-NEXT:    [[TMP1:%.*]] = load <2 x i32>, ptr [[TWOVALS_ADDR]], align 8
 // CHECK-NEXT:    [[VECEXT1:%.*]] = extractelement <2 x i32> [[TMP1]], i64 1
-// CHECK-NEXT:    store i32 [[VECEXT1]], ptr [[W]], align 4
+// CHECK-NEXT:    store i32 [[VECEXT1]], ptr [[W]], align 1
 // CHECK-NEXT:    ret void
 //
 TwoInts case5(int2 TwoVals) {
@@ -136,18 +136,18 @@ TwoInts case5(int2 TwoVals) {
 // Case 6: Initialization from a scalarized structure of different type with
 // different element types.
 // CHECK-LABEL: define void @_Z5case69TwoFloats(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 4 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 4 [[TF4:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_TWOINTS:%.*]]) align 1 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 1 [[TF4:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 0
 // CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF4]], i32 0, i32 0
-// CHECK-NEXT:    [[TMP0:%.*]] = load float, ptr [[X]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load float, ptr [[X]], align 1
 // CHECK-NEXT:    [[CONV:%.*]] = fptosi float [[TMP0]] to i32
-// CHECK-NEXT:    store i32 [[CONV]], ptr [[Z]], align 4
+// CHECK-NEXT:    store i32 [[CONV]], ptr [[Z]], align 1
 // CHECK-NEXT:    [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[AGG_RESULT]], i32 0, i32 1
 // CHECK-NEXT:    [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF4]], i32 0, i32 1
-// CHECK-NEXT:    [[TMP1:%.*]] = load float, ptr [[Y]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = load float, ptr [[Y]], align 1
 // CHECK-NEXT:    [[CONV1:%.*]] = fptosi float [[TMP1]] to i32
-// CHECK-NEXT:    store i32 [[CONV1]], ptr [[W]], align 4
+// CHECK-NEXT:    store i32 [[CONV1]], ptr [[W]], align 1
 // CHECK-NEXT:    ret void
 //
 TwoInts case6(TwoFloats TF4) {
@@ -158,59 +158,59 @@ TwoInts case6(TwoFloats TF4) {
 // Case 7: Initialization of a complex structure, with bogus braces and element
 // conversions from a collection of scalar values, and structures.
 // CHECK-LABEL: define void @_Z5case77TwoIntsS_i9TwoFloatsS0_S0_S0_(
-// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_DOGGO:%.*]]) align 16 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOINTS:%.*]]) align 4 [[TI1:%.*]], ptr noundef byval([[STRUCT_TWOINTS]]) align 4 [[TI2:%.*]], i32 noundef [[VAL:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 4 [[TF1:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF2:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF3:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 4 [[TF4:%.*]]) #[[ATTR0]] {
+// CHECK-SAME: ptr dead_on_unwind noalias writable sret([[STRUCT_DOGGO:%.*]]) align 1 [[AGG_RESULT:%.*]], ptr noundef byval([[STRUCT_TWOINTS:%.*]]) align 1 [[TI1:%.*]], ptr noundef byval([[STRUCT_TWOINTS]]) align 1 [[TI2:%.*]], i32 noundef [[VAL:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS:%.*]]) align 1 [[TF1:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 1 [[TF2:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 1 [[TF3:%.*]], ptr noundef byval([[STRUCT_TWOFLOATS]]) align 1 [[TF4:%.*]]) #[[ATTR0]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    [[VAL_ADDR:%.*]] = alloca i32, align 4
 // CHECK-NEXT:    store i32 [[VAL]], ptr [[VAL_ADDR]], align 4
 // CHECK-NEXT:    [[LEGSTATE:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 0
 // CHECK-NEXT:    [[Z:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI1]], i32 0, i32 0
-// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[Z]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[Z]], align 1
 // CHECK-NEXT:    [[VECINIT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i32 0
 // CHECK-NEXT:    [[W:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI1]], i32 0, i32 1
-// CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[W]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[W]], align 1
 // CHECK-NEXT:    [[VECINIT1:%.*]] = insertelement <4 x i32> [[VECINIT]], i32 [[TMP1]], i32 1
 // CHECK-NEXT:    [[Z2:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI2]], i32 0, i32 0
-// CHECK-NEXT:    [[TMP2:%.*]] = load i32, ptr [[Z2]], align 4
+// CHECK-NEXT:    [[TMP2:%.*]] = load i32, ptr [[Z2]], align 1
 // CHECK-NEXT:    [[VECINIT3:%.*]] = insertelement <4 x i32> [[VECINIT1]], i32 [[TMP2]], i32 2
 // CHECK-NEXT:    [[W4:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOINTS]], ptr [[TI2]], i32 0, i32 1
-// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[W4]], align 4
+// CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[W4]], align 1
 // CHECK-NEXT:    [[VECINIT5:%.*]] = insertelement <4 x i32> [[VECINIT3]], i32 [[TMP3]], i32 3
-// CHECK-NEXT:    store <4 x i32> [[VECINIT5]], ptr [[LEGSTATE]], align 16
+// CHECK-NEXT:    store <4 x i32> [[VECINIT5]], ptr [[LEGSTATE]], align 1
 // CHECK-NEXT:    [[TAILSTATE:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 1
 // CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
-// CHECK-NEXT:    store i32 [[TMP4]], ptr [[TAILSTATE]], align 16
+// CHECK-NEXT:    store i32 [[TMP4]], ptr [[TAILSTATE]], align 1
 // CHECK-NEXT:    [[HAIRCOUNT:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 2
 // CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[VAL_ADDR]], align 4
 // CHECK-NEXT:    [[CONV:%.*]] = sitofp i32 [[TMP5]] to float
-// CHECK-NEXT:    store float [[CONV]], ptr [[HAIRCOUNT]], align 4
+// CHECK-NEXT:    store float [[CONV]], ptr [[HAIRCOUNT]], align 1
 // CHECK-NEXT:    [[EARDIRECTION:%.*]] = getelementptr inbounds nuw [[STRUCT_DOGGO]], ptr [[AGG_RESULT]], i32 0, i32 3
 // CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF1]], i32 0, i32 0
-// CHECK-NEXT:    [[TMP6:%.*]] = load float, ptr [[X]], align 4
+// CHECK-NEXT:    [[TMP6:%.*]] = load float, ptr [[X]], align 1
 // CHECK-NEXT:    [[VECINIT6:%.*]] = insertelement <4 x float> poison, float [[TMP6]], i32 0
 // CHECK-NEXT:    [[Y:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF1]], i32 0, i32 1
-// CHECK-NEXT:    [[TMP7:%.*]] = load float, ptr [[Y]], align 4
+// CHECK-NEXT:    [[TMP7:%.*]] = load float, ptr [[Y]], align 1
 // CHECK-NEXT:    [[VECINIT7:%.*]] = insertelement <4 x float> [[VECINIT6]], float [[TMP7]], i32 1
 // CHECK-NEXT:    [[X8:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF2]], i32 0, i32 0
-// CHECK-NEXT:    [[TMP8:%.*]] = load float, ptr [[X8]], align 4
+// CHECK-NEXT:    [[TMP8:%.*]] = load float, ptr [[X8]], align 1
 // CHECK-NEXT:    [[VECINIT9:%.*]] = insertelement <4 x float> [[VECINIT7]], float [[TMP8]], i32 2
 // CHECK-NEXT:    [[Y10:%.*]] = getelementptr inbounds nuw [[STRUCT_TWOFLOATS]], ptr [[TF2]], i32 0, i32 1
-// CHECK-NEXT:    [[TMP9:%.*]] = load float, ptr [[Y10]], align 4
+// CHECK-NEXT:    [[TMP9:%.*]] = load float, ptr [[Y10]], align 1
 // CHECK-NEXT:    [[VECINIT11:%.*]] = insertelement <4 x float> [[VECINIT9]], float [[TMP9]], i32 3
-// CHECK-NEXT:    store <4 x float> [[VECINIT11]], ptr [[EARDIRECTION]], align 16
+// CHECK-NEXT:    store <4 x float> [[VECINIT11]], pt...
[truncated]

Copy link
Contributor

@damyanp damyanp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there offload test suite tests to go with this?

@bogner
Copy link
Contributor Author

bogner commented Apr 25, 2025

Are there offload test suite tests to go with this?

llvm-beanz/offload-test-suite#83

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category HLSL HLSL Language Support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[HLSL] Treat structures as packed
3 participants