Skip to content

Conversation

HertzDevil
Copy link
Contributor

@HertzDevil HertzDevil commented Jan 30, 2024

Fixes the codegen issue in #14277 (comment).

If alignof(Int128) == 16 (due to #13609 this is not true for x86-64 targets before LLVM 18), An Int32 | Int128 will now place its type ID and data at offsets 0 and 16 respectively, including on AArch64, where alignof(Int128) == 16 has always been the case. This also means that if, say, a 64-byte-aligned value becomes available in Crystal (e.g. for AVX-512), then a union of that value with anything else requires at least 128 bytes of storage, with 60 bytes of padding after the type ID.

This is an ABI breaking change. It might be worth mentioning this even though Crystal makes no guarantees about ABI stability and most users won't be affected.

This doesn't cover the interpreter yet.

@HertzDevil HertzDevil added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:codegen labels Jan 30, 2024
@HertzDevil HertzDevil marked this pull request as draft January 30, 2024 20:23
@HertzDevil
Copy link
Contributor Author

HertzDevil commented Jan 30, 2024

This patch is currently breaking upcasts like (1 || 1_i64).as(Int32 | Int64 | Int128) and downcasts like (1 || 1_i64 || 1_i128).as(Int32 | Int64), which are used in both the DWARF reader and the Crystal::System.print_error specs. The following is how a union-to-union upcast used to be done, using Crystal::CodeGenVisitor#assign_distinct_union_types:

x = 1 || 1_i64
y = x.as(Int32 | Int64 | Int128)
%x = alloca %"(Int32 | Int64)", align 8
%y = alloca %"(Int128 | Int32 | Int64)", align 8
%1 = alloca %"(Int128 | Int32 | Int64)", align 8
; ...
%5 = load %"(Int32 | Int64)", ptr %x, align 8
store %"(Int32 | Int64)" %5, ptr %1, align 8
%6 = load %"(Int128 | Int32 | Int64)", ptr %1, align 8
store %"(Int128 | Int32 | Int64)" %6, ptr %y, align 8
%7 = load %"(Int128 | Int32 | Int64)", ptr %1, align 8

In short, x and y had identical leading binary representations, and y = pointerof(x).as((Int32 | Int64 | Int128)*).value would have equally worked, since the 8 trailing bytes in y's representation have no effect. (#unsafe_as will not work since it triggers a dynamic dispatch.) Now that y's value offset is shifted, merely changing the alignments of the larger unions isn't going to work anymore.

Downcasts fail for a similar reason. It seems sidecasts such as (1 || 1_i64).as(Int32 | Int128) aren't affected.

@HertzDevil HertzDevil marked this pull request as ready for review January 30, 2024 21:33
@HertzDevil HertzDevil added the breaking-change May have breaking effect in some edge cases. Mostly negligible, but still noteworthy. label Jan 30, 2024
Copy link
Member

@straight-shoota straight-shoota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't dive into the details but as long as it works... looks good 👍
CI is failing on aarc64 though.

@HertzDevil HertzDevil marked this pull request as draft January 31, 2024 17:54
@HertzDevil
Copy link
Contributor Author

HertzDevil commented Feb 1, 2024

Given:

a = uninitialized Int32 | Int64 | Int128
b = 1.as(Int32 | Int64)
a = b

Master produces:

%4 = load %"(Int32 | Int64)", ptr %b, align 8
store %"(Int32 | Int64)" %4, ptr %a, align 8

This PR currently produces:

%4 = getelementptr inbounds %"(Int32 | Int64)", ptr %b, i32 0, i32 0
%5 = load i32, ptr %4, align 4
%6 = getelementptr inbounds %"(Int32 | Int64)", ptr %b, i32 0, i32 1
%7 = getelementptr inbounds %"(Int128 | Int32 | Int64)", ptr %a, i32 0, i32 0
store i32 %5, ptr %7, align 4
%8 = getelementptr inbounds %"(Int128 | Int32 | Int64)", ptr %a, i32 0, i32 1
%9 = getelementptr inbounds %"(Int32 | Int64)", ptr %b, i32 0, i32 1
%10 = load [1 x i64], ptr %9, align 8
store [1 x i64] %10, ptr %8, align 8

That last store doesn't look right; it probably needs to be align 16, but then there is no easy way to cast the [1 x i64] into an appropriate target type. Instead I am now trying the below:

; ...
call void @llvm.memcpy.p0.p0.i64(ptr align 16 %8, ptr align 8 %9, i64 8, i1 false)

@HertzDevil HertzDevil marked this pull request as ready for review February 1, 2024 13:04
@straight-shoota straight-shoota added this to the 1.12.0 milestone Feb 1, 2024
@straight-shoota straight-shoota merged commit b655c0f into crystal-lang:master Feb 2, 2024
@HertzDevil HertzDevil deleted the bug/overaligned-union branch February 2, 2024 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change May have breaking effect in some edge cases. Mostly negligible, but still noteworthy. kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:codegen

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants