-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[X86] LLVM >= 18 folding bitcast -> and -> bitcast to @llvm.fabs.f16
generates call to __extendhfsf2
#104915
Comments
Forming fabs is correct, the codegen is not. Promotion should bitcast to integer |
@llvm.fabs.f16
generates call to __extendhfsf2
@llvm.fabs.f16
generates call to __extendhfsf2
@llvm/issue-subscribers-backend-x86 Author: OverMighty (overmighty)
https://godbolt.org/z/on6Pj6Tsc
Starting from 5c0da58, InstCombine transforms this IR: define linkonce_odr hidden noundef half @<!-- -->_Float16 __llvm_libc_20_0_0_git::fputil::abs<_Float16, 0>(_Float16)(half noundef %x) local_unnamed_addr #<!-- -->1 comdat {
entry:
%0 = bitcast half %x to i16
%1 = and i16 %0, 32767
%2 = bitcast i16 %1 to half
ret half %2
} into: define linkonce_odr hidden noundef half @<!-- -->_Float16 __llvm_libc_20_0_0_git::fputil::abs<_Float16, 0>(_Float16)(half noundef %x) local_unnamed_addr #<!-- -->1 comdat {
entry:
%0 = call half @<!-- -->llvm.fabs.f16(half %x)
ret half %0
} On x86, when AVX-512 FP16 is not available, this generates something like: .LCPI0_0:
.long 0x7fffffff
__llvm_libc_20_0_0_git::fabsf16(_Float16):
push rbp
mov rbp, rsp
call __extendhfsf2@<!-- -->PLT
vbroadcastss xmm1, dword ptr [rip + .LCPI0_0]
vandps xmm0, xmm0, xmm1
call __truncsfhf2@<!-- -->PLT
pop rbp
ret whereas with LLVM 16, __llvm_libc_20_0_0_git::fabsf16(_Float16):
push rbp
mov rbp, rsp
vpextrw eax, xmm0, 0
and eax, 32767
vpinsrw xmm0, xmm0, eax, 0
pop rbp
ret Related:
cc @arsenm @lntue |
The same bug exists for fneg and copysign, and this is repeated in the type legalizer and promotion action |
Attaching partial patch. It's incomplete because it asserts on targets with legal f16, but no legal i16. I don't have the patience to fight with SelectionDAG's insistence that there can be no illegal types after type legalization |
Conditionally to the same sized integer type being legal. For example, for f16, if i16 is legal, bitcast to i16, clear the sign bit and bitcast back to f16. Adapted from a suggested patch from @arsenm (Matthew.Arsenault@amd.com): llvm#104915 (comment)
This isn't x86 specific; it affects |
https://godbolt.org/z/on6Pj6Tsc
Starting from 5c0da58, InstCombine transforms this IR:
into:
On x86, when AVX-512 FP16 is not available, this generates something like:
whereas with LLVM 16,
bitcast
->and
->bitcast
generates something like:Related:
__extendhfsf2
for bitcast -> and -> bitcast #104914cc @arsenm @lntue
The text was updated successfully, but these errors were encountered: