InstCombine of is.fpclass pessimizes codegen for comparison with fast math flags #104597

Open
@jayfoad

Test case:

define i1 @f(float %x) {
  %y = fmul nnan float %x, 3.0
  %i = bitcast float %y to i32
  %c = icmp ne i32 %i, 0
  ret i1 %c
}

AMD64 codegen:

$ llc -mtriple=amd64 < fpc.ll
...
	mulss	.LCPI0_0(%rip), %xmm0
	movd	%xmm0, %eax
	testl	%eax, %eax
	setne	%al

AMD64 codegen after InstCombine:

$ opt -S -passes=instcombine < fpc.ll | llc -mtriple=amd64
...
	mulss	.LCPI0_0(%rip), %xmm0
	movd	%xmm0, %eax
	movl	%eax, %ecx
	negl	%ecx
	seto	%cl
	andl	$2147483647, %eax               # imm = 0x7FFFFFFF
	cmpl	$2139095040, %eax               # imm = 0x7F800000
	sete	%dl
	orb	%cl, %dl
	leal	-1(%rax), %ecx
	cmpl	$8388607, %ecx                  # imm = 0x7FFFFF
	setb	%cl
	orb	%dl, %cl
	addl	$-8388608, %eax                 # imm = 0xFF800000
	cmpl	$2130706432, %eax               # imm = 0x7F000000
	setb	%al
	orb	%cl, %al
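
For readers who do not want to decode the assembly by hand, here is a rough C++ transliteration of the two sequences, operating on the raw bits of %y. The function names are mine and purely illustrative; the mapping from instructions to class tests in the comments is my reading of the output above, not something taken from the backend source.

```cpp
#include <cstdint>

// Hypothetical helper: what the original "icmp ne i32 %i, 0" computes.
constexpr bool simpleTest(std::uint32_t bits) {
    return bits != 0;
}

// Hypothetical helper: what the post-InstCombine sequence appears to compute.
constexpr bool expandedTest(std::uint32_t bits) {
    std::uint32_t abs = bits & 0x7FFFFFFFu;                      // andl $0x7FFFFFFF
    bool negZero   = bits == 0x80000000u;                        // negl + seto (overflows only for INT_MIN)
    bool inf       = abs == 0x7F800000u;                         // cmpl $0x7F800000 + sete
    bool subnormal = (abs - 1u) < 0x007FFFFFu;                   // leal -1 + cmpl $0x7FFFFF + setb
    bool normal    = (abs - 0x00800000u) < 0x7F000000u;          // addl $0xFF800000 + cmpl $0x7F000000 + setb
    return negZero || inf || subnormal || normal;
}
```

The two functions agree on every bit pattern that is not a nan, and differ only on nan inputs, where expandedTest returns false and simpleTest returns true. Since %y is known not to be nan, the single compare would be sufficient.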

Here's what happens:

  1. InstCombine recognizes the icmp-of-bitcast and turns it into llvm.is.fpclass(%y, 0x3bf)
  2. InstCombine recognizes that %y cannot be nan and turns this into llvm.is.fpclass(%y, 0x3bc) (see the code under the comment "// Clear test bits we know must be false from the source value."). This seems dubious to me: it could just as well set these bits, or leave them alone. I guess clearing them works as a kind of canonicalization, so that we could CSE two calls to llvm.is.fpclass that differ only in the "nan" bits if the argument is known not to be nan. Really it would be nice if we could set these bits to a "don't care" value.
  3. CodeGen does not recognize this as a special case and has to emit a long sequence of comparisons. This could be improved if the "not nan" information is still available at this point.
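
To make the mask arithmetic in steps 1 and 2 concrete, here is a small self-contained C++ sketch. The enumerator and function names are hypothetical (they are not LLVM's own identifiers); the bit layout follows the llvm.is.fpclass description in the LangRef, and the quiet-nan bit position assumes the usual IEEE binary32 encoding as used on x86.

```cpp
#include <cstdint>

// Class bits of the llvm.is.fpclass test mask (names are illustrative only).
enum : std::uint32_t {
    SNan = 1u << 0, QNan = 1u << 1, NegInf = 1u << 2, NegNormal = 1u << 3,
    NegSubnormal = 1u << 4, NegZero = 1u << 5, PosZero = 1u << 6,
    PosSubnormal = 1u << 7, PosNormal = 1u << 8, PosInf = 1u << 9,
    AllClasses = 0x3FFu
};

constexpr std::uint32_t kNan        = SNan | QNan;
constexpr std::uint32_t kNotPosZero = AllClasses & ~PosZero;   // step 1
constexpr std::uint32_t kNanCleared = kNotPosZero & ~kNan;     // step 2

static_assert(kNotPosZero == 0x3BFu, "mask produced by step 1");
static_assert(kNanCleared == 0x3BCu, "mask produced by step 2");

// Classify a binary32 bit pattern into exactly one class bit.
constexpr std::uint32_t classify(std::uint32_t bits) {
    bool neg = (bits >> 31) != 0;
    std::uint32_t abs = bits & 0x7FFFFFFFu;
    if (abs > 0x7F800000u)  return (abs & 0x00400000u) ? QNan : SNan;
    if (abs == 0x7F800000u) return neg ? NegInf : PosInf;
    if (abs >= 0x00800000u) return neg ? NegNormal : PosNormal;
    if (abs != 0)           return neg ? NegSubnormal : PosSubnormal;
    return neg ? NegZero : PosZero;
}

// Model of llvm.is.fpclass on raw bits: true if the value's class is in the mask.
constexpr bool isFPClass(std::uint32_t bits, std::uint32_t mask) {
    return (classify(bits) & mask) != 0;
}
```

For any non-nan input, isFPClass(bits, kNotPosZero) and isFPClass(bits, kNanCleared) give the same answer, so both masks are valid for %y. The difference is that 0x3bf is the complement of a single class (positive zero), which maps back to one integer compare, whereas 0x3bc names eight separate classes and no longer looks like "not +0.0" unless the no-nan fact is still known.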
