Skip to content

mul nuw+lshr exact should fold to a single multiplication (when the latter is a factor) #54824

Closed
@scottmcm

Description

@scottmcm

Alive for the specific example, with opt+llc: https://alive2.llvm.org/ce/z/7oofsh
+label: llvm:optimizations

I tried the following:

define i64 @src(i64 noundef %0) {
start:
  %1 = mul nuw nsw i64 %0, 52
  %2 = lshr exact i64 %1, 2
  ret i64 %2
}

Since 52 = 4 * 13 I expected to see that fold down to just a single mul:

define i64 @tgt(i64 noundef %0) {
start:
  %1 = mul nuw nsw i64 %0, 13
  ret i64 %1
}

But it doesn't -- the mul and lshr are left in the optimized IR, and aren't folded by the target either:

src:                                    # @src
        imul    rax, rdi, 52
        shr     rax, 2
        ret

Whereas the single-multiplication version ends up not even needing an imul:

tgt:                                    # @tgt
        lea     rax, [rdi + 2*rdi]
        lea     rax, [rdi + 4*rax]
        ret

So in general, %1 = mul nuw %0, CONST1; %2 = lshr exact %1, CONST2 should simplify to a single mul nuw whenever CONST1 is a multiple of 1 << CONST2.

In fact, it looks like this already happens in opt if written with div: https://alive2.llvm.org/ce/z/L-VouQ

  %1 = mul nuw i64 %0, 52
  %2 = udiv exact i64 %1, 4
=>
  %1 = mul nuw nsw i64 %0, 13

So that code path just needs to take into account places where the udiv -> lshr transformation had already happened.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions