
Conversation

@gbaraldi
Member

@gbaraldi gbaraldi commented Jan 13, 2023

The regression in #48256 (comment) is only partially fixed by #48256. The first PR does enable the use of SIMD instructions via memmove, but it no longer unrolls the loop as it did before, for two reasons:

  • First, we run LoopUnroll quite early in the pipeline, so it sees more complicated code.
  • Second, LLVM treats our loops as having runtime trip counts and does not unroll them unless we specifically allow runtime unrolling.

This PR adds another LoopUnroll pass, with runtime unrolling enabled, after SLP vectorization, similar to what clang does.

It also enables VectorCombine, which does some nice things and has been a TODO for a couple of years.
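
For reference, here is a rough sketch of what the change could look like in an LLVM new-pass-manager function pipeline. This is not the exact diff from this PR; the helper name, the surrounding passes, and the option values are assumptions for illustration.

```cpp
#include "llvm/IR/PassManager.h"
#include "llvm/Transforms/Scalar/LoopUnrollPass.h"
#include "llvm/Transforms/Vectorize/SLPVectorizer.h"
#include "llvm/Transforms/Vectorize/VectorCombine.h"

using namespace llvm;

// Sketch (not the actual Julia pipeline code): late vectorization section
// of a function pass pipeline.
static void addLateVectorizationPasses(FunctionPassManager &FPM, int OptLevel) {
    // Clean up and combine vector code ahead of SLP vectorization.
    FPM.addPass(VectorCombinePass());
    FPM.addPass(SLPVectorizerPass());
    // A second LoopUnroll run after SLP vectorize, this time with runtime
    // unrolling explicitly allowed (the key difference from the earlier run).
    FPM.addPass(LoopUnrollPass(LoopUnrollOptions(OptLevel).setRuntime(true)));
}
```

The legacy pass manager has equivalent entry points (createVectorCombinePass() and createLoopUnrollPass(), the latter taking a Runtime argument), so the same idea applies to both pipelines.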

We may also want to wait until we get LLVM 15, because a second unrolling pass might come with a compile-time hit that was fixed in 15: https://discourse.llvm.org/t/loop-unrolling-in-large-functions-and-compile-time/67164

@gbaraldi gbaraldi requested review from vchuravy and vtjnash January 13, 2023 20:22
@gbaraldi
Member Author

It would be nice if loop vectorize could unroll these loops for us, but it seems it doesn't like doing it for things like memmove/memcpy.

@gbaraldi
Member Author

@nanosoldier runbenchmarks(ALL, vs=":master")

@pchintalapudi
Member

Could you update the newpm pipeline as well?

@gbaraldi
Member Author

gbaraldi commented Jan 13, 2023

I just want to do a bit more testing to see whether we aren't unrolling too much. On my example it does an 8x unroll where clang does 4x, so we might want to limit the pass a bit.

@pchintalapudi
Member

We'll see what happens after nanosoldier returns, but I do recall that runtime unrolling on the newpm pipeline resulted in bad things happening on nanosoldier because it doesn't have AVX, IIRC (some 128-element vector appeared in the union benchmark, I think).

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@gbaraldi
Member Author

I tried a couple of the tests locally, and the regressions and improvements seem to be mostly real. So I guess it's kind of a tradeoff. I kinda wish we could limit how much this pass can unroll, basically making it either do a 4x unroll or not unroll at all.

@pchintalapudi
Member

Can we see the effect of just VectorCombine? That alone might give the speedups we want.

@pchintalapudi
Member

Also, loop-unroll does have tuning options via command line flags or constructor arguments.
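
For example (an illustrative sketch, not the configuration this PR ships; the function name and the specific option values are assumptions), LoopUnrollOptions can be used to keep runtime unrolling while turning off the other unrolling modes:

```cpp
#include "llvm/Transforms/Scalar/LoopUnrollPass.h"

using namespace llvm;

// Sketch of a more conservative configuration for the second unroll run.
// The particular combination of options here is illustrative only.
static LoopUnrollOptions conservativeUnrollOpts(int OptLevel) {
    return LoopUnrollOptions(OptLevel)
        .setRuntime(true)      // still allow loops with runtime trip counts
        .setPartial(false)     // but no partial unrolling
        .setUpperBound(false)  // no upper-bound unrolling
        .setPeeling(false);    // and no loop peeling
}
```

On the flag side, the unroller also has cl::opt knobs such as -unroll-runtime and -unroll-max-count that are handy for experimenting with a cap like 4x before hard-coding anything.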

@gbaraldi
Member Author

I did check, and some of the speedups, for example in SIMD, are due to the unrolling; the regressions in SIMD are also due to it.
Yeah, I was playing with the settings.

@gbaraldi gbaraldi closed this Jul 10, 2023
