try_to_fold_vector_reduce<Call> has incorrect behavior #6883

Closed
@rootjalex

Description

In CodeGen_LLVM, we call try_to_fold_vector_reduce<Call> on saturating_add and saturating_sub calls without telling it whether the accumulation is an addition or a subtraction:

https://github.com/halide/Halide/blob/11a049c3967a277173e288ffd802f08ce1a1b78e/src/CodeGen_LLVM.cpp#L2835-#L2839

This seems incorrect. I noticed it while restructuring CodeGen_X86 into separate optimization and code generation passes, because it appears that the accumulating saturating dot-product instructions would trigger on both of these patterns:

saturating_sub(wild_i32x, VectorReduce(SaturatingAdd, factor=4, widening_mul(wild_i16x, wild_i16x)))
saturating_add(wild_i32x, VectorReduce(SaturatingAdd, factor=4, widening_mul(wild_i16x, wild_i16x)))
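To make the concern concrete, here is a minimal scalar sketch of the two patterns for a single output lane. It is not Halide code: `sat_add_i32`, `sat_sub_i32`, `sat_dot4`, `pattern_add`, and `pattern_sub` are hypothetical helper names introduced only for illustration, and the model assumes int32 accumulators with int16 inputs as in the patterns above. It shows that the two expressions generally produce different results, so lowering both to the same accumulate-by-addition instruction would presumably be wrong for the saturating_sub case.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical scalar model of saturating_add on int32:
// compute in 64 bits, then clamp to the int32 range.
inline int32_t sat_add_i32(int32_t a, int32_t b) {
    const int64_t lo = std::numeric_limits<int32_t>::min();
    const int64_t hi = std::numeric_limits<int32_t>::max();
    const int64_t r = static_cast<int64_t>(a) + static_cast<int64_t>(b);
    return static_cast<int32_t>(std::max(lo, std::min(hi, r)));
}

// Hypothetical scalar model of saturating_sub on int32.
inline int32_t sat_sub_i32(int32_t a, int32_t b) {
    const int64_t lo = std::numeric_limits<int32_t>::min();
    const int64_t hi = std::numeric_limits<int32_t>::max();
    const int64_t r = static_cast<int64_t>(a) - static_cast<int64_t>(b);
    return static_cast<int32_t>(std::max(lo, std::min(hi, r)));
}

// One output lane of
//   VectorReduce(SaturatingAdd, factor=4, widening_mul(x, y))
// with int16 inputs widened to int32 before multiplying.
inline int32_t sat_dot4(const int16_t (&x)[4], const int16_t (&y)[4]) {
    int32_t acc = 0;
    for (int i = 0; i < 4; i++) {
        // An int16 * int16 product always fits in int32.
        const int32_t prod = static_cast<int32_t>(x[i]) * static_cast<int32_t>(y[i]);
        acc = sat_add_i32(acc, prod);
    }
    return acc;
}

// saturating_add(acc, VectorReduce(...)): accumulate by addition.
inline int32_t pattern_add(int32_t acc, const int16_t (&x)[4], const int16_t (&y)[4]) {
    return sat_add_i32(acc, sat_dot4(x, y));
}

// saturating_sub(acc, VectorReduce(...)): accumulate by subtraction.
inline int32_t pattern_sub(int32_t acc, const int16_t (&x)[4], const int16_t (&y)[4]) {
    return sat_sub_i32(acc, sat_dot4(x, y));
}
```

With x = {1, 2, 3, 4}, y = {1, 1, 1, 1}, and an accumulator of 100, the reduction is 10, so `pattern_add` yields 110 while `pattern_sub` yields 90. A fold that ignores which call it started from cannot return both.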
