
Commit 0064565

dtcxzyw authored and tstellar committed
[DAGCombiner] Don't ignore N2's undef elements in foldVSelectOfConstants (llvm#129272)
Since N2 will be reused in the fold, we cannot skip N2's undef elements if the corresponding element in N1 is well-defined.

For example:
```
t2: v4i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
t24: v4i32 = BUILD_VECTOR undef:i32, undef:i32, Constant:i32<1>, undef:i32
t11: v4i32 = vselect t8, t2, t10
```
Before this patch, we fold t11 into:
```
t26: v4i32 = sign_extend t8
t27: v4i32 = add t26, t24
```
The last element of t27 is incorrect.

Closes llvm#129181.

(cherry picked from commit 2709366)
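To make the miscompile concrete, here is a minimal lane-by-lane sketch in plain C++ (no LLVM headers). The `Lane` type and the helpers `selectLanes`, `foldedAddSext`, and `lastLaneMiscompiled` are hypothetical names invented for this illustration, and the model assumes that adding anything to an undef lane yields undef. It compares the original `vselect` against the folded `add (sign_extend Cond), N2` for the vectors in the example above.

```cpp
#include <array>
#include <cstddef>

// One vector lane: either a known constant or undef (illustrative model only).
struct Lane {
  bool IsUndef;
  int Value;
};

constexpr Lane undef() { return {true, 0}; }
constexpr Lane cst(int V) { return {false, V}; }

using Vec4 = std::array<Lane, 4>;
using Mask4 = std::array<bool, 4>;

// vselect Cond, N1, N2: take N1's lane where the condition is true.
constexpr Vec4 selectLanes(const Mask4 &Cond, const Vec4 &N1, const Vec4 &N2) {
  Vec4 R{};
  for (std::size_t I = 0; I != 4; ++I)
    R[I] = Cond[I] ? N1[I] : N2[I];
  return R;
}

// add (sign_extend Cond), N2: a true lane contributes -1, a false lane 0.
// Assumption of this model: an undef N2 lane makes the sum undef.
constexpr Vec4 foldedAddSext(const Mask4 &Cond, const Vec4 &N2) {
  Vec4 R{};
  for (std::size_t I = 0; I != 4; ++I)
    R[I] = N2[I].IsUndef ? undef() : cst(N2[I].Value + (Cond[I] ? -1 : 0));
  return R;
}

// Returns true when the last lane is well-defined before the fold but undef
// after it, using the vectors from the commit message and an all-true mask.
constexpr bool lastLaneMiscompiled() {
  const Vec4 N1{{cst(0), cst(0), cst(0), cst(0)}};
  const Vec4 N2{{undef(), undef(), cst(1), undef()}};
  const Mask4 Cond{{true, true, true, true}};
  const Vec4 Before = selectLanes(Cond, N1, N2);
  const Vec4 After = foldedAddSext(Cond, N2);
  return !Before[3].IsUndef && Before[3].Value == 0 && After[3].IsUndef;
}
```

In this model, lane 2 is the only lane where both operands are defined, and the two forms agree there (`1 + -1 == 0`). The other lanes lose their defined zero because the fold reads only N2.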
1 parent 54c90e4 commit 0064565


2 files changed, 21 additions and 2 deletions


llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Lines changed: 3 additions & 2 deletions
@@ -12547,9 +12547,10 @@ SDValue DAGCombiner::foldVSelectOfConstants(SDNode *N) {
     for (unsigned i = 0; i != Elts; ++i) {
       SDValue N1Elt = N1.getOperand(i);
       SDValue N2Elt = N2.getOperand(i);
-      if (N1Elt.isUndef() || N2Elt.isUndef())
+      if (N1Elt.isUndef())
         continue;
-      if (N1Elt.getValueType() != N2Elt.getValueType()) {
+      // N2 should not contain undef values since it will be reused in the fold.
+      if (N2Elt.isUndef() || N1Elt.getValueType() != N2Elt.getValueType()) {
         AllAddOne = false;
         AllSubOne = false;
         break;
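The effect of the changed condition can be sketched outside LLVM as follows. This is an illustration under stated assumptions, not the DAGCombiner code: `Elt`, `FoldKind`, and `classify` are hypothetical names, the value-type check is omitted because the model has a single integer type, and the comparison of each N1 element against the N2 element plus or minus one is assumed, since that part of the loop is not visible in the hunk.

```cpp
#include <array>
#include <cstddef>

// A constant vector element: known value or undef (illustrative model only).
struct Elt {
  bool IsUndef;
  int Value;
};
using Vec4 = std::array<Elt, 4>;

struct FoldKind {
  bool AllAddOne; // every checked lane has N1 == N2 + 1
  bool AllSubOne; // every checked lane has N1 == N2 - 1
};

// SkipUndefN2 == true mimics the pre-patch check, false mimics the patched one.
constexpr FoldKind classify(const Vec4 &N1, const Vec4 &N2, bool SkipUndefN2) {
  FoldKind K{true, true};
  for (std::size_t I = 0; I != 4; ++I) {
    if (N1[I].IsUndef)
      continue; // an undef N1 lane may take any value, so it is still skipped
    if (N2[I].IsUndef) {
      if (SkipUndefN2)
        continue;         // old behaviour: the lane was ignored
      K = {false, false}; // new behaviour: the fold is rejected
      break;
    }
    // Assumed remainder of the loop (not shown in the hunk above).
    if (N1[I].Value != N2[I].Value + 1)
      K.AllAddOne = false;
    if (N1[I].Value != N2[I].Value - 1)
      K.AllSubOne = false;
  }
  return K;
}
```

With N1 all zeros and N2 equal to `<undef, undef, 1, undef>`, the pre-patch mode leaves `AllSubOne` set because only lane 2 is inspected, while the patched mode clears both flags at lane 0.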

llvm/test/CodeGen/X86/vselect-constants.ll

Lines changed: 18 additions & 0 deletions
@@ -302,3 +302,21 @@ define i32 @wrong_min_signbits(<2 x i16> %x) {
   %t1 = bitcast <2 x i16> %sel to i32
   ret i32 %t1
 }
+
+define i32 @pr129181() {
+; SSE-LABEL: pr129181:
+; SSE:       # %bb.0: # %entry
+; SSE-NEXT:    xorl %eax, %eax
+; SSE-NEXT:    retq
+;
+; AVX-LABEL: pr129181:
+; AVX:       # %bb.0: # %entry
+; AVX-NEXT:    xorl %eax, %eax
+; AVX-NEXT:    retq
+entry:
+  %x = insertelement <4 x i32> zeroinitializer, i32 0, i32 0
+  %cmp = icmp ult <4 x i32> %x, splat (i32 1)
+  %sel = select <4 x i1> %cmp, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 0, i32 1, i32 poison>
+  %reduce = tail call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %sel)
+  ret i32 %reduce
+}
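As a cross-check on the expected `xorl %eax, %eax`, the test body can be evaluated by hand. The sketch below does this in plain C++; `evalPr129181` is a hypothetical helper name, and the poison lane is represented by an arbitrary marker value that is only safe because that lane is never selected.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Evaluates the IR of @pr129181 lane by lane (illustrative model only).
constexpr std::uint32_t evalPr129181() {
  std::array<std::uint32_t, 4> X{}; // zeroinitializer
  X[0] = 0;                         // insertelement ..., i32 0, i32 0
  // False operand <0, 0, 1, poison>. The last entry is an arbitrary marker
  // standing in for poison; it must never contribute to the result.
  const std::array<std::uint32_t, 4> FalseVal{{0u, 0u, 1u, 0xDEADBEEFu}};
  std::uint32_t Sum = 0;
  for (std::size_t I = 0; I != 4; ++I) {
    const bool Cmp = X[I] < 1u;     // icmp ult %x, splat (i32 1)
    Sum += Cmp ? 0u : FalseVal[I];  // select picks zeroinitializer when true
  }
  return Sum;                       // llvm.vector.reduce.add
}
```

Every lane of `%x` is zero, so the comparison is true everywhere, the select yields all zeros, and the reduction is zero, which matches the constant the CHECK lines expect.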
