CPU PJRT: reduce max of NaNs different if the value is a constant or if the value comes from a parameter #21461
Description
I know, it's an odd bug, and it only happens in the CPU PJRT (compiled last month, see commit hash). With the CUDA PJRT it works as expected. I found it while writing some edge-case unit tests.
Since I maintain the Go wrapper to XlaBuilder (see GoMLX), here are the two versions in Go, one passing the input as a constant and the other as a parameter -- the code should be easy to read. The `HloModuleProto` (less readable) for both cases is included at the bottom.
```go
// It works if input is passed as a constant:
{
    got := ExecOnce(backend, func(g *Graph) *Node {
        return ReduceMax(Const(g, []float64{math.NaN(), 1}))
    })
    // got is NaN as expected
}
```
Now the buggy version, which returns 1 (it should return `NaN`) when the `NaN` is passed as a parameter -- so it cannot be optimized away as a constant:
```go
got := ExecOnce(backend, func(x *Node) *Node {
    return ReduceMax(x)
}, []float64{math.NaN(), 1})
// got is 1 !?!?!
```
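To spell out why `NaN` is the expected result: the reduce over `{NaN, 1}` with init `-inf` unrolls to `max(max(-inf, NaN), 1)`, and -- as the constant-folded and CUDA paths above confirm -- `maximum` is expected to propagate `NaN`. Below is a minimal plain-Go sketch of that fold (standalone, not using GoMLX); Go's `math.Max` has the same NaN-propagating semantics:

```go
package main

import (
    "fmt"
    "math"
)

func main() {
    // reduce({NaN, 1}, init=-inf, maximum) unrolls to
    // max(max(-inf, NaN), 1).
    acc := math.Inf(-1) // the -inf init value (constant.2 in the HLO below)
    for _, v := range []float64{math.NaN(), 1} {
        acc = math.Max(acc, v) // math.Max(NaN, x) == NaN
    }
    fmt.Println(acc) // prints NaN -- the expected ReduceMax result
}
```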
Here is the resulting `HloModuleProto` of the first version (input is a constant), using `float32`, where `ReduceMax({NaN, 1}) == NaN` as expected:
name: "TestReduce-ReduceMax with NaN as constant.8"
entry_computation_name: "TestReduce-ReduceMax with NaN as constant.8"
computations {
name: "#_ReduceMaxType_Float32.3"
instructions {
name: "lhs.4"
opcode: "parameter"
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
metadata {
}
id: 4
frontend_attributes {
}
}
instructions {
name: "rhs.5"
opcode: "parameter"
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
metadata {
}
parameter_number: 1
id: 5
frontend_attributes {
}
}
instructions {
name: "maximum.6"
opcode: "maximum"
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
metadata {
}
id: 6
operand_ids: 4
operand_ids: 5
frontend_attributes {
}
}
program_shape {
parameters {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
parameters {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
result {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
parameter_names: "lhs"
parameter_names: "rhs"
}
id: 3
root_id: 6
}
computations {
name: "TestReduce-ReduceMax with NaN as constant.8"
instructions {
name: "constant.1"
opcode: "constant"
shape {
element_type: F32
dimensions: 2
layout {
minor_to_major: 0
tail_padding_alignment_in_elements: 1
}
is_dynamic_dimension: false
}
metadata {
}
literal {
shape {
element_type: F32
dimensions: 2
layout {
minor_to_major: 0
tail_padding_alignment_in_elements: 1
}
is_dynamic_dimension: false
}
f32s: nan
f32s: 1
}
id: 1
frontend_attributes {
}
}
instructions {
name: "constant.2"
opcode: "constant"
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
metadata {
}
literal {
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
f32s: -inf
}
id: 2
frontend_attributes {
}
}
instructions {
name: "reduce.7"
opcode: "reduce"
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
metadata {
}
dimensions: 0
id: 7
operand_ids: 1
operand_ids: 2
called_computation_ids: 3
frontend_attributes {
}
}
program_shape {
result {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
}
id: 8
root_id: 7
}
host_program_shape {
result {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
}
id: 8
entry_computation_id: 8
And here is the `HloModuleProto` for the version where the input is a parameter, so it cannot be constant-folded away, and where it returns 1 (that is, `ReduceMax({NaN, 1}) == 1`!?):
name: "TestReduce-ReduceMax with NaN as parameter.8"
entry_computation_name: "TestReduce-ReduceMax with NaN as parameter.8"
computations {
name: "#_ReduceMaxType_Float32.3"
instructions {
name: "lhs.4"
opcode: "parameter"
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
metadata {
}
id: 4
frontend_attributes {
}
}
instructions {
name: "rhs.5"
opcode: "parameter"
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
metadata {
}
parameter_number: 1
id: 5
frontend_attributes {
}
}
instructions {
name: "maximum.6"
opcode: "maximum"
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
metadata {
}
id: 6
operand_ids: 4
operand_ids: 5
frontend_attributes {
}
}
program_shape {
parameters {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
parameters {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
result {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
parameter_names: "lhs"
parameter_names: "rhs"
}
id: 3
root_id: 6
}
computations {
name: "TestReduce-ReduceMax with NaN as parameter.8"
instructions {
name: "x.1"
opcode: "parameter"
shape {
element_type: F32
dimensions: 2
layout {
minor_to_major: 0
tail_padding_alignment_in_elements: 1
}
is_dynamic_dimension: false
}
metadata {
}
id: 1
frontend_attributes {
}
}
instructions {
name: "constant.2"
opcode: "constant"
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
metadata {
}
literal {
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
f32s: -inf
}
id: 2
frontend_attributes {
}
}
instructions {
name: "reduce.7"
opcode: "reduce"
shape {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
metadata {
}
dimensions: 0
id: 7
operand_ids: 1
operand_ids: 2
called_computation_ids: 3
frontend_attributes {
}
}
program_shape {
parameters {
element_type: F32
dimensions: 2
layout {
minor_to_major: 0
tail_padding_alignment_in_elements: 1
}
is_dynamic_dimension: false
}
result {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
parameter_names: "x"
}
id: 8
root_id: 7
}
host_program_shape {
parameters {
element_type: F32
dimensions: 2
layout {
minor_to_major: 0
tail_padding_alignment_in_elements: 1
}
is_dynamic_dimension: false
}
result {
element_type: F32
layout {
tail_padding_alignment_in_elements: 1
}
}
parameter_names: "x"
}
id: 8
entry_computation_id: 8
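Purely as a speculative illustration (this is a guess at the mechanism, not taken from the XLA source): a comparison-select lowering of `maximum`, i.e. `a > b ? a : b`, silently drops `NaN`, because every comparison with `NaN` is false. If the CPU codegen path lowers `maximum` that way while the constant folder propagates `NaN`, the two paths would disagree exactly as observed. The `cmpMax` helper below is hypothetical:

```go
package main

import (
    "fmt"
    "math"
)

// cmpMax is a hypothetical stand-in for a lowering of maximum
// as "a > b ? a : b", which silently drops NaN.
func cmpMax(a, b float64) float64 {
    if a > b { // false whenever a is NaN
        return a
    }
    return b
}

func main() {
    acc := math.Inf(-1) // the -inf init value
    for _, v := range []float64{math.NaN(), 1} {
        acc = cmpMax(acc, v)
    }
    fmt.Println(acc) // prints 1 -- matching the buggy CPU result
}
```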