count distinct on NaN produces incorrect results #1238

@andygrove

Describe the bug

Since #1209, we fall back to columnar shuffle in some cases where native shuffle is not supported, rather than falling back to Spark. This caused failures in some Spark SQL tests, including the test below, which now produces incorrect results. This may be a bug in columnar shuffle that we have not seen before.

The failing test is SPARK-32038: NormalizeFloatingNumbers should work on distinct aggregate.
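For context, here is a minimal sketch of why floating-point normalization matters for a distinct aggregate. It is not the Spark test itself and not a confirmed root cause. It only illustrates that NaN values with different bit patterns, and 0.0 versus -0.0, look like different keys when compared by raw bytes. The helper names (`bits`, `from_bits`, `normalize`, `count_distinct_by_bits`) are hypothetical and exist only for this illustration.

```python
import struct

def bits(x: float) -> int:
    """Return the raw IEEE 754 bit pattern of a 64-bit float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def from_bits(b: int) -> float:
    """Build a 64-bit float from a raw bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]

# One canonical NaN to map every NaN onto (illustrative choice).
CANONICAL_NAN = from_bits(0x7FF8000000000000)

def normalize(x: float) -> float:
    """Map every NaN to one canonical NaN and -0.0 to 0.0."""
    if x != x:          # only NaN is unequal to itself
        return CANONICAL_NAN
    if x == 0.0:        # true for both 0.0 and -0.0
        return 0.0
    return x

def count_distinct_by_bits(values, normalized: bool) -> int:
    """Count distinct values using the raw bit pattern as the key,
    the way a byte-wise hash or comparison would."""
    keys = set()
    for v in values:
        if normalized:
            v = normalize(v)
        keys.add(bits(v))
    return len(keys)

values = [
    from_bits(0x7FF8000000000000),  # NaN, sign bit clear
    from_bits(0xFFF8000000000000),  # NaN, sign bit set
    0.0,
    -0.0,
    1.0,
]

raw_count = count_distinct_by_bits(values, normalized=False)
normalized_count = count_distinct_by_bits(values, normalized=True)
print(raw_count, normalized_count)
```

Without normalization, the two NaN bit patterns and the two zeros are counted as separate keys. With normalization they collapse, which is the behavior the SPARK-32038 test name suggests a distinct aggregate should have.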

[Screenshot: 2025-01-08_15-36]

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

Labels

bug (Something isn't working)
