yhuang-db
Contributor

What changes were proposed in this pull request?

This PR adds a nullCounter alongside the Frequent Items sketch in the approx_top_k aggregation. With this change, the function returns NULL as an item, together with its count, whenever NULL is among the top k most frequent items.
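
The idea of pairing a sketch with a separate null counter can be illustrated with a small, self-contained Python sketch. This is not the Spark implementation: an exact `collections.Counter` stands in for the Frequent Items sketch, and the names `TopKWithNulls`, `null_count`, and `approx_top_k_demo` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Optional, Tuple


class TopKWithNulls:
    """Toy aggregation buffer: a Counter (stand-in for the sketch) plus a null counter."""

    def __init__(self) -> None:
        self.item_counts: Counter = Counter()  # holds non-null items only
        self.null_count: int = 0               # NULLs are counted outside the "sketch"

    def update(self, value: Optional[Hashable]) -> None:
        # Route NULL (None) to the dedicated counter, everything else to the sketch.
        if value is None:
            self.null_count += 1
        else:
            self.item_counts[value] += 1

    def merge(self, other: "TopKWithNulls") -> None:
        # Combining partial aggregates must combine both pieces of state.
        self.item_counts.update(other.item_counts)
        self.null_count += other.null_count

    def top_k(self, k: int) -> List[Tuple[Optional[Hashable], int]]:
        # Treat NULL as one more candidate, then rank all candidates by count.
        candidates = list(self.item_counts.items())
        if self.null_count > 0:
            candidates.append((None, self.null_count))
        candidates.sort(key=lambda pair: pair[1], reverse=True)
        return candidates[:k]


def approx_top_k_demo(values: Iterable[Optional[Hashable]], k: int):
    agg = TopKWithNulls()
    for v in values:
        agg.update(v)
    return agg.top_k(k)


if __name__ == "__main__":
    print(approx_top_k_demo(["a", None, "b", None, "a", None], 2))
```

Keeping the null count outside the sketch means the sketch itself never has to represent a NULL item. NULL only competes with the other items when the final result is assembled.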

Why are the changes needed?

NULL values can be meaningful in some use cases, and users may want NULL to be included in the approx_top_k output.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

New unit tests covering the handling of NULL values.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Oct 18, 2025
@HyukjinKwon HyukjinKwon changed the title from "[SPARK-53947][SQL]Count null in approx_top_k" to "[SPARK-53947][SQL] Count null in approx_top_k" Oct 20, 2025