Add fuzz tests for SortMergeJoin spilling  #11541

Open
@comphead

Description

On testing: given how subtle the code involved is, I am not 100% sure it handles every corner case. As a follow-on, I suggest we invest in fuzz testing, both for sort-merge join (SMJ) in general and for spilling SMJ in particular.

https://github.com/apache/datafusion/blob/6c0e4fb5d9ac7a0a2f2b91f8b88d21f0bc0b4424/datafusion/core/tests/fuzz_cases/join_fuzz.rs#L50-L49

In particular, I think we should adjust the random inputs so that they contain varying numbers of repeated values, since the code in this PR is only exercised when many rows share the same join key.

Originally posted by @alamb in #11218 (review)
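
To make the suggestion about repeated join keys concrete, here is a minimal, standalone sketch of the idea. It is not the existing `join_fuzz.rs` harness and uses no DataFusion APIs: every name in it (`XorShift`, `gen_keys`, `nested_loop_join`, `sort_merge_join`, `fuzz_duplicate_keys`) is hypothetical. It assumes the fuzz test controls duplication through the number of distinct key values, and it compares a toy sort-merge join against a nested-loop reference. A real test would presumably also vary the memory limit so that the spilling path is hit.

```rust
/// Tiny deterministic PRNG (xorshift64) so the sketch needs no external crates.
/// Hypothetical helper, not part of DataFusion.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Generate `rows` join keys drawn from `distinct` possible values.
/// A small `distinct` relative to `rows` forces many repeated keys.
fn gen_keys(rng: &mut XorShift, rows: usize, distinct: u64) -> Vec<u64> {
    (0..rows).map(|_| rng.next() % distinct).collect()
}

/// Reference inner join: compare every left row with every right row.
/// Returns sorted (left_index, right_index) pairs.
fn nested_loop_join(left: &[u64], right: &[u64]) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    for (i, l) in left.iter().enumerate() {
        for (j, r) in right.iter().enumerate() {
            if l == r {
                out.push((i, j));
            }
        }
    }
    out.sort();
    out
}

/// Toy sort-merge inner join standing in for the implementation under test.
fn sort_merge_join(left: &[u64], right: &[u64]) -> Vec<(usize, usize)> {
    // Pair each key with its original row index, then sort by key.
    let mut l: Vec<(u64, usize)> = left.iter().enumerate().map(|(i, &k)| (k, i)).collect();
    let mut r: Vec<(u64, usize)> = right.iter().enumerate().map(|(i, &k)| (k, i)).collect();
    l.sort();
    r.sort();

    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < l.len() && j < r.len() {
        if l[i].0 < r[j].0 {
            i += 1;
        } else if l[i].0 > r[j].0 {
            j += 1;
        } else {
            // Equal keys: find the full run on each side and emit the cross product.
            let key = l[i].0;
            let i_end = i + l[i..].iter().take_while(|e| e.0 == key).count();
            let j_end = j + r[j..].iter().take_while(|e| e.0 == key).count();
            for a in &l[i..i_end] {
                for b in &r[j..j_end] {
                    out.push((a.1, b.1));
                }
            }
            i = i_end;
            j = j_end;
        }
    }
    out.sort();
    out
}

/// Run the comparison over a grid of row counts and key cardinalities.
/// Returns the number of cases checked; panics on the first mismatch.
fn fuzz_duplicate_keys() -> usize {
    let mut rng = XorShift(0x9E37_79B9_7F4A_7C15);
    let mut cases = 0;
    // distinct = 1 means every row shares one key; 1000 means few repeats.
    for &distinct in &[1u64, 2, 10, 1000] {
        for &rows in &[0usize, 1, 50, 200] {
            let left = gen_keys(&mut rng, rows, distinct);
            let right = gen_keys(&mut rng, rows, distinct);
            assert_eq!(
                sort_merge_join(&left, &right),
                nested_loop_join(&left, &right),
                "mismatch for rows={rows}, distinct={distinct}"
            );
            cases += 1;
        }
    }
    cases
}
```

`fuzz_duplicate_keys` can be called from a `#[test]` function. The point of the grid is that low-cardinality cases (`distinct` of 1 or 2) produce long runs of equal keys, which is the situation described above as the one that exercises the new code.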