feat: N-ary CASE WHEN expressions #20
Merged
This pull request updates how n-ary (multi-condition) `CASE WHEN` expressions are handled in both benchmarks and DataFusion expression conversion, moving away from nested binary implementations to a flat, more scalable approach. It also expands benchmark coverage for large numbers of conditions and adds comprehensive tests for these scenarios.

N-ary CASE WHEN support and benchmarks:
- Updated `case_when_bench.rs` to use the n-ary `case_when` API instead of the old nested binary form, and increased the tested array sizes for more realistic performance evaluation. New benchmarks were added for 10 and 100 condition cases. [1] [2] [3] [4]
- Moved away from `nested_case_when` in favor of `case_when`, reflecting the new preferred implementation.

DataFusion integration improvements:
- The DataFusion expression conversion now produces flat `case_when` expressions (and `case_when_no_else` when there's no ELSE clause) instead of building nested binary trees. This simplifies the logic and improves performance for many-condition cases. [1] [2]

Expanded test coverage:
- Added tests for `CASE WHEN` expressions with multiple conditions and for `CASE WHEN` expressions without an ELSE clause, ensuring correct handling and nullability semantics.

These changes make the implementation more efficient and robust, especially when handling complex `CASE WHEN` expressions with many conditions.
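
To illustrate the difference between the two shapes, here is a minimal sketch in Rust. It is not this project's code: `Branch`, `CaseWhen`, `Nested`, and `to_nested` are hypothetical names, and the model evaluates one `i64` row at a time rather than whole arrays. It only shows that a flat list of arms with an optional ELSE gives the same result as a chain of nested binary conditionals, and that a missing ELSE yields a null (`None`) when no arm matches.

```rust
// Hypothetical, simplified model for illustration only; these are not the
// project's real types or its `case_when` API.

/// One `WHEN cond THEN value` arm over a single i64 input.
struct Branch {
    cond: fn(i64) -> bool,
    then: i64,
}

/// Flat n-ary CASE WHEN: all arms in one list, plus an optional ELSE value.
struct CaseWhen {
    branches: Vec<Branch>,
    otherwise: Option<i64>,
}

impl CaseWhen {
    /// Returns the value of the first matching arm. Falls back to the ELSE
    /// value, or to `None` (standing in for SQL NULL) when there is no ELSE.
    fn eval(&self, x: i64) -> Option<i64> {
        for b in &self.branches {
            if (b.cond)(x) {
                return Some(b.then);
            }
        }
        self.otherwise
    }
}

/// Nested binary form: each node holds one condition and the rest of the chain.
enum Nested {
    Value(Option<i64>),
    If {
        cond: fn(i64) -> bool,
        then: i64,
        otherwise: Box<Nested>,
    },
}

impl Nested {
    /// Walks the chain recursively until a condition matches or the tail is reached.
    fn eval(&self, x: i64) -> Option<i64> {
        match self {
            Nested::Value(v) => *v,
            Nested::If { cond, then, otherwise } => {
                if (*cond)(x) {
                    Some(*then)
                } else {
                    otherwise.eval(x)
                }
            }
        }
    }

    /// Number of nested conditionals; grows linearly with the number of arms.
    fn depth(&self) -> usize {
        match self {
            Nested::Value(_) => 0,
            Nested::If { otherwise, .. } => 1 + otherwise.depth(),
        }
    }
}

/// Folds the flat form into the equivalent nested chain, last arm innermost.
fn to_nested(case: &CaseWhen) -> Nested {
    let mut acc = Nested::Value(case.otherwise);
    for b in case.branches.iter().rev() {
        acc = Nested::If {
            cond: b.cond,
            then: b.then,
            otherwise: Box::new(acc),
        };
    }
    acc
}
```

In this sketch the nested form's depth equals the number of arms, so a 100-condition expression becomes a 100-level tree, while the flat form stays a single node with a 100-element list. That is the structural motivation for the flat approach; the performance effect in the real implementation is whatever the benchmarks in this PR measure.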