Is your feature request related to a problem or challenge?
FileScanConfig in datafusion/datasource/src/file_scan_config.rs has grown large after the sort pushdown optimization (#21182) added statistics-based file sorting, non-overlapping validation, and NULL handling logic.
As noted by @alamb in #21182 (comment):
> As a follow on PR it might be nice to figure out how to move some of this code out of FileScanConfig and into some other smaller module
Describe the solution you'd like
Extract the sort-pushdown-related code from FileScanConfig into a dedicated module, e.g. datafusion/datasource/src/sort_pushdown.rs:
- try_pushdown_sort()
- rebuild_with_source()
- try_sort_file_groups_by_statistics()
- sort_files_within_groups_by_statistics()
- any_file_has_nulls_in_sort_columns()
- Related helper functions and types (SortedFileGroups, etc.)
This is a pure refactor with no behavior changes.
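To make the intended shape concrete, here is a rough, self-contained sketch of the delegation pattern. It does not use the real DataFusion types or signatures: `FileStats`, the single `i64` sort column, the simplified `FileScanConfig`, and the free-function names are all hypothetical stand-ins, and the NULL and overlap rules are toy simplifications. The only point is that the statistics logic lives in a `sort_pushdown` module while `FileScanConfig` keeps a thin wrapper.

```rust
// Hypothetical stand-ins only; not the real DataFusion types or signatures.
mod sort_pushdown {
    /// Toy per-file statistics for a single integer sort column (assumption).
    #[derive(Debug, Clone, PartialEq)]
    pub struct FileStats {
        pub name: String,
        pub min: i64,
        pub max: i64,
        pub null_count: usize,
    }

    /// Returns true if any file reports NULLs in the sort column.
    pub fn any_file_has_nulls(files: &[FileStats]) -> bool {
        files.iter().any(|f| f.null_count > 0)
    }

    /// Orders files by their minimum value and returns the ordering only if
    /// no file has NULLs and consecutive ranges do not overlap.
    /// Toy rules: any NULL rejects the pushdown, and touching boundaries
    /// (max == next min) are treated as non-overlapping.
    pub fn try_sort_files_by_statistics(files: &[FileStats]) -> Option<Vec<FileStats>> {
        if any_file_has_nulls(files) {
            return None;
        }
        let mut sorted = files.to_vec();
        sorted.sort_by_key(|f| f.min);
        let non_overlapping = sorted.windows(2).all(|w| w[0].max <= w[1].min);
        non_overlapping.then_some(sorted)
    }
}

pub use sort_pushdown::FileStats;

/// Simplified stand-in for the real config struct.
pub struct FileScanConfig {
    pub files: Vec<FileStats>,
}

impl FileScanConfig {
    /// Thin wrapper: the logic lives in `sort_pushdown`, so callers of
    /// this method see no behavior change after the move.
    pub fn try_pushdown_sort(&self) -> Option<FileScanConfig> {
        sort_pushdown::try_sort_files_by_statistics(&self.files)
            .map(|files| FileScanConfig { files })
    }
}
```

Keeping the public entry point on `FileScanConfig` and moving only the bodies is one way to keep the change a pure refactor; whether the real helpers end up as free functions or remain methods in a separate `impl` block is an open design choice.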
Related issues: