Description
Currently, format-specific scan options are embedded as members of the corresponding subclass of FileFormat. Extracting these into an options struct would provide better separation of concerns. As it stands, the only way to scan a Parquet-formatted dataset with different options is to reconstruct the dataset from its component files using a differently configured format.
CsvFileFormat could retain ParseOptions as a member, since (for example) tab-separated vs comma-separated values can justifiably be considered different formats.
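
A minimal sketch of the proposed separation, for illustration only. None of the names below (`ParquetScanOptions`, `CsvScanOptions`, `CsvParseOptions`, `ScanPlan`, `PlanScan`) are the Arrow Dataset API; they are hypothetical stand-ins. The assumption is that a format object keeps only what defines the format (for CSV, the delimiter), while per-scan tuning lives in a separate struct that can be swapped for each scan.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical per-scan options, kept outside the format object.
struct ParquetScanOptions {
  bool use_buffered_stream = false;
  int64_t buffer_size = 8192;
};

// Hypothetical description of how a single file would be scanned.
struct ScanPlan {
  std::string path;
  bool buffered;
  int64_t buffer_size;
};

// Hypothetical stand-in for a Parquet FileFormat subclass. It holds only
// default scan options; callers may override them per scan.
class ParquetFormat {
 public:
  ScanPlan PlanScan(const std::string& path,
                    const ParquetScanOptions* overrides = nullptr) const {
    const ParquetScanOptions& opts = overrides ? *overrides : defaults_;
    assert(opts.buffer_size > 0);
    return ScanPlan{path, opts.use_buffered_stream, opts.buffer_size};
  }

 private:
  ParquetScanOptions defaults_;
};

// Options that arguably define the format itself stay on the format.
struct CsvParseOptions {
  char delimiter = ',';
};

// Hypothetical per-scan CSV tuning, separate from the parse options.
struct CsvScanOptions {
  int32_t block_size = 1 << 20;
};

// Hypothetical stand-in for CsvFileFormat: the delimiter is part of the
// format's identity, while block size is supplied per scan.
class CsvFormat {
 public:
  explicit CsvFormat(CsvParseOptions parse) : parse_(parse) {}

  char delimiter() const { return parse_.delimiter; }

  int32_t EffectiveBlockSize(const CsvScanOptions* overrides = nullptr) const {
    return overrides ? overrides->block_size : CsvScanOptions{}.block_size;
  }

 private:
  CsvParseOptions parse_;
};
```

With this shape, the same `ParquetFormat` instance can serve two scans with different buffering by passing a different `ParquetScanOptions` to each call, so the dataset does not need to be rebuilt. A tab-separated `CsvFormat` and a comma-separated one remain distinct format objects.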
Reporter: Ben Kietzman / @bkietz
Assignee: David Li / @lidavidm
Related issues:
- [C++][Dataset] Add ConvertOptions and ReadOptions to CsvFileFormat (relates to)
- [GLib] Add CsvFragmentScanOption support (is related to)
- [R] Accept format-specific scan options in collect() (is related to)
- [C++][Dataset] Extract IpcFragmentScanOptions, ParquetFragmentScanOptions (is related to)
PRs and other links:
Note: This issue was originally created as ARROW-9749. Please see the migration documentation for further details.