`ParquetRecordBatchStreamBuilder::new()` panics instead of erroring out when opening a corrupted file #5315
Labels: enhancement, good first issue, help wanted, parquet
When opening a corrupted Parquet file where one of the row groups is missing a column, `ParquetRecordBatchStreamBuilder::new()` panics instead of returning an error.

This is due to `RowGroupMetaData::from_thrift()` validating column counts using the `assert_eq!()` macro (arrow-rs/parquet/src/file/metadata.rs, line 352 in 639e81e).
Could the check be replaced by an `if` statement returning an error? If that sounds OK, I can prepare a PR for review.

As a workaround, it is possible to unwind the panic and handle it almost like a regular error. Unwinding async code is not straightforward, though, and it would be simpler to allow consistent error handling.
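To make the proposal and the workaround concrete, here is a minimal sketch using only the standard library. It is not the actual arrow-rs code: `MetadataError`, `check_columns_panicking`, `check_columns`, and `check_columns_via_unwind` are hypothetical names that stand in for the parquet crate's error type and the column-count check in `RowGroupMetaData::from_thrift()`.

```rust
use std::fmt;
use std::panic;

/// Hypothetical stand-in for the parquet crate's error type.
#[derive(Debug, PartialEq)]
pub struct MetadataError(pub String);

impl fmt::Display for MetadataError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for MetadataError {}

/// Sketch of the current behaviour: a mismatch aborts the caller with a panic.
pub fn check_columns_panicking(expected: usize, actual: usize) {
    assert_eq!(expected, actual);
}

/// Sketch of the proposed behaviour: a plain `if` that returns an error,
/// so callers can handle a corrupted file with `?` or `match`.
pub fn check_columns(expected: usize, actual: usize) -> Result<(), MetadataError> {
    if expected != actual {
        return Err(MetadataError(format!(
            "column count mismatch: schema has {} columns, row group has {}",
            expected, actual
        )));
    }
    Ok(())
}

/// Sketch of the workaround for synchronous code: catch the unwinding panic
/// and convert it into an error. This only works when the binary is built
/// with the default `panic = "unwind"` strategy, and the default panic hook
/// still prints the panic message to stderr.
pub fn check_columns_via_unwind(expected: usize, actual: usize) -> Result<(), MetadataError> {
    panic::catch_unwind(move || check_columns_panicking(expected, actual))
        .map_err(|_| MetadataError("metadata validation panicked".to_string()))
}
```

With these definitions, `check_columns(3, 2)` returns an `Err` carrying a descriptive message, while `check_columns_via_unwind(3, 2)` can only report that a panic happened. That loss of detail, plus the extra care needed to catch panics across `.await` points, is why an error-returning check is the simpler option.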