Predicate push-down into parquet broken for Date32 columns #649

@yordan-pavlov

Describe the bug
Earlier this week I found that predicate push-down into parquet for Date32 columns was broken in PR #426.

The cause is missing branches in impl TryFrom<&DataType> for ScalarValue (https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/scalar.rs#L924), which is used by get_min_max_values (https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/physical_plan/parquet.rs#L508).

I also found that adding the following lines into the try_from method resolves the issue:

DataType::Date32 => ScalarValue::Date32(None),
DataType::Date64 => ScalarValue::Date64(None),
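
To show where these two lines fit, here is a minimal, self-contained sketch. It does not use the real arrow or DataFusion crates: the `DataType` and `ScalarValue` enums below are simplified stand-ins with only a few variants, the `String` error type is an assumption, and `Boolean` is used purely as an example of a type that has no branch. The point is only that a data type without a matching branch falls through to the error arm, while Date32 and Date64 now map to a typed null scalar.

```rust
use std::convert::TryFrom;

// Simplified stand-in for arrow's DataType (the real enum has many more variants).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    Date32,
    Date64,
    Boolean, // illustrative: a type with no branch in try_from below
}

// Simplified stand-in for DataFusion's ScalarValue.
#[derive(Debug, Clone, PartialEq)]
enum ScalarValue {
    Int32(Option<i32>),
    Utf8(Option<String>),
    Date32(Option<i32>), // days since the UNIX epoch
    Date64(Option<i64>), // milliseconds since the UNIX epoch
}

impl TryFrom<&DataType> for ScalarValue {
    // Assumption: a plain String error keeps the sketch dependency-free.
    type Error = String;

    fn try_from(datatype: &DataType) -> Result<Self, Self::Error> {
        Ok(match datatype {
            DataType::Int32 => ScalarValue::Int32(None),
            DataType::Utf8 => ScalarValue::Utf8(None),
            // The two branches proposed in this issue:
            DataType::Date32 => ScalarValue::Date32(None),
            DataType::Date64 => ScalarValue::Date64(None),
            // Any type without a branch ends up here.
            other => {
                return Err(format!("Can't create a scalar of type {:?}", other));
            }
        })
    }
}
```

With the two Date branches removed, `ScalarValue::try_from(&DataType::Date32)` would hit the error arm in this sketch, which mirrors the failure described above.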

To Reproduce

  • Filter on a Date32 column in a parquet data source.
  • The statistics column(s) generated for the filtered Date32 column(s) will be all null.

Expected behavior
Statistics column(s) generated for Date32 columns from a parquet data source should not be all null

Additional context
n/a
