Add with_virtual_columns to ParquetSource for reading virtual columns #20132

@niebayes

I found that arrow-rs has supported reading virtual columns, including RowGroupIndex and RowNumber, since v57.0.0.
I propose adding a with_virtual_columns method to ParquetSource so that users can choose which virtual columns to read.
Virtual columns such as RowNumber could be used to implement prewhere, late-materialized top-k, secondary indexes, and more.

Virtual column support in arrow-rs: apache/arrow-rs#8715
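
To illustrate why a RowNumber column helps with late-materialized top-k, here is a minimal sketch in plain Rust. It does not use the arrow-rs or DataFusion APIs: in-memory slices stand in for Parquet columns, and the function names `top_k_row_numbers` and `materialize` are hypothetical, chosen only for this example. The idea is to scan just the sort key together with its row numbers, pick the winning row numbers, and only then fetch the wide columns for those rows.

```rust
// Hypothetical sketch: plain slices stand in for Parquet column data.
// The index produced by `enumerate` plays the role of the RowNumber
// virtual column; nothing here is an arrow-rs or DataFusion API.

/// Phase 1: scan only the sort key, pair each value with its row number,
/// and return the row numbers of the `k` largest keys.
fn top_k_row_numbers(sort_key: &[i64], k: usize) -> Vec<u64> {
    let mut keyed: Vec<(i64, u64)> = sort_key
        .iter()
        .enumerate()
        .map(|(row_number, &key)| (key, row_number as u64))
        .collect();
    // Descending by key; ties are broken by ascending row number.
    keyed.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.cmp(&b.1)));
    keyed.into_iter().take(k).map(|(_, row_number)| row_number).collect()
}

/// Phase 2: materialize the (potentially wide) payload column only for
/// the rows selected in phase 1.
fn materialize<'a>(payload: &'a [String], row_numbers: &[u64]) -> Vec<&'a str> {
    row_numbers
        .iter()
        .map(|&row_number| payload[row_number as usize].as_str())
        .collect()
}
```

A caller would first run `top_k_row_numbers` over the key column and then pass the result to `materialize`, so the payload column is touched for at most `k` rows. A real implementation would presumably use a bounded heap rather than a full sort and would translate row numbers into a Parquet row selection; both are left out here to keep the sketch short.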