I have found that arrow-rs has supported reading virtual columns, including RowGroupIndex and RowNumber, since v57.0.0.
I propose adding a with_virtual_columns method to ParquetSource so that users can choose which virtual columns are read.
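To make the proposal concrete, here is a minimal sketch of what the builder-style method could look like. Everything in it is an assumption for illustration: the `VirtualColumn` enum, the simplified `ParquetSource` struct, and the `virtual_columns` accessor are stand-ins, not the existing DataFusion or arrow-rs types, and the real signature would likely need to follow whatever arrow-rs exposes.

```rust
// Hypothetical sketch: these types are stand-ins, NOT the real DataFusion or arrow-rs API.

/// Virtual columns named in this proposal (assumed enum, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VirtualColumn {
    RowNumber,
    RowGroupIndex,
}

/// Simplified stand-in for ParquetSource; the real struct has many more fields.
#[derive(Debug, Clone, Default)]
pub struct ParquetSource {
    virtual_columns: Vec<VirtualColumn>,
}

impl ParquetSource {
    /// Proposed builder-style method: select which virtual columns to read.
    pub fn with_virtual_columns(mut self, columns: Vec<VirtualColumn>) -> Self {
        self.virtual_columns = columns;
        self
    }

    /// Virtual columns the reader would be asked to append to each batch.
    pub fn virtual_columns(&self) -> &[VirtualColumn] {
        &self.virtual_columns
    }
}
```

With this shape, a caller would write something like `ParquetSource::default().with_virtual_columns(vec![VirtualColumn::RowNumber])`, and sources that never call the method would read no virtual columns, so existing behavior stays unchanged.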
Virtual columns such as RowNumber could be used to implement PREWHERE-style filtering, late-materialized top-k, secondary indexes, and more.
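As a rough illustration of the late-materialized top-k case, the sketch below runs in two phases over plain in-memory slices. It assumes the row number would come from the RowNumber virtual column in a real scan; here `enumerate()` stands in for it, and the function names `top_k_row_numbers` and `materialize` are hypothetical.

```rust
/// Phase 1: scan only the narrow sort key, paired with its row number,
/// and keep the row numbers of the k largest keys.
/// (In a real scan the row number would come from the RowNumber virtual column.)
pub fn top_k_row_numbers(keys: &[i64], k: usize) -> Vec<u64> {
    let mut indexed: Vec<(u64, i64)> = keys
        .iter()
        .enumerate()
        .map(|(row_number, &key)| (row_number as u64, key))
        .collect();
    // Sort by key descending; ties are broken by ascending row number.
    indexed.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    indexed
        .into_iter()
        .take(k)
        .map(|(row_number, _)| row_number)
        .collect()
}

/// Phase 2: materialize the wide column only for the selected rows.
pub fn materialize<'a>(wide_column: &'a [String], row_numbers: &[u64]) -> Vec<&'a str> {
    row_numbers
        .iter()
        .map(|&r| wide_column[r as usize].as_str())
        .collect()
}
```

The point of the split is that the wide column is touched only for the k winning rows, which is what makes the row number worth exposing to the query engine.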
Virtual column support in arrow-rs: apache/arrow-rs#8715