This is my weekly plan, mostly for my own organizational needs (as I am dropping too many things). I am making it public in the hope that it helps others see what I am working on -- also, I spend so much time in GitHub that the interface is very familiar to me and I can cross link all the issues I am working on
(it is also my excuse as to why I haven't reviewed many good looking PRs)
I will update this list as I run into more things
PR review queue (rough order)
- Normalize partitioned and flat object listing #18146
- add `try_new_with_length` constructor to `FixedSizeList` arrow-rs#8624
- Refactor distinct aggregate implementations to use common buffer #18348
- Fix: ViewType gc on huge batch would produce bad output arrow-rs#8694
- Refactor create_hashes to accept array references #18448
- Refactor InListExpr to store arrays and support structs #18449
- Refactor state management in `HashJoinExec` and use CASE expressions for more precise filters #18451
- Support Arrow IPC Stream Files #18457
- Add FilterBuilder::is_optimize_beneficial arrow-rs#8782
- Add `merge` and `merge_n` algorithms arrow-rs#8753
- CI: add `clippy::needless_pass_by_value` rule #18468
- Add relation planner extension support #17843
What am I personally working on now
- DataFusion meetup prep next week: DISCUSSION: DataFusion Meetup in Boston, USA - Nov 12, 2025 #16703
- Internal error: Physical input schema should be the same as the one converted from logical input schema. #18337 (and related fallout)
- Parquet push decoder: Rewrite `ParquetRecordBatchStream` in terms of the PushDecoder arrow-rs#8159
Projects I am supporting actively (high on my priority list)
- Next DF release: Release DataFusion `51.0.0` (Nov 2025) #17558 with @xudong963
- Adaptive parquet predicate pushdown evaluation faster with @hhhizzz [Parquet] Adaptive Parquet Predicate Pushdown arrow-rs#8733
- Low level arrow bit manipulation fast with @rluvaton feat: add `MutableBuffer::apply_unary_op` and `MutableBuffer::apply_binary_op` arrow-rs#8619
- DataFusion object store requests go faster with @BlakeOrth [EPIC] ListingTable object store usage improvements #17214
- Release object store 0.13.0: Release object store `0.13.0` (breaking) - Target Nov 2025 arrow-rs-object-store#367
- Improve DataFusion ClickBench performance: [EPIC] Make DataFusion the top of the ClickBench Parquet leaderboard #18489
Projects on my backlog
These are ones I would like to support but don't have the capacity at the moment to push, in relative order
- row numbers in parquet: Support file row number in Parquet reader arrow-rs#7299 with @jkylling and @vustef
- Help integrate Variant with @friendlymatthew [EPIC] Support `VARIANT` type for unstructured data #16116
- Epic: Join Order Enumeration #18249 from @NGA-TRAN
PRs that look great but need a thorough review (looking for help here 🎣 from anyone else)
- Examples of extending SQL syntax #17824 from @theirix
- external tables for multiple locations: feat(cli): support external tables on multiple locations #17702
- relation extension planner: Add relation planner extension support #17843
- writing REE arrays to parquet: Support writing RunEndEncoded as Parquet arrow-rs#8069