Closed
Description
This is an attempt to organize myself and make what I plan to work on more visible
Weekly High Level Goals (in order)
- Arrow release: Release arrow-rs / parquet major version `55.0.0` (Apr 2025) arrow-rs#7084
- Start working on testing the 47.0.0 release: Release DataFusion `47.0.0` (April 2025) #15072
- TopK Pushdown: Dynamic pruning filters from TopK state (optimize `ORDER BY LIMIT` queries) #15037 with @adriangb
- Avoid resorting/merging: [EPIC] Avoid sort for already sorted Parquet files that do not overlap values on condition #6672 with @wiedld @suremarc and @xudong963
- Get Enable parquet filter pushdown (`filter_pushdown`) by default #3463 ready for merge with @XiangpengHao
- Work on integrating tpch data generator with @clflushopt: Make it easier to run TPCH queries with datafusion-cli #14608
Other projects I plan to review
- Bug fixes
- Performance improvements
- Complete insta test migration: [Epic] Add snapshot tests (migrate to `insta` for tests) #15178 with @blaginin @shruti2522 @qstommyshu and others
- Hardening external sorts: A complete solution for stable and safe sort with spill #14692 with @2010YOUY01
- Set up Spark function library pattern: feat: Add `datafusion-spark` crate #15168 with @shehabgamin and @andygrove
- Use UTF8 view by default: Change mapping of SQL `VARCHAR` from `Utf8` to `Utf8View` #15096 with @zhuqi-lucas
Background
I am putting this list on GitHub because:
- I like how GitHub renders checklists w/ PR titles so it is easy to track (I currently have a local text file...)
- I thought others might be interested in seeing what I am doing / planning to do
- It makes me feel better that I don't have time to review all the PRs 😭
I am trying to prioritize PRs in the following order:
- Bug fixes
- Documentation / UX / API improvements (things that make DataFusion easier/better to work with)
- Performance improvements
- New features with wide appeal
- New functions
Note that new features and functions are deliberately at the bottom.