Co-authored-by: Yongting You <2010youy01@gmail.com>
Thank you @2010YOUY01

### Decimal32/Decimal64 support

The new Arrow types `Decimal32` and `Decimal64` are now supported in DataFusion
comphead left a comment
Thanks @alamb and @2010YOUY01, IMO it is LGTM.
Btw I found that release notes are now concise, easy to read and follow!
2010YOUY01 left a comment
Great blog post, thank you!
**Fewer object store round-trips for Parquet by default**

DataFusion now sets a default `metadata_size_hint` for [Apache Parquet] scans
([#18118]), avoiding the extra
“last 8‑byte” request many clouds require to read file footers. Remote scans
typically drop from five requests to four per file, cutting latency and transfer
costs without any application changes. Thanks to [zhuqi-lucas] for leading this
effort.

[apache parquet]: https://parquet.apache.org/
I think this is a duplicate to the below 'Better Defaults for Remote Parquet Reads' section.
There was a problem hiding this comment.
That is a great catch -- I consolidated them in 33e4375
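For readers following this thread, the round-trip saving described in the quoted paragraph can be sketched roughly as follows. This is a toy model of the Parquet footer layout (metadata, then a 4-byte little-endian length, then the `PAR1` magic), not DataFusion's actual reader. `CountingStore`, `read_metadata`, and `fake_file` are hypothetical names made up for illustration, and the `size_hint` argument only mimics the idea behind `metadata_size_hint`. The sketch covers the footer step alone, so it shows two requests becoming one rather than the full per-file count mentioned in the post.

```python
import struct

MAGIC = b"PAR1"
FOOTER_TAIL = 8  # 4-byte little-endian metadata length + 4-byte magic


class CountingStore:
    """Hypothetical stand-in for an object store that counts range requests."""

    def __init__(self, data: bytes):
        self.data = data
        self.requests = 0

    def get_range(self, start: int, end: int) -> bytes:
        self.requests += 1
        return self.data[start:end]


def read_metadata(store, file_size, size_hint=None):
    """Return the raw footer metadata, prefetching `size_hint` tail bytes if given."""
    fetch = max(FOOTER_TAIL, min(size_hint or FOOTER_TAIL, file_size))
    tail_start = file_size - fetch
    tail = store.get_range(tail_start, file_size)
    if tail[-4:] != MAGIC:
        raise ValueError("not a Parquet file")
    (meta_len,) = struct.unpack("<I", tail[-8:-4])
    meta_start = file_size - FOOTER_TAIL - meta_len
    if meta_start >= tail_start:
        # The prefetched tail already contains the metadata: no extra request.
        return tail[meta_start - tail_start:-FOOTER_TAIL]
    # Hint absent or too small: a second request fetches the metadata.
    return store.get_range(meta_start, file_size - FOOTER_TAIL)


def fake_file(metadata: bytes, body: bytes = b"x" * 1000) -> bytes:
    """Build a byte string shaped like a Parquet file (assumed toy contents)."""
    return MAGIC + body + metadata + struct.pack("<I", len(metadata)) + MAGIC


if __name__ == "__main__":
    data = fake_file(b"m" * 100)
    for hint in (None, 512):
        store = CountingStore(data)
        read_metadata(store, len(data), size_hint=hint)
        print(f"size_hint={hint}: {store.requests} request(s)")
```

When the hint is at least as large as the metadata plus the 8-byte tail, the first ranged read already covers the footer, which is where the saved round-trip comes from. A hint that is too small falls back to the two-request path.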
We are proud to announce the release of [DataFusion 51.0.0]. This post highlights
some of the major improvements since [DataFusion 50.0.0]. The complete list of
changes is available in the [changelog]. Thanks to the [128 contributors] for
Indeed -- I think this is the part of this blog post I am most proud of
I plan to publish this tomorrow, 2025-11-25. Please let me know if anyone wants more time to review or has any additional comments.
The blog post is now live: https://datafusion.apache.org/blog/2025/11/25/datafusion-51.0.0/
51.0.0 release datafusion#18548
See rendered preview: https://datafusion.staged.apache.org/blog/2025/11/25/datafusion-51.0.0/
For anyone curious, I asked codex to draft this PR with the following prompt. It did a pretty good job for the rough draft.
We are going to write a blog post for the DataFusion 51.0.0 release
We need to cover the major features in this release. If you are unsure of any content, please leave a "TODO" note in the text and we can fill it in later.
I have copied the old release post here as a starting point:
content/blog/2025-11-25-datafusion-51.0.0.md
Here are the PRs this release (approx based on dates) - https://github.com/apache/datafusion/pulls?q=is%3Apr+merged%3A2025-09-16..2025-11-08
The changelog is here: https://github.com/apache/datafusion/blob/branch-51/dev/changelog/51.0.0.md
The list of major features can be found here apache/datafusion#17558 under the section "Features to mention in the blog (if they make it)"
(please only include the ones that made it into the release, with a checkmark)