Closed
Description
Planning to release v4.0.1:
New Feature
- Initial support for the ORC input file format.
- Initial support for the RSS framework and the Apache Celeborn shuffle service.
Improvement
- Optimize AggExec by implementing columnar-based aggregation.
- Use a custom hashmap implementation for aggregation.
- Support specialized count(0).
- Optimize bloom filter by reusing the same bloom filter within an executor.
- Optimize bloom filter by supporting shrinking.
- Optimize reading Parquet files by supporting parallel reading.
- Improve spill file deletion logic.
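To make the "shrinking" item concrete, here is a minimal sketch of one common way to shrink a bloom filter: when bit positions are computed as `hash % num_bits`, the bit array can be folded in half by OR-ing its two halves without introducing false negatives. This is an illustration under stated assumptions, not the code from the PRs listed below; `ToyBloom`, `probe`, `shrink`, and the `max_fill` threshold are hypothetical names chosen for this example.

```rust
/// Toy bloom filter (hypothetical, for illustration only).
/// Assumption: each probe position is `hash % bits.len()`.
#[derive(Debug, Clone)]
struct ToyBloom {
    bits: Vec<bool>, // one bool per bit keeps the sketch readable
    num_hashes: u64,
}

impl ToyBloom {
    fn new(num_bits: usize, num_hashes: u64) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        Self { bits: vec![false; num_bits], num_hashes }
    }

    /// Derive the i-th probe hash from the key (simple double hashing).
    fn probe(key: u64, i: u64) -> u64 {
        let h1 = key.wrapping_mul(0x9E37_79B9_7F4A_7C15);
        let h2 = (key ^ (key >> 29)).wrapping_mul(0xBF58_476D_1CE4_E5B9) | 1;
        h1.wrapping_add(i.wrapping_mul(h2))
    }

    fn insert(&mut self, key: u64) {
        let n = self.bits.len() as u64;
        for i in 0..self.num_hashes {
            let idx = (Self::probe(key, i) % n) as usize;
            self.bits[idx] = true;
        }
    }

    fn might_contain(&self, key: u64) -> bool {
        let n = self.bits.len() as u64;
        (0..self.num_hashes).all(|i| self.bits[(Self::probe(key, i) % n) as usize])
    }

    /// Repeatedly fold the bit array in half while its length is even and
    /// the folded array would have at most `max_fill` of its bits set.
    /// Folding is safe because (h % 2m) % m == h % m.
    fn shrink(&mut self, max_fill: f64) {
        while self.bits.len() >= 2 && self.bits.len() % 2 == 0 {
            let half = self.bits.len() / 2;
            let folded: Vec<bool> = (0..half)
                .map(|i| self.bits[i] | self.bits[i + half])
                .collect();
            let set = folded.iter().filter(|b| **b).count();
            if set as f64 > max_fill * half as f64 {
                break; // folding further would make the filter too dense
            }
            self.bits = folded;
        }
    }
}
```

A sparsely populated filter built with a generous initial size can be shrunk this way before it is shipped to other tasks; every key inserted before the fold is still reported as possibly present afterwards.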
Bug fixes
- Fix file-not-found error for paths with URL-encoded characters.
- Fix HashAggregate convert job throwing ScalaReflectionException.
- Fix pruning error while reading Parquet files with multiple row groups.
- Fix incorrect number of tasks due to missing shuffleOrigin.
- Fix record batch creation error when hash joining with empty input.
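The issue does not describe the root cause of the URL-encoded path bug. As a hedged illustration of how this class of error usually arises: if a directory name on disk literally contains an escape such as `%3A`, then percent-decoding the path before opening it yields a different name that does not exist. The `percent_decode` helper below is a hypothetical sketch written for this example, not code from the project.

```rust
/// Convert one ASCII hex digit to its value, if it is a hex digit.
fn hex_val(b: u8) -> Option<u8> {
    (b as char).to_digit(16).map(|d| d as u8)
}

/// Decode `%XX` escapes in `input`; malformed escapes are kept verbatim.
/// Hypothetical helper for illustration only.
fn percent_decode(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        // A valid escape needs '%' followed by two hex digits.
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_val(bytes[i + 1]), hex_val(bytes[i + 2])) {
                out.push(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    String::from_utf8_lossy(&out).into_owned()
}
```

Under this assumption, a partition directory stored on disk as `dt=00%3A00` decodes to `dt=00:00`, so a reader that decodes the path before the lookup looks for the wrong name. The path has to be decoded exactly as many times as it was encoded.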
Other
- Upgrade DataFusion/Arrow dependencies to v42/v53.
- Replace gxhash with foldhash for better compatibility on some hardware.
- Other minor improvements and fixes.
PRs
- AggExec: implement columnar accumulator states. by @richox in #646
- Bump bigdecimal from 0.4.5 to 0.4.6 by @dependabot in #638
- Bump bytes from 1.7.2 to 1.8.0 by @dependabot in #625
- Bump bytes from 1.8.0 to 1.9.0 by @dependabot in #671
- Bump object_store from 0.11.0 to 0.11.1 by @dependabot in #622
- Bump sonic-rs from 0.3.13 to 0.3.14 by @dependabot in #623
- Bump sonic-rs from 0.3.14 to 0.3.16 by @dependabot in #647
- Bump tempfile from 3.13.0 to 3.14.0 by @dependabot in #641
- Bump tokio from 1.40.0 to 1.41.0 by @dependabot in #629
- Bump tokio from 1.41.0 to 1.41.1 by @dependabot in #642
- Bump tokio from 1.41.0 to 1.41.1 by @dependabot in #676
- Bump uuid from 1.10.0 to 1.11.0 by @dependabot in #618
- Create RecordBatch with num_rows option to avoid bhj error caused by empty output_schema by @wForget in #683
- Fix build on windows by @wForget in #666
- Fix file not found for path with url encoded character by @wForget in #679
- Followup to #674, add -r for rm by @wForget in #681
- Introduce base blaze sql test suite by @wForget in #674
- [BLAZE-287][FOLLOWUP] Use JavaUtils#newConcurrentHashMap to speed up ConcurrentHashMap#computeIfAbsent by @SteNicholas in #615
- [BLAZE-573][FOLLOWUP] Bump Spark from 3.4.3 to 3.4.4 by @SteNicholas in #640
- [BLAZE-627] Make ORC and Parquet format detection more generic by @dixingxing0 in #628
- [BLAZE-664] Bump Celeborn version from 0.5.1 to 0.5.2 by @SteNicholas in #665
- [MINOR] Avoid NPE when native lib is not found by @wForget in #668
- add new blaze logo by @richox in #633
- chore: Make spotless plugin happy by @zuston in #653
- code refactoring by @richox in #658
- code refactoring by @richox in #677
- doc: update tpc-h benchmark result by @richox in #614
- fix Hashaggregate convert job throw ScalaReflectionException by @leizhang5s in #637
- fix pruning error while reading parquet files with multiple row groups by @richox in #616
- fix running error for Spark 3.2.0 and 3.2.1 by @XorSum in #602
- fix(shuffle): Progagate shuffle origin to native exchange exec to make AQE rebalance valid by @zuston in #663
- fix(spill): Delete spill file when dropping for rust FileSpill by @zuston in #660
- fix(spill): Explicitly delete spill file for FileBasedSpillBuf after release by @zuston in #654
- improve NativeOrcScan by @richox in #631
- improve memory management by @richox in #621
- improvement: Add numOfPartitions metrics for exchange exec to align with vanilla spark by @zuston in #669
- optimize bloom filter by @richox in #620
- parquet reading improvements by @richox in #650
- release version v4.0.0 by @richox in #613
- replace gxhash with foldhash by @richox in #624
- supports specialized count(0) by @richox in #619
- tpcd benchmarkrunner : add orc format support by @leizhang5s in #639
- update to datafusion-v42 by @richox in #574
- use custom implemented hashmap for aggregation by @richox in #617