forked from apache/datafusion-comet
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
chore: Prepare 0.2.0 release (apache#866)
* update version in Dockerfile * use offical DataFusion release * Generate 0.2.0 changelog * regenreate after squashing commits * enable spark.comet.exec.enabled by default * update benchmarking configs * add new benchmark results * delete old benchmark results * add chart * Update docs/source/user-guide/configs.md Co-authored-by: Oleks V <comphead@users.noreply.github.com> * Update common/src/main/scala/org/apache/comet/CometConf.scala Co-authored-by: Oleks V <comphead@users.noreply.github.com> * tpc-ds results * fix links --------- Co-authored-by: Oleks V <comphead@users.noreply.github.com>
- Loading branch information
Showing
26 changed files
with
1,698 additions
and
328 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,146 @@ | ||
<!-- | ||
Licensed to the Apache Software Foundation (ASF) under one | ||
or more contributor license agreements. See the NOTICE file | ||
distributed with this work for additional information | ||
regarding copyright ownership. The ASF licenses this file | ||
to you under the Apache License, Version 2.0 (the | ||
"License"); you may not use this file except in compliance | ||
with the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, | ||
software distributed under the License is distributed on an | ||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations | ||
under the License. | ||
--> | ||
|
||
# DataFusion Comet 0.2.0 Changelog | ||
|
||
This release consists of 87 commits from 14 contributors. See credits at the end of this changelog for more information. | ||
|
||
**Fixed bugs:** | ||
|
||
- fix: dictionary decimal vector optimization [#705](https://github.com/apache/datafusion-comet/pull/705) (kazuyukitanimura) | ||
- fix: Unsupported window expression should fall back to Spark [#710](https://github.com/apache/datafusion-comet/pull/710) (viirya) | ||
- fix: ReusedExchangeExec can be child operator of CometBroadcastExchangeExec [#713](https://github.com/apache/datafusion-comet/pull/713) (viirya) | ||
- fix: Fallback to Spark for window expression with range frame [#719](https://github.com/apache/datafusion-comet/pull/719) (viirya) | ||
- fix: Remove `skip.surefire.tests` mvn property [#739](https://github.com/apache/datafusion-comet/pull/739) (wForget) | ||
- fix: subquery execution under CometTakeOrderedAndProjectExec should not fail [#748](https://github.com/apache/datafusion-comet/pull/748) (viirya) | ||
- fix: skip negative scale checks for creating decimals [#723](https://github.com/apache/datafusion-comet/pull/723) (kazuyukitanimura) | ||
- fix: Fallback to Spark for unsupported partitioning [#759](https://github.com/apache/datafusion-comet/pull/759) (viirya) | ||
- fix: Unsupported types for SinglePartition should fallback to Spark [#765](https://github.com/apache/datafusion-comet/pull/765) (viirya) | ||
- fix: unwrap dictionaries in CreateNamedStruct [#754](https://github.com/apache/datafusion-comet/pull/754) (andygrove) | ||
- fix: Fallback to Spark for unsupported input besides ordering [#768](https://github.com/apache/datafusion-comet/pull/768) (viirya) | ||
- fix: Native window operator should be CometUnaryExec [#774](https://github.com/apache/datafusion-comet/pull/774) (viirya) | ||
- fix: Fallback to Spark when shuffling on struct with duplicate field name [#776](https://github.com/apache/datafusion-comet/pull/776) (viirya) | ||
- fix: withInfo was overwriting information in some cases [#780](https://github.com/apache/datafusion-comet/pull/780) (andygrove) | ||
- fix: Improve support for nested structs [#800](https://github.com/apache/datafusion-comet/pull/800) (eejbyfeldt) | ||
- fix: Sort on single struct should fallback to Spark [#811](https://github.com/apache/datafusion-comet/pull/811) (viirya) | ||
- fix: Check sort order of SortExec instead of child output [#821](https://github.com/apache/datafusion-comet/pull/821) (viirya) | ||
- fix: Fix panic in `avg` aggregate and disable `stddev` by default [#819](https://github.com/apache/datafusion-comet/pull/819) (andygrove) | ||
- fix: Supported nested types in HashJoin [#735](https://github.com/apache/datafusion-comet/pull/735) (eejbyfeldt) | ||
|
||
**Performance related:** | ||
|
||
- perf: Improve performance of CASE .. WHEN expressions [#703](https://github.com/apache/datafusion-comet/pull/703) (andygrove) | ||
- perf: Optimize IfExpr by delegating to CaseExpr [#681](https://github.com/apache/datafusion-comet/pull/681) (andygrove) | ||
- fix: optimize isNullAt [#732](https://github.com/apache/datafusion-comet/pull/732) (kazuyukitanimura) | ||
- perf: decimal decode improvements [#727](https://github.com/apache/datafusion-comet/pull/727) (parthchandra) | ||
- fix: Remove castting on decimals with a small precision to decimal256 [#741](https://github.com/apache/datafusion-comet/pull/741) (kazuyukitanimura) | ||
- fix: optimize some bit functions [#718](https://github.com/apache/datafusion-comet/pull/718) (kazuyukitanimura) | ||
- fix: Optimize getDecimal for small precision [#758](https://github.com/apache/datafusion-comet/pull/758) (kazuyukitanimura) | ||
- perf: add metrics to CopyExec and ScanExec [#778](https://github.com/apache/datafusion-comet/pull/778) (andygrove) | ||
- fix: Optimize decimal creation macros [#764](https://github.com/apache/datafusion-comet/pull/764) (kazuyukitanimura) | ||
- perf: Improve count aggregate performance [#784](https://github.com/apache/datafusion-comet/pull/784) (andygrove) | ||
- fix: Optimize read_side_padding [#772](https://github.com/apache/datafusion-comet/pull/772) (kazuyukitanimura) | ||
- perf: Remove some redundant copying of batches [#816](https://github.com/apache/datafusion-comet/pull/816) (andygrove) | ||
- perf: Remove redundant copying of batches after FilterExec [#835](https://github.com/apache/datafusion-comet/pull/835) (andygrove) | ||
- fix: Optimize CheckOverflow [#852](https://github.com/apache/datafusion-comet/pull/852) (kazuyukitanimura) | ||
- perf: Add benchmarks for Spark Scan + Comet Exec [#863](https://github.com/apache/datafusion-comet/pull/863) (andygrove) | ||
|
||
**Implemented enhancements:** | ||
|
||
- feat: Add support for time-zone, 3 & 5 digit years: Cast from string to timestamp. [#704](https://github.com/apache/datafusion-comet/pull/704) (akhilss99) | ||
- feat: Support count AggregateUDF for window function [#736](https://github.com/apache/datafusion-comet/pull/736) (huaxingao) | ||
- feat: Implement basic version of RLIKE [#734](https://github.com/apache/datafusion-comet/pull/734) (andygrove) | ||
- feat: show executed native plan with metrics when in debug mode [#746](https://github.com/apache/datafusion-comet/pull/746) (andygrove) | ||
- feat: Add GetStructField expression [#731](https://github.com/apache/datafusion-comet/pull/731) (Kimahriman) | ||
- feat: Add config to enable native upper and lower string conversion [#767](https://github.com/apache/datafusion-comet/pull/767) (andygrove) | ||
- feat: Improve native explain [#795](https://github.com/apache/datafusion-comet/pull/795) (andygrove) | ||
- feat: Add support for null literal with struct type [#797](https://github.com/apache/datafusion-comet/pull/797) (eejbyfeldt) | ||
- feat: Optimze CreateNamedStruct preserve dictionaries [#789](https://github.com/apache/datafusion-comet/pull/789) (eejbyfeldt) | ||
- feat: `CreateArray` support [#793](https://github.com/apache/datafusion-comet/pull/793) (Kimahriman) | ||
- feat: Add native thread configs [#828](https://github.com/apache/datafusion-comet/pull/828) (viirya) | ||
- feat: Add specific configs for converting Spark Parquet and JSON data to Arrow [#832](https://github.com/apache/datafusion-comet/pull/832) (andygrove) | ||
- feat: Support sum in window function [#802](https://github.com/apache/datafusion-comet/pull/802) (huaxingao) | ||
- feat: Simplify configs for enabling/disabling operators [#855](https://github.com/apache/datafusion-comet/pull/855) (andygrove) | ||
- feat: Enable `clippy::clone_on_ref_ptr` on `proto` and `spark_exprs` crates [#859](https://github.com/apache/datafusion-comet/pull/859) (comphead) | ||
- feat: Enable `clippy::clone_on_ref_ptr` on `core` crate [#860](https://github.com/apache/datafusion-comet/pull/860) (comphead) | ||
- feat: Use CometPlugin as main entrypoint [#853](https://github.com/apache/datafusion-comet/pull/853) (andygrove) | ||
|
||
**Documentation updates:** | ||
|
||
- doc: Update outdated spark.comet.columnar.shuffle.enabled configuration doc [#738](https://github.com/apache/datafusion-comet/pull/738) (wForget) | ||
- docs: Add explicit configs for enabling operators [#801](https://github.com/apache/datafusion-comet/pull/801) (andygrove) | ||
- doc: Document CometPlugin to start Comet in cluster mode [#836](https://github.com/apache/datafusion-comet/pull/836) (comphead) | ||
|
||
**Other:** | ||
|
||
- chore: Make rust clippy happy [#701](https://github.com/apache/datafusion-comet/pull/701) (Xuanwo) | ||
- chore: Update version to 0.2.0 and add 0.1.0 changelog [#696](https://github.com/apache/datafusion-comet/pull/696) (andygrove) | ||
- chore: Use rust-toolchain.toml for better toolchain support [#699](https://github.com/apache/datafusion-comet/pull/699) (Xuanwo) | ||
- chore(native): Make sure all targets in workspace been covered by clippy [#702](https://github.com/apache/datafusion-comet/pull/702) (Xuanwo) | ||
- Apache DataFusion Comet Logo [#697](https://github.com/apache/datafusion-comet/pull/697) (aocsa) | ||
- chore: Add logo to rat exclude list [#709](https://github.com/apache/datafusion-comet/pull/709) (andygrove) | ||
- chore: Use new logo in README and website [#724](https://github.com/apache/datafusion-comet/pull/724) (andygrove) | ||
- build: Add Comet logo files into exclude list [#726](https://github.com/apache/datafusion-comet/pull/726) (viirya) | ||
- chore: Remove TPC-DS benchmark results [#728](https://github.com/apache/datafusion-comet/pull/728) (andygrove) | ||
- chore: make Cast's logic reusable for other projects [#716](https://github.com/apache/datafusion-comet/pull/716) (Blizzara) | ||
- chore: move scalar_funcs into spark-expr [#712](https://github.com/apache/datafusion-comet/pull/712) (Blizzara) | ||
- chore: Bump DataFusion to rev 35c2e7e [#740](https://github.com/apache/datafusion-comet/pull/740) (andygrove) | ||
- chore: add more aggregate functions to benchmark test [#706](https://github.com/apache/datafusion-comet/pull/706) (huaxingao) | ||
- chore: Add criterion benchmark for decimal_div [#743](https://github.com/apache/datafusion-comet/pull/743) (andygrove) | ||
- build: Re-enable TPCDS q72 for broadcast and hash join configs [#781](https://github.com/apache/datafusion-comet/pull/781) (viirya) | ||
- chore: bump DataFusion to rev f4e519f [#783](https://github.com/apache/datafusion-comet/pull/783) (huaxingao) | ||
- chore: Upgrade to DataFusion rev bddb641 and disable "skip partial aggregates" feature [#788](https://github.com/apache/datafusion-comet/pull/788) (andygrove) | ||
- chore: Remove legacy code for adding a cast to a coalesce [#790](https://github.com/apache/datafusion-comet/pull/790) (andygrove) | ||
- chore: Use DataFusion 41.0.0-rc1 [#794](https://github.com/apache/datafusion-comet/pull/794) (andygrove) | ||
- chore: rename `CometRowToColumnar` and fix duplication bug [#785](https://github.com/apache/datafusion-comet/pull/785) (Kimahriman) | ||
- chore: Enable shuffle in micro benchmarks [#806](https://github.com/apache/datafusion-comet/pull/806) (andygrove) | ||
- Minor: ScanExec code cleanup and additional documentation [#804](https://github.com/apache/datafusion-comet/pull/804) (andygrove) | ||
- chore: Make it possible to run 'make benchmark-%' using jvm 17+ [#823](https://github.com/apache/datafusion-comet/pull/823) (eejbyfeldt) | ||
- chore: Add more unsupported cases to supportedSortType [#825](https://github.com/apache/datafusion-comet/pull/825) (viirya) | ||
- chore: Enable Comet shuffle with AQE coalesce partitions [#834](https://github.com/apache/datafusion-comet/pull/834) (viirya) | ||
- chore: Add GitHub workflow to publish Docker image [#847](https://github.com/apache/datafusion-comet/pull/847) (andygrove) | ||
- chore: Revert "fix: change the not exists base image apache/spark:3.4.3 to 3.4.2" [#854](https://github.com/apache/datafusion-comet/pull/854) (haoxins) | ||
- chore: fix docker-publish attempt 1 [#851](https://github.com/apache/datafusion-comet/pull/851) (andygrove) | ||
- minor: stop warning that AQEShuffleRead cannot run natively [#842](https://github.com/apache/datafusion-comet/pull/842) (andygrove) | ||
- chore: Improve ObjectHashAggregate fallback error message [#849](https://github.com/apache/datafusion-comet/pull/849) (andygrove) | ||
- chore: Fix docker image publishing (specify ghcr.io in tag) [#856](https://github.com/apache/datafusion-comet/pull/856) (andygrove) | ||
- chore: Use Git tag as Comet version when publishing Docker images [#857](https://github.com/apache/datafusion-comet/pull/857) (andygrove) | ||
|
||
## Credits | ||
|
||
Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor. | ||
|
||
``` | ||
36 Andy Grove | ||
16 Liang-Chi Hsieh | ||
9 KAZUYUKI TANIMURA | ||
5 Emil Ejbyfeldt | ||
4 Huaxin Gao | ||
3 Adam Binford | ||
3 Oleks V | ||
3 Xuanwo | ||
2 Arttu | ||
2 Zhen Wang | ||
1 Akhil S S | ||
1 Alexander Ocsa | ||
1 Parth Chandra | ||
1 Xin Hao | ||
``` | ||
|
||
Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Binary file removed
BIN
-25.8 KB
docs/source/_static/images/benchmark-results/2024-07-19/tpch_allqueries.png
Binary file not shown.
Binary file removed
BIN
-28.6 KB
docs/source/_static/images/benchmark-results/2024-07-19/tpch_queries_compare.png
Binary file not shown.
Binary file removed
BIN
-41.4 KB
docs/source/_static/images/benchmark-results/2024-07-19/tpch_queries_speedup.png
Binary file not shown.
Binary file added
BIN
+23.3 KB
docs/source/_static/images/benchmark-results/2024-08-23/tpcds_allqueries.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+37.4 KB
docs/source/_static/images/benchmark-results/2024-08-23/tpcds_queries_compare.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+61.5 KB
...ource/_static/images/benchmark-results/2024-08-23/tpcds_queries_speedup_abs.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+25.5 KB
docs/source/_static/images/benchmark-results/2024-08-23/tpch_allqueries.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+28.5 KB
docs/source/_static/images/benchmark-results/2024-08-23/tpch_queries_compare.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+42.2 KB
...source/_static/images/benchmark-results/2024-08-23/tpch_queries_speedup_rel.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
73 changes: 0 additions & 73 deletions
73
docs/source/contributor-guide/benchmark-results/2024-07-19/datafusion-tpch.json
This file was deleted.
Oops, something went wrong.
Oops, something went wrong.