Skip to content

Commit

Permalink
Merge remote-tracking branch 'public/changelog-0.3.0' into branch-0.3
Browse files Browse the repository at this point in the history
  • Loading branch information
andygrove committed Sep 24, 2024
2 parents 56764d2 + 5a84b35 commit 3783faa
Showing 1 changed file with 115 additions and 0 deletions.
115 changes: 115 additions & 0 deletions dev/changelog/0.3.0.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,115 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# DataFusion Comet 0.3.0 Changelog

This release consists of 57 commits from 12 contributors. See credits at the end of this changelog for more information.

**Fixed bugs:**

- fix: Support type coercion for ScalarUDFs [#865](https://github.com/apache/datafusion-comet/pull/865) (Kimahriman)
- fix: CometTakeOrderedAndProjectExec native scan node should use child operator's output [#896](https://github.com/apache/datafusion-comet/pull/896) (viirya)
- fix: Fix various memory leaks problems [#890](https://github.com/apache/datafusion-comet/pull/890) (Kontinuation)
- fix: Add output to Comet operators equal and hashCode [#902](https://github.com/apache/datafusion-comet/pull/902) (viirya)
- fix: Fallback to Spark when cannot resolve AttributeReference [#926](https://github.com/apache/datafusion-comet/pull/926) (viirya)
- fix: Fix memory bloat caused by holding too many unclosed `ArrowReaderIterator`s [#929](https://github.com/apache/datafusion-comet/pull/929) (Kontinuation)
- fix: Normalize NaN and zeros for floating number comparison [#953](https://github.com/apache/datafusion-comet/pull/953) (viirya)
- fix: window function range offset should be long instead of int [#733](https://github.com/apache/datafusion-comet/pull/733) (huaxingao)
- fix: CometScanExec on Spark 3.5.2 [#915](https://github.com/apache/datafusion-comet/pull/915) (Kimahriman)
- fix: div and rem by negative zero [#960](https://github.com/apache/datafusion-comet/pull/960) (kazuyukitanimura)

**Performance related:**

- perf: Optimize CometSparkToColumnar for columnar input [#892](https://github.com/apache/datafusion-comet/pull/892) (mbutrovich)
- perf: Fall back to Spark if query uses DPP with v1 data sources [#897](https://github.com/apache/datafusion-comet/pull/897) (andygrove)
- perf: Report accurate total time for scans [#916](https://github.com/apache/datafusion-comet/pull/916) (andygrove)
- perf: Add metric for time spent casting in native scan [#919](https://github.com/apache/datafusion-comet/pull/919) (andygrove)
- perf: Add criterion benchmark for aggregate expressions [#948](https://github.com/apache/datafusion-comet/pull/948) (andygrove)
- perf: Add metric for time spent in CometSparkToColumnarExec [#931](https://github.com/apache/datafusion-comet/pull/931) (mbutrovich)
- perf: Optimize decimal precision check in decimal aggregates (sum and avg) [#952](https://github.com/apache/datafusion-comet/pull/952) (andygrove)

**Implemented enhancements:**

- feat: Add config option to enable converting CSV to columnar [#871](https://github.com/apache/datafusion-comet/pull/871) (andygrove)
- feat: Implement basic version of string to float/double/decimal [#870](https://github.com/apache/datafusion-comet/pull/870) (andygrove)
- feat: Implement to_json for subset of types [#805](https://github.com/apache/datafusion-comet/pull/805) (andygrove)
- feat: Add ShuffleQueryStageExec to direct child node for CometBroadcastExchangeExec [#880](https://github.com/apache/datafusion-comet/pull/880) (viirya)
- feat: Support sort merge join with a join condition [#553](https://github.com/apache/datafusion-comet/pull/553) (viirya)
- feat: Array element extraction [#899](https://github.com/apache/datafusion-comet/pull/899) (Kimahriman)
- feat: date_add and date_sub functions [#910](https://github.com/apache/datafusion-comet/pull/910) (mbutrovich)
- feat: implement scripts for binary release build [#932](https://github.com/apache/datafusion-comet/pull/932) (parthchandra)
- feat: Publish artifacts to maven [#946](https://github.com/apache/datafusion-comet/pull/946) (parthchandra)

**Documentation updates:**

- doc: Documenting Helm chart for Comet Kube execution [#874](https://github.com/apache/datafusion-comet/pull/874) (comphead)
- doc: Update native code path in development [#921](https://github.com/apache/datafusion-comet/pull/921) (viirya)
- docs: Add more detailed architecture documentation [#922](https://github.com/apache/datafusion-comet/pull/922) (andygrove)

**Other:**

- chore: Update installation.md [#869](https://github.com/apache/datafusion-comet/pull/869) (haoxins)
- chore: Update versions to 0.3.0 / 0.3.0-SNAPSHOT [#868](https://github.com/apache/datafusion-comet/pull/868) (andygrove)
- chore: Add documentation on running benchmarks with Microk8s [#848](https://github.com/apache/datafusion-comet/pull/848) (andygrove)
- chore: Improve CometExchange metrics [#873](https://github.com/apache/datafusion-comet/pull/873) (viirya)
- chore: Add spilling metrics of SortMergeJoin [#878](https://github.com/apache/datafusion-comet/pull/878) (viirya)
- chore: change shuffle mode default from jvm to auto [#877](https://github.com/apache/datafusion-comet/pull/877) (andygrove)
- chore: Enable shuffle by default [#881](https://github.com/apache/datafusion-comet/pull/881) (andygrove)
- chore: print Comet native version to logs after Comet is initialized [#900](https://github.com/apache/datafusion-comet/pull/900) (SemyonSinchenko)
- chore: Revise batch pull approach to more follow C Data interface semantics [#893](https://github.com/apache/datafusion-comet/pull/893) (viirya)
- chore: Close dictionary provider when iterator is closed [#904](https://github.com/apache/datafusion-comet/pull/904) (andygrove)
- chore: Remove unused function [#906](https://github.com/apache/datafusion-comet/pull/906) (viirya)
- chore: Upgrade to latest DataFusion revision [#909](https://github.com/apache/datafusion-comet/pull/909) (andygrove)
- build: fix build [#917](https://github.com/apache/datafusion-comet/pull/917) (andygrove)
- chore: Revise array import to more follow C Data Interface semantics [#905](https://github.com/apache/datafusion-comet/pull/905) (viirya)
- chore: Address reviews [#920](https://github.com/apache/datafusion-comet/pull/920) (viirya)
- chore: Enable Comet shuffle for Spark core-1 test [#924](https://github.com/apache/datafusion-comet/pull/924) (viirya)
- build: Add maven-compiler-plugin for java cross-build [#911](https://github.com/apache/datafusion-comet/pull/911) (viirya)
- build: Disable upload-test-reports for macos-13 runner [#933](https://github.com/apache/datafusion-comet/pull/933) (viirya)
- minor: cast timestamp test #468 [#923](https://github.com/apache/datafusion-comet/pull/923) (himadripal)
- build: Set Java version arg for scala-maven-plugin [#934](https://github.com/apache/datafusion-comet/pull/934) (viirya)
- chore: Remove redundant RowToColumnar from CometShuffleExchangeExec for columnar shuffle [#944](https://github.com/apache/datafusion-comet/pull/944) (viirya)
- minor: rename CometMetricNode `add` to `set` and update documentation [#940](https://github.com/apache/datafusion-comet/pull/940) (andygrove)
- chore: Add config for enabling SMJ with join condition [#937](https://github.com/apache/datafusion-comet/pull/937) (andygrove)
- chore: Change maven group ID to `org.apache.datafusion` [#941](https://github.com/apache/datafusion-comet/pull/941) (andygrove)
- chore: Upgrade to DataFusion 42.0.0 [#945](https://github.com/apache/datafusion-comet/pull/945) (andygrove)
- build: Fix regression in jar packaging [#950](https://github.com/apache/datafusion-comet/pull/950) (andygrove)
- chore: Show reason for falling back to Spark when SMJ with join condition is not enabled [#956](https://github.com/apache/datafusion-comet/pull/956) (andygrove)
- chore: clarify tarball installation [#959](https://github.com/apache/datafusion-comet/pull/959) (comphead)

## Credits

Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.

```
22 Andy Grove
18 Liang-Chi Hsieh
3 Adam Binford
3 Matt Butrovich
2 Kristin Cowalcijk
2 Oleks V
2 Parth Chandra
1 Himadri Pal
1 Huaxin Gao
1 KAZUYUKI TANIMURA
1 Semyon
1 Xin Hao
```

Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.

0 comments on commit 3783faa

Please sign in to comment.