Add datafusion-substrait crate #4543

Merged
merged 38 commits into from
Jan 12, 2023
Commits
b717d4f
Initial commit
andygrove Mar 5, 2022
661af4b
initial commit
andygrove Mar 5, 2022
d91fa00
failing test
andygrove Mar 5, 2022
b17155a
table scan projection
andygrove Mar 5, 2022
0f1b7e3
closer
andygrove Mar 5, 2022
8dbc69c
test passes, with some hacks
andygrove Mar 5, 2022
76f95b1
Merge pull request #1 from andygrove/roundtrip
andygrove Mar 5, 2022
d2b18c8
use DataFrame (#2)
andygrove Mar 5, 2022
09a0db6
update README
andygrove Mar 5, 2022
16ccb02
update dependency
andygrove Mar 6, 2022
9aac089
code cleanup (#3)
andygrove Mar 6, 2022
a865900
Add support for Filter operator and BinaryOp expressions (#4)
andygrove Mar 7, 2022
db8d439
GitHub action (#5)
andygrove Mar 7, 2022
61e4cc3
Split code into producer and consumer modules (#6)
andygrove Mar 9, 2022
03745da
Support more functions and scalar types (#7)
Dandandan Mar 27, 2022
a75476a
Use substrait 0.1 and datafusion 8.0 (#8)
andygrove Jun 13, 2022
2644b81
update datafusion to 10.0 and substrait to 0.2 (#11)
JanKaul Aug 5, 2022
3349464
Add basic join support (#12)
andygrove Aug 15, 2022
9e28e1a
Added fetch support (#23)
nseekhao Oct 13, 2022
161b774
Upgrade to DataFusion 13.0.0 (#25)
andygrove Oct 14, 2022
c8c8732
Add sort consumer and producer (#24)
nseekhao Oct 14, 2022
e1b9569
Add serializer/deserializer (#26)
nseekhao Oct 21, 2022
09b2102
Add plan and function extension support (#27)
nseekhao Oct 24, 2022
f41f6dc
Implement GROUP BY (#28)
nseekhao Nov 14, 2022
bfa3c0c
Changed field reference from mask to direct reference (#29)
nseekhao Nov 17, 2022
3f892d4
Handle SubqueryAlias (#30)
nseekhao Nov 17, 2022
dbd315c
Add support for SELECT DISTINCT (#31)
nseekhao Nov 28, 2022
637a2f3
Implement BETWEEN (#32)
nseekhao Nov 30, 2022
2b87bb9
Add case (#33)
nseekhao Dec 7, 2022
e7fa58e
feat: support explicit catalog/schema names in ReadRel (#34)
waynexia Dec 7, 2022
aa9cfb1
move files to subfolder
andygrove Dec 7, 2022
27a9dcc
fix merge conflict
andygrove Dec 7, 2022
76c2e19
RAT
andygrove Dec 7, 2022
60466ce
remove rust.yaml
andygrove Dec 7, 2022
24cfd8f
revert .gitignore changes
andygrove Dec 7, 2022
474e2d5
tomlfmt
andygrove Dec 7, 2022
4ba586d
tomlfmt
andygrove Dec 7, 2022
2d99a0a
Merge remote-tracking branch 'apache/master' into substrait
andygrove Jan 11, 2023
32 changes: 32 additions & 0 deletions datafusion/substrait/Cargo.toml
@@ -0,0 +1,32 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

[package]
name = "datafusion-substrait"
version = "0.1.0"
edition = "2021"

[dependencies]
async-recursion = "1.0"
datafusion = "13.0"
prost = "0.9"
prost-types = "0.9"
substrait = "0.2"
tokio = "1.17"

[build-dependencies]
prost-build = { version = "0.9" }
34 changes: 34 additions & 0 deletions datafusion/substrait/README.md
@@ -0,0 +1,34 @@
<!---
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# DataFusion + Substrait

[Substrait](https://substrait.io/) provides a cross-language serialization format for relational algebra, based on
protocol buffers.

This repository provides a Substrait producer and consumer for DataFusion:

- The producer converts a DataFusion logical plan into a Substrait protobuf.
- The consumer converts a Substrait protobuf into a DataFusion logical plan.
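
The producer/consumer split above can be pictured with a toy round trip. This is only a minimal sketch under stated assumptions: the `LogicalPlan` and `Rel` enums and the `to_rel` / `from_rel` functions below are hypothetical stand-ins invented for illustration, not the types or functions of `datafusion`, `substrait`, or this crate. It shows the shape of the idea: one function maps an engine-specific plan to an engine-neutral form, and another maps it back.

```rust
// Hypothetical stand-ins for illustration only. These are NOT the
// datafusion or substrait types, and to_rel / from_rel are NOT this crate's API.

/// Engine-side plan (stand-in for a DataFusion logical plan).
#[derive(Debug, Clone, PartialEq)]
enum LogicalPlan {
    TableScan { table: String, projection: Vec<usize> },
    Filter { predicate: String, input: Box<LogicalPlan> },
}

/// Engine-neutral relation (stand-in for a Substrait message).
#[derive(Debug, Clone, PartialEq)]
enum Rel {
    Read { named_table: String, columns: Vec<usize> },
    Filter { condition: String, input: Box<Rel> },
}

/// "Producer" direction: engine plan -> neutral relation.
fn to_rel(plan: &LogicalPlan) -> Rel {
    match plan {
        LogicalPlan::TableScan { table, projection } => Rel::Read {
            named_table: table.clone(),
            columns: projection.clone(),
        },
        LogicalPlan::Filter { predicate, input } => Rel::Filter {
            condition: predicate.clone(),
            input: Box::new(to_rel(input)),
        },
    }
}

/// "Consumer" direction: neutral relation -> engine plan.
fn from_rel(rel: &Rel) -> LogicalPlan {
    match rel {
        Rel::Read { named_table, columns } => LogicalPlan::TableScan {
            table: named_table.clone(),
            projection: columns.clone(),
        },
        Rel::Filter { condition, input } => LogicalPlan::Filter {
            predicate: condition.clone(),
            input: Box::new(from_rel(input)),
        },
    }
}

fn main() {
    // A filter over a projected table scan.
    let plan = LogicalPlan::Filter {
        predicate: "a > 1".to_string(),
        input: Box::new(LogicalPlan::TableScan {
            table: "data".to_string(),
            projection: vec![0, 1],
        }),
    };

    // Round trip: produce the neutral form, then consume it again.
    let rel = to_rel(&plan);
    let roundtrip = from_rel(&rel);
    assert_eq!(plan, roundtrip);
    println!("{:?}", rel);
}
```

A round-trip check of this kind (plan to neutral form and back, then compare) is a convenient way to test that both directions agree on the operators they support.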

Potential uses of this crate:

- Replace the current [DataFusion protobuf definition](https://github.com/apache/arrow-datafusion/blob/master/datafusion-proto/proto/datafusion.proto) used in Ballista for passing query plan fragments to executors
- Make it easier to pass query plans over FFI boundaries, such as from Python to Rust
- Allow Apache Calcite query plans to be executed in DataFusion