-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Example: FFI Table Provider as dynamic module loading #13183
Conversation
I'll rebase as soon as #12920 merges into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pardon for the too early review. I got curious and thought some comments would be more helpful than distracting. It's great!
datafusion/ffi/README.md
Outdated
# `datafusion-ffi`: Apache DataFusion Foreign Function Interface | ||
|
||
This crate contains code to allow interoperability of Apache [DataFusion] | ||
with functions from other languages using a stable interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we already define stable
more precisely? Given frequent major releases of datafusion
, a user may assume they can't load an FFI implementation compiled with 42
into a binary of version 43
.
OTOH, are there maybe subtle limitations regarding tokio runtime or something alike? (didn't research this topic TBH)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can run different versions! I've tested this in datafusion-python
by backporting these changes into DF41 and I ran datafusion-python
with DF42 and delta-rs
with DF41. I'll work on the text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small explanation on where and how FFI table scans are scheduled in terms of rt threads would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure what you're looking for here or who the audience would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As someone loading plugins, I want to know if plugins would generally block tokio threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some text, but it's very brief. Basically, async calls through this interface operate exactly like their pure rust counterparts provided the library on the other side implements their code properly.
67fc22b
to
19b4a64
Compare
❤️ |
Thank you @ahirner for the review. I think this is ready for a full review now. If you have any outstanding questions or I didn't write it up to your satisfaction, please let me know! |
@ahirner would you mind re-reviewing? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All points adressed. I also built and run the example on darwin x86.
Excellent quality IMO, which will help adoptiong this solution.
datafusion-examples/examples/ffi/ffi_example_table_provider/Cargo.toml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @timsaucer
I also tried out the instructions in the README and it seemed to work great
(venv) andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion/datafusion-examples/examples/ffi/ffi_module_loader$ cargo run
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.27s
Running `/Users/andrewlamb/Software/datafusion/target/debug/ffi_module_loader`
+----+------+
| a | b |
+----+------+
| 1 | 1.0 |
| 2 | 2.0 |
| 3 | 3.0 |
| 4 | 4.0 |
| 5 | 5.0 |
| 6 | 6.0 |
| 7 | 7.0 |
| 8 | 8.0 |
| 9 | 9.0 |
| 10 | 10.0 |
| 11 | 11.0 |
+----+------+
functions from other libraries and/or [DataFusion] versions using a stable | ||
interface. | ||
|
||
One of the limitations of the Rust programming language is that there is no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
#[repr(C)] | ||
#[derive(StableAbi)] | ||
#[sabi(kind(Prefix(prefix_ref = TableProviderModuleRef)))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand the need for this interface module.
Couldn't the caller program simple use construct_simple_table_provider
directly? (It would have to know the name of the entry point of course which is less general than what you have here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't actually need it, but it does give a separation of concerns where you have one crate for the interface so that any module implementers don't need to executable as a dependency. I'm mostly following the same approach as the examples in the abi_stable
crate but it's also a model I've used in other places.
…rgo.toml Co-authored-by: Alexander Hirner <6055037+ahirner@users.noreply.github.com>
Which issue does this PR close?
Closes #13175
Rationale for this change
PR #12920 introduces FFI bindings for TableProvider but did not show an end to end example of using these. This PR improves the documentation for how to use the
datafusion-ffi
crate and a complete example of building one library that exposes a table provider and loading it from a different binary.What changes are included in this PR?
Adds three crates to show interface, implementation, and using a library table provider.
Are these changes tested?
These are an example, not any change to core functionality.
Are there any user-facing changes?
None.