Docs: Add Daft into Iceberg documentation #9836
Conversation
jaychia
commented
Feb 29, 2024
- Adds installation examples
- Adds code examples for getting up and running with Daft + PyIceberg
- Adds a type conversion matrix between Daft and PyIceberg
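To make the "getting up and running" examples concrete, here is a minimal sketch of the Daft + PyIceberg flow the PR documents. It assumes `pip install getdaft pyiceberg`, and the catalog name `"default"` and table name `"demo.events"` are hypothetical placeholders, not names from the PR:

```python
# Hedged sketch: reading an Iceberg table into Daft via PyIceberg.
# Assumes `pip install getdaft pyiceberg`; the catalog and table names
# used by the caller are hypothetical examples.

def read_iceberg_into_daft(catalog_name: str, table_name: str):
    """Load an Iceberg table through PyIceberg and hand it to Daft."""
    import daft                                 # requires getdaft
    from pyiceberg.catalog import load_catalog  # requires pyiceberg

    catalog = load_catalog(catalog_name)        # resolves catalog config (e.g. ~/.pyiceberg.yaml)
    table = catalog.load_table(table_name)      # PyIceberg Table object
    return daft.read_iceberg(table)             # lazy Daft DataFrame over the table
```

Calling `read_iceberg_into_daft("default", "demo.events")` against a configured catalog would return a lazy Daft DataFrame; `.show()` or `.collect()` materializes it.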
Sorry, this is a lot, but I had a lot of thoughts. Got stuck.
Co-authored-by: Brian "bits" Olsen <bits@bitsondata.dev>
Given that this mainly targets pyiceberg, I think this doc should live in https://github.com/apache/iceberg-python/
Hello! Daft is a fully featured distributed query engine, and we are actively working on functionality that is not PyIceberg-specific and is more applicable to the wider Iceberg ecosystem (e.g. partitioned writes, compaction stored procedures, orphan file pruning procedures, etc.). This is in contrast to PyIceberg-only integrations such as Pandas/Arrow, which really just use PyIceberg for retrieving data into Python memory.
@nastra, I tend to agree with @jaychia on this one. I don't want to split up the documentation any more than necessary. Any compute engine that runs on Iceberg, I want to document here. I see this eventually looking like Trino's data sources or Kafka Connectors. We've discussed this a bit before here: #9681. I think this future reordering will include engines based in languages outside of Java.
# Daft
so it seems the site can't actually be built when serving the docs locally
I was using https://github.com/apache/iceberg/blob/aff5b39a7dddd22790b6ba47f514860c53e33c00/site/README.md to locally serve the site. @bitsondatadev, can you please double-check whether the site renders properly for you when running ./dev/serve.sh?
Yeah, I used to have a "nightly" build in there, but we took it out initially to avoid confusion. I think part of the build can just be to add a "local" version or something. Currently, the build just grabs the latest semantic version and points latest there; we could do the same and point /site/docs/docs/local >> /docs, and maybe expose another build option to enable that.
@bitsondatadev is this PR here ready to go in?
Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
@nastra Ship it!