-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds support to read and write to Databricks delta tables #630
Conversation
Pull requests from external contributors require approval from a |
/ok to test |
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestions are inline. Biggest issue is that there are no tests that exercise these new stages. To guarantee that they will continue to work, we need tests built that run these stages.
will take a look at the changes |
@pthalasta Pinging here to resurface. This would be a good stage to add once issues are addressed. |
@mdemoret-nv i can resolve the comments except the usage of JDK as it is a requirement and regardless of which source we install it from there will be increase in container size. As this stage reads data from pyspark via databricks-connect, i don't think it will work without JDK. There are early stage python libs to read from delta without any involvement of databricks/spark, if JDK alternative is a must, i can probably test with early phase python libs, please let me know. |
@cwharris @pthalasta @mdemoret-nv : what are next steps with this MR? Are any discussions happening offline to find a way to resolve the JDK issue and close/merge this MR? |
@gurpreets-nvidia we are indeed discussions options. It looks like adding the JRE, JDK, and Databricks Connect will introduce an additional ~300MB to the release container. We're currently in the process of determining what CVE's would be introduced. Once we understand the CVE implications we can determine whether including it is the right way to go or if we need to look for a workaround. |
@cwharris i see two options
Let me know how to proceed in closing on this MR |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
@cwharris i think we have all tests and checks passing. There was no change to documentation part, so not sure if that is needed? |
@dagardner-nv i see that the documentation is failing due to import of stages by sphinx. Is this expected? |
/ok to test |
/ok to test |
@mdemoret-nv @cwharris can we merge if there are no other changes required and all looks good? |
/merge |
Adds support for querying data from Databricks delta tables and write the results to delta tables. Resolves #611