[KYUUBI #6437] Fix Spark engine query result save to HDFS #6444
Conversation
@wForget I had some issues with local testing, and since the changes were minimal, I just submitted the PR. Please review. I'm building a dist to test in our internal cluster and will send results later.
Codecov Report
Attention: Patch coverage is

```
@@            Coverage Diff             @@
##             master    #6444    +/-  ##
============================================
+ Coverage     58.26%   58.30%   +0.03%
  Complexity       24       24
============================================
  Files           656      656
  Lines         40268    40264       -4
  Branches       5498     5498
============================================
+ Hits          23463    23476      +13
+ Misses        14287    14283       -4
+ Partials       2518     2505      -13
```

View full report in Codecov by Sentry.
LGTM.
Although this function currently works, it lacks test coverage, so previous changes may have introduced regressions unnoticed.
Tested with 1.9.1 + this PR; problem solved. Ready to merge.
Closes #6444 from camper42/save-to-hdfs.
Closes #6437

990f0a7 [camper42] [KYUUBI #6437] Fix Spark engine query result save to HDFS

Authored-by: camper42 <camper.xlii@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit 71649da)
Signed-off-by: Cheng Pan <chengpan@apache.org>
Thanks, merged to master/1.9
🔍 Description
Issue References 🔗
This pull request fixes #6437
Describe Your Solution 🔧
Use `org.apache.hadoop.fs.Path` instead of `java.nio.file.Paths` to avoid an unexpected change to the `OPERATION_RESULT_SAVE_TO_FILE_DIR` scheme.

Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Test Plan 🧪

Behavior Without This Pull Request ⚰️

Spark jobs failed to start with error `java.io.IOException: JuiceFS initialized failed for jfs:///` under the conf `kyuubi.operation.result.saveToFile.dir=jfs://datalake/tmp`. `hdfs://xxx:port/tmp` may encounter similar errors.

Behavior With This Pull Request 🎉

Users can use an HDFS dir as `kyuubi.operation.result.saveToFile.dir` without error.

Related Unit Tests
It seems no test suites were added in #5591 or #5986; I'll try to build a dist and test with our internal cluster.
Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**