-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-4236] Cleanup removed applications' files in shuffle service #3126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -95,6 +111,56 @@ public ManagedBuffer getBlockData(String appId, String execId, String blockId) { | |||
} | |||
|
|||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following 32 lines are the only material change in this entire PR. The rest is utility functions imported from core Utils, Java cruft around lambdas and classes, and updating/adding tests.
Test build #22972 has started for PR 3126 at commit
|
Test build #22973 has started for PR 3126 at commit
|
@andrewor14 @rxin PTAL. For the YARN auxiliary service, I think it is sufficient just to just call |
Test build #22972 has finished for PR 3126 at commit
|
Test PASSed. |
Test build #22973 has finished for PR 3126 at commit
|
Test PASSed. |
Hey was this already merged? I don't see it showing up on ASF. |
Whoops, accidentally deleted it locally. |
Test build #23015 has started for PR 3126 at commit
|
rpcHandler.clearRegisteredExecutors() | ||
// Invalidate the registered executors, disallowing access to their shuffle blocks (without | ||
// deleting the actual shuffle files, so we could access them without the shuffle service). | ||
rpcHandler.applicationRemoved(sc.conf.getAppId, false /* cleanupLocalDirs */) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you just do cleanupLocalDirs = false
? Or can we not because this is a Java method even though we're using it in Scala
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, not possible if the method is written in Java, unfortunately.
Test build #23015 has finished for PR 3126 at commit
|
Test PASSed. |
Comments addressed. |
Test build #23026 has started for PR 3126 at commit
|
Test build #23026 has finished for PR 3126 at commit
|
Test PASSed. |
LGTM. Hey by the way is there anything we need to change about the way Also I think this conflicts with your other patch which I just merged. |
This relies on a hook from whoever is hosting the shuffle service to invoke removeApplication() when the application is completed. Once invoked, we will clean up all the executors' shuffle directories we know about.
It was you -- JavaUtils changed. Updated. |
Test build #23032 has started for PR 3126 at commit
|
Weird, we both only added code to JavaUtils. I would think that it doesn't need to conflict there. |
Test build #23032 has finished for PR 3126 at commit
|
Test PASSed. |
Ok merging into master and 1.2 |
This relies on a hook from whoever is hosting the shuffle service to invoke removeApplication() when the application is completed. Once invoked, we will clean up all the executors' shuffle directories we know about. Author: Aaron Davidson <aaron@databricks.com> Closes #3126 from aarondav/cleanup and squashes the following commits: 33a64a9 [Aaron Davidson] Missing brace e6e428f [Aaron Davidson] Address comments 16a0d27 [Aaron Davidson] Cleanup e4df3e7 [Aaron Davidson] [SPARK-4236] Cleanup removed applications' files in shuffle service (cherry picked from commit 48a19a6) Signed-off-by: Andrew Or <andrew@databricks.com>
This relies on a hook from whoever is hosting the shuffle service to invoke removeApplication() when the application is completed. Once invoked, we will clean up all the executors' shuffle directories we know about.