[SPARK-6191] [EC2] Generalize ability to download libs #4919
Test build #28309 has started for PR 4919 at commit
Test build #28310 has started for PR 4919 at commit
Test build #28311 has started for PR 4919 at commit
cc @JoshRosen
Test build #28309 has finished for PR 4919 at commit
Test FAILed.
Test build #28310 has finished for PR 4919 at commit
Test FAILed.
Test build #28311 has finished for PR 4919 at commit
Test FAILed.
Test build #28322 has started for PR 4919 at commit
Test build #28322 has finished for PR 4919 at commit
Test PASSed.
Test build #28343 has started for PR 4919 at commit
Test build #28343 has finished for PR 4919 at commit
Test PASSed.
Obviously I'd like to get another active EC2 user to review this, but the principle looks fine. This refactors the boto-specific mechanism into a general one and, at the moment, does not change behavior.
Yeah, if @JoshRosen (who wrote the original
This seems fine to me. I guess the alternatives would be
I think that this is fine for now. As part of our binary release packaging scripts, we could download and include these archives so that only users who build from source will need to perform these downloads.
Right now we have a method that downloads boto specifically. This PR generalizes it so that additional libraries are easy to download if we want them.
For example, adding new external libraries for spark-ec2 is now as simple as:
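A minimal sketch of what a generalized mechanism of this kind could look like, assuming each library is described by a small dict (name, version, MD5 of its source tarball) and that a helper downloads, verifies, unpacks, and adds each one to `sys.path`. The names `EXTERNAL_LIBS`, `setup_external_libs`, `lib_dir`, `url_template`, and `fetch` are illustrative assumptions, not the code in this PR:

```python
import hashlib
import io
import os
import sys
import tarfile
from urllib.request import urlopen

# Hypothetical descriptor list: adding a library means adding one entry here.
EXTERNAL_LIBS = [
    # {"name": "<lib>", "version": "<version>", "md5": "<md5 of the tarball>"},
]


def _download(url):
    """Fetch a URL and return its body as bytes."""
    with urlopen(url) as response:
        return response.read()


def setup_external_libs(libs, lib_dir, url_template, fetch=_download):
    """Download, verify, unpack, and add each library to sys.path.

    Libraries already unpacked under lib_dir are not downloaded again.
    Returns the names of the libraries downloaded during this call.
    """
    os.makedirs(lib_dir, exist_ok=True)
    downloaded = []
    for lib in libs:
        versioned = "{name}-{version}".format(**lib)
        lib_path = os.path.join(lib_dir, versioned)
        if not os.path.isdir(lib_path):
            data = fetch(url_template.format(**lib))
            # Refuse to unpack anything whose checksum does not match.
            if hashlib.md5(data).hexdigest() != lib["md5"]:
                raise ValueError("MD5 mismatch for " + versioned)
            # Use the safer extraction filter where this Python provides it.
            extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
            with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
                tar.extractall(path=lib_dir, **extra)
            downloaded.append(lib["name"])
        if lib_path not in sys.path:
            sys.path.insert(1, lib_path)
    return downloaded
```

With a design like this, the first run downloads every listed library and later runs find the unpacked directories and skip the download, which matches the two kinds of output described below. Passing `fetch` as a parameter is only a convenience for testing without network access.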
Likely use cases:
First run output, with PyYAML and argparse added just for demonstration purposes:
Output thereafter: