[SPARK-6193] [EC2] Push group filter up to EC2 #4922
Conversation
Test build #28321 has started for PR 4922 at commit
cc @shivaram
Test build #28321 has finished for PR 4922 at commit
Test PASSed.
    reservations = conn.get_all_reservations(
        filters={"instance.group-name": group_names})
    instances = itertools.chain.from_iterable(r.instances for r in reservations)
    return [i for i in instances if i.state != "terminated"]
Minor comment: this seems a little different from the old check we had. We were originally checking whether the state was one of 'pending', 'running', 'stopping', or 'stopped', while right now we check whether it's not terminated. There are other states, as shown in http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-lifecycle.html, but I don't think it should matter.
The doc referenced in the comment notes these instance states:
0 : pending
16 : running
32 : shutting-down
48 : terminated
64 : stopping
80 : stopped
So it looks like the only state we're missing is shutting-down, which is the intermediate state right before terminated. I can add that in to be consistent with the previous behavior.
Alternately, we can leave it as-is and re-terminate instances even if they are shutting-down. You know, zombies and stuff. 👹 🔫
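Per the discussion above, restoring the previous behavior means filtering on an explicit whitelist of live states rather than excluding only terminated. A minimal sketch of that check (the four state names come from the old check quoted above; the helper name and the lightweight `Instance` stand-in are hypothetical, not part of the PR):

```python
from collections import namedtuple

# Lightweight stand-in for a boto instance; only the `state` field matters here.
Instance = namedtuple("Instance", ["state"])

# The four states the old check allowed, per the review comment above.
ACTIVE_STATES = {"pending", "running", "stopping", "stopped"}

def is_active(instance):
    """True only for states the original code considered live, so both
    'terminated' and 'shutting-down' are excluded."""
    return instance.state in ACTIVE_STATES
```

Note that under this check, instances in the intermediate shutting-down state are skipped rather than re-terminated.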
Yeah, it doesn't look like shutting-down is a major issue, but it might just be safer to keep the existing behavior.
OK, fixified.
This is a nice change @nchammas -- Code looks good to me, but I'd just like to try it out once on my machine.
Just tried this out and it works fine. LGTM pending the minor inline comment.
Test build #28353 has started for PR 4922 at commit
Test build #28353 has finished for PR 4922 at commit
Test PASSed.
Inline comment addressed.
Sounds like a good improvement, changes look OK to my mildly informed eyes, and you have both reviewed and tested the change. LGTM.
When looking for a cluster, spark-ec2 currently pulls down info for all instances and filters locally. When working on an AWS account with hundreds of active instances, this step alone can take over 10 seconds.
This PR improves how spark-ec2 searches for clusters by pushing the filter up to EC2.
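The flattening step from this PR's diff can be isolated as a small pure function. The server-side filtering itself is done by the boto call shown in the diff, `conn.get_all_reservations(filters={"instance.group-name": group_names})`; this helper just merges the returned reservations into one instance list (the function name and the `Reservation`/`Instance` stand-ins are hypothetical, used here so the logic runs without an AWS connection):

```python
import itertools
from collections import namedtuple

# Minimal stand-ins for the boto objects used in the diff.
Reservation = namedtuple("Reservation", ["instances"])
Instance = namedtuple("Instance", ["state"])

def flatten_reservations(reservations):
    """Flatten reservations into instances and drop terminated ones,
    mirroring the chain/list-comprehension in this PR's diff."""
    instances = itertools.chain.from_iterable(r.instances for r in reservations)
    return [i for i in instances if i.state != "terminated"]
```

Because the group-name filter runs on the EC2 side, only the matching reservations ever cross the wire, which is where the speedup on accounts with hundreds of instances comes from.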
Basically, the problem (and solution) look like this:
Translated to a user-visible action, this looks like (against an AWS account with ~200 active instances):
This PR also refactors get_existing_cluster() to make it, I hope, simpler. Finally, this PR fixes some minor grammar issues related to printing status to the user. 🎩 👏