[autoscaler] GCP node provider #2061

Merged
merged 73 commits into from
May 31, 2018
a896450
Google Cloud Platform scaffolding
hartikainen May 15, 2018
a3b44df
Add minimal gcp config example
hartikainen May 15, 2018
281c7b6
Add googleapiclient discoveries, update gcp.config constants
hartikainen May 15, 2018
8cd3287
Rename and update gcp.config key pair name function
hartikainen May 15, 2018
8615abb
Implement gcp.config._configure_project
hartikainen May 15, 2018
f0539d3
Fix the create project get project flow
hartikainen May 15, 2018
ba8cdbf
Implement gcp.config._configure_iam_role
hartikainen May 15, 2018
012c5d8
Implement service account iam binding
hartikainen May 15, 2018
f88449b
Implement gcp.config._configure_key_pair
hartikainen May 15, 2018
08f53a4
Implement rsa key pair generation
hartikainen May 15, 2018
d17e244
Implement gcp.config._configure_subnet
hartikainen May 15, 2018
3e67a60
Save work-in-progress gcp.config._configure_firewall_rules.
hartikainen May 15, 2018
d05df31
Remove unnecessary firewall configuration
hartikainen May 15, 2018
b499ea0
Update example-minimal.yaml configuration
hartikainen May 15, 2018
5deb6ba
Add new wait_for_compute_operation, rename old wait_for_operation
hartikainen May 15, 2018
352a4ff
Temporarily rename autoscaler tags due to gcp incompatibility
hartikainen May 15, 2018
16a8605
Implement initial gcp.node_provider.nodes
hartikainen May 15, 2018
944468f
Implement initial gcp.node_provider.create_node
hartikainen May 15, 2018
9a5c8a3
Implement initial gcp.node_provider._node and node status functions
hartikainen May 15, 2018
6c78b40
Implement initial gcp.node_provider.terminate_node
hartikainen May 15, 2018
bd09fbc
Implement node tagging and ip getter methods for nodes
hartikainen May 15, 2018
52def7b
Temporarily rename tags due to gcp incompatibility
hartikainen May 15, 2018
ad301bb
Tiny tweaks for autoscaler.updater
hartikainen May 15, 2018
9a1d052
Remove unused config from gcp node_provider
hartikainen May 15, 2018
423f791
Add new example-full example to gcp, update load_gcp_example_config
hartikainen May 15, 2018
5d90308
Implement label filtering for gcp.node_provider.nodes
hartikainen May 15, 2018
f5634d3
Revert unnecessary change in ssh command
hartikainen May 15, 2018
7e1ea09
Revert "Temporarily rename tags due to gcp incompatibility"
hartikainen May 16, 2018
d71d4e4
Revert "Temporarily rename autoscaler tags due to gcp incompatibility"
hartikainen May 16, 2018
9cf4840
Refactor autoscaler tagging to support multiple tag specs
hartikainen May 16, 2018
7037922
Remove missing cryptography imports
hartikainen May 16, 2018
d9bea64
Update quote function import
hartikainen May 18, 2018
bd07ff1
Fix threading issue in gcp.config with the compute discovery object
hartikainen May 18, 2018
e6559fd
Add gcs support for log_sync
hartikainen May 19, 2018
de2980b
Fix the labels/tags naming discrepancy
hartikainen May 19, 2018
a221c0d
Add expanduser to file_mounts hashing
hartikainen May 19, 2018
9c3899c
Fix gcp.node_provider.internal_ip
hartikainen May 19, 2018
c3341b0
Add uuid to node name
hartikainen May 20, 2018
5fa034c
Remove 'set -i' from updater ssh command
hartikainen May 20, 2018
c152faa
Update ssh key creation in autoscaler.gcp.config
hartikainen May 20, 2018
5948f0f
Fix wait_for_compute_zone_operation's threading issue
hartikainen May 20, 2018
f40d7ef
Address pr feedback from @ericl
hartikainen May 20, 2018
44a3458
Expand local file mount paths in NodeUpdater
hartikainen May 20, 2018
646dc81
Add ssh_user name to key names
hartikainen May 21, 2018
23a0066
Update updater ssh to attempt 'set -i' and fall back if that fails
hartikainen May 21, 2018
e50dd00
Update gcp/example-full.yaml
hartikainen May 21, 2018
85c1e4b
Fix wait crm operation in gcp.config
hartikainen May 21, 2018
5517743
Update gcp/example-minimal.yaml to match aws/example-minimal.yaml
hartikainen May 21, 2018
35d946e
Fix gcp/example-full.yaml comment indentation
hartikainen May 21, 2018
dd8fc5f
Add gcp/example-full.yaml to setup files
hartikainen May 22, 2018
693de75
Update example-full.yaml command
hartikainen May 24, 2018
183453e
Revert "Refactor autoscaler tagging to support multiple tag specs"
hartikainen May 24, 2018
46250d3
Update tag spec to only use characters [0-9a-z_-]
hartikainen May 24, 2018
7a84bbd
Change the tag values to conform gcp spec
hartikainen May 24, 2018
3e7f91e
Add project_id in the ssh key name
hartikainen May 24, 2018
b81ab5a
Replace '_' with '-' in autoscaler tag names
hartikainen May 26, 2018
9f87340
Revert "Update updater ssh to attempt 'set -i' and fall back if that …
hartikainen May 26, 2018
a791241
Revert "Remove 'set -i' from updater ssh command"
hartikainen May 26, 2018
1489971
Add fallback to `set -i` in force_interactive command
hartikainen May 26, 2018
ae5a586
Update autoscaler tests to match current implementation
hartikainen May 26, 2018
8e37ab4
Update GCPNodeProvider.create_node to include hash in instance name
hartikainen May 26, 2018
79c0c19
Add support for creating multiple instance on one create_node call
hartikainen May 26, 2018
34a403a
Clean TODOs
hartikainen May 27, 2018
ccbe2aa
Update styles
hartikainen May 27, 2018
5d78ef5
Remove unnecessary comment. Fix indentation.
hartikainen May 27, 2018
d8f66b3
Merge branch 'master' into feature/gcp-node-provider
hartikainen May 28, 2018
6856117
Yapfify files that fail flake8 test
hartikainen May 29, 2018
41e90ed
Yapfify more files
hartikainen May 30, 2018
a59f81c
Update project_id handling in gcp node provider
hartikainen May 30, 2018
feeb3a8
Merge branch 'master' into hartikainen-feature/gcp-node-provider
richardliaw May 30, 2018
b6744e4
temporary yapf mod
richardliaw May 30, 2018
dd6b5ab
Revert "temporary yapf mod"
hartikainen May 31, 2018
940c1b1
Fix autoscaler/updater.py lint error, remove unused variable
hartikainen May 31, 2018
Implement initial gcp.node_provider.nodes
* Still missing filter support
hartikainen committed May 19, 2018
commit 16a860516a61f64c4fad8627a19b2c1bc734e161
24 changes: 17 additions & 7 deletions python/ray/autoscaler/gcp/node_provider.py
@@ -2,7 +2,8 @@
 from __future__ import division
 from __future__ import print_function

-# TODO.gcp: import google cloud sdks
+from googleapiclient import discovery
+compute = discovery.build('compute', 'v1')

 from ray.autoscaler.node_provider import NodeProvider
 from ray.autoscaler.tags import TAG_RAY_CLUSTER_NAME
@@ -22,12 +23,21 @@ def __init__(self, provider_config, cluster_name):
         self.internal_ip_cache = {}
         self.external_ip_cache = {}

-    def nodes(self, tag_filters):
-        """Return list of nodes matching general ray filter and tag_filters"""
-        raise NotImplementedError('GCPNodeProvider.nodes')
-        instances = None
-        self.cached_nodes = {i.id: i for i in instances}
-        return [i.id for i in instances]
+    def nodes(self, label_filters):
+        # TODO: Add filters
+        filter_expr = ''
+
+        response = compute.instances().list(
+            project=self.provider_config['project_id'],
+            zone=self.provider_config['availability_zone'],
+            filter=filter_expr,
+        ).execute()
+
+        instances = response.get('items', [])
+        # Note: All the operations use 'name' as the unique instance identifier
Contributor


I assume this is guaranteed to be unique per node?

Contributor Author


Yep. In fact, I had to change the create_node arguments to pass a unique name to each node, because GCP requires instance names to be unique.

+        self.cached_nodes = {i['name']: i for i in instances}
+
+        return [i['name'] for i in instances]

     def is_running(self, node_id):
         node = self._node(node_id)
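
The `# TODO: Add filters` placeholder in the new `nodes` method leaves `filter_expr` empty, and a single `list(...).execute()` call returns at most one page of results. The sketch below shows one way both gaps might be filled. It is not the code from this PR: `build_label_filter` and `list_instance_names` are hypothetical helpers, the `(labels.<key> = <value>)` clause format is an assumption about the Compute Engine filter syntax that should be checked against the API documentation, and `compute` is assumed to be the client returned by `discovery.build('compute', 'v1')`.

```python
def build_label_filter(label_filters):
    """Build a Compute Engine list-filter expression from a label dict.

    Assumed syntax: one "(labels.<key> = <value>)" clause per label,
    joined with AND. An empty dict yields an empty filter (match all).
    """
    if not label_filters:
        return ''
    # Sort the labels so the expression is deterministic.
    clauses = [
        '(labels.{} = {})'.format(key, value)
        for key, value in sorted(label_filters.items())
    ]
    return '(' + ' AND '.join(clauses) + ')'


def list_instance_names(compute, project_id, zone, label_filters=None):
    """Return the names of all instances in a zone that match the labels.

    `compute` is assumed to be the object returned by
    googleapiclient.discovery.build('compute', 'v1').
    """
    instances_api = compute.instances()
    request = instances_api.list(
        project=project_id,
        zone=zone,
        filter=build_label_filter(label_filters or {}),
    )
    names = []
    # list_next() returns None once there are no further result pages.
    while request is not None:
        response = request.execute()
        names.extend(item['name'] for item in response.get('items', []))
        request = instances_api.list_next(
            previous_request=request, previous_response=response)
    return names
```

Passing the client in as an argument, rather than reading a module-level `compute` object, keeps the helper easy to exercise with a stub in tests.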