[Core] Faster optimizer table by disabling reservation check (#3280)
* Support h100

* Fix H100 from sku

* Fix H100

* Disable reservation check for GCP by default

* Update docs/source/reference/config.rst

Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>

* Update docs/source/reference/config.rst

Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>

* rename config

* fix

---------

Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
Michaelvll and concretevitamin authored Mar 7, 2024
1 parent de7a868 commit a7fdfd2
Showing 4 changed files with 33 additions and 3 deletions.
25 changes: 22 additions & 3 deletions docs/source/reference/config.rst
@@ -190,10 +190,29 @@ Available fields and semantics:
# Reserved capacity (optional).
#
# Whether to prioritize reserved instance types/locations (considered as 0
# cost) in the optimizer.
#
# If you have "automatically consumed" reservations in your GCP project:
# Setting this to true guarantees the optimizer will pick any matching
# reservation and GCP will auto consume your reservation, and setting to
# false means optimizer uses regular, non-zero pricing in optimization (if
# by chance any matching reservation is selected, GCP still auto consumes
# the reservation).
#
# If you have "specifically targeted" reservations (set by the
# `specific_reservations` field below): This field will automatically be set
# to true.
#
# Default: false.
prioritize_reservations: false
#
# The "specifically targeted" reservations to be considered when provisioning
# clusters on GCP. SkyPilot will automatically prioritize this reserved
# capacity (considered as zero cost) if the requested resources matches the
# reservation.
#
# The specific reservation to be considered when provisioning clusters on GCP.
# SkyPilot will automatically prioritize this reserved capacity (considered as
# zero cost) if the requested resources matches the reservation.
# Ref: https://cloud.google.com/compute/docs/instances/reservations-overview#consumption-type
specific_reservations:
- projects/my-project/reservations/my-reservation1
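The semantics documented above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not SkyPilot's optimizer code: `effective_prioritize` and `reservation_aware_cost` are hypothetical names, and the boolean `matches_reservation` stands in for whatever matching logic the real optimizer applies.

```python
from typing import Sequence


def effective_prioritize(prioritize_reservations: bool,
                         specific_reservations: Sequence[str]) -> bool:
    """Hypothetical helper: listing specific reservations implies prioritization."""
    return prioritize_reservations or bool(specific_reservations)


def reservation_aware_cost(list_price: float,
                           matches_reservation: bool,
                           prioritize_reservations: bool = False,
                           specific_reservations: Sequence[str] = ()) -> float:
    """Hypothetical helper: hourly cost the optimizer would compare.

    Reserved capacity is treated as zero cost only when reservations are
    prioritized; otherwise the regular, non-zero price is used.
    """
    if matches_reservation and effective_prioritize(prioritize_reservations,
                                                    specific_reservations):
        return 0.0
    return list_price


# Default config: the matching reservation is ignored for pricing purposes.
default_cost = reservation_aware_cost(3.5, matches_reservation=True)
# Prioritized: the same candidate is considered free.
prioritized_cost = reservation_aware_cost(
    3.5, matches_reservation=True, prioritize_reservations=True)
```

Under the default (`false`), GCP may still auto-consume a matching reservation at launch time; the sketch models only how the price is compared during optimization.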
6 changes: 6 additions & 0 deletions sky/clouds/utils/gcp_utils.py
@@ -139,6 +139,12 @@ def _list_reservations_for_instance_type(
    For example, if we have a specific reservation with n1-highmem-8
    in us-central1-c. `sky launch --gpus V100` will fail.
    """
    prioritize_reservations = skypilot_config.get_nested(
        ('gcp', 'prioritize_reservations'), False)
    specific_reservations = skypilot_config.get_nested(
        ('gcp', 'specific_reservations'), [])
    if not prioritize_reservations and not specific_reservations:
        return []
    logger.debug(f'Querying GCP reservations for instance {instance_type!r}')
    list_reservations_cmd = (
        'gcloud compute reservations list '
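The speedup in this hunk comes from returning early, before the `gcloud` query is built. A minimal, self-contained sketch of that guard follows; it assumes the config is a plain nested dict, and `get_nested`, `list_reservations_sketch`, and `query_fn` are hypothetical stand-ins for `skypilot_config.get_nested` and the real reservation query.

```python
from typing import Any, Callable, Dict, List, Tuple


def get_nested(config: Dict[str, Any], keys: Tuple[str, ...],
               default: Any) -> Any:
    """Hypothetical stand-in: walk nested dict keys, or return the default."""
    current: Any = config
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def list_reservations_sketch(config: Dict[str, Any], instance_type: str,
                             query_fn: Callable[[str], List[str]]) -> List[str]:
    """Skip the expensive query unless the config asks for reservations."""
    prioritize = get_nested(config, ('gcp', 'prioritize_reservations'), False)
    specific = get_nested(config, ('gcp', 'specific_reservations'), [])
    if not prioritize and not specific:
        return []  # Fast path: nothing is queried.
    return query_fn(instance_type)


# Record each query so the fast path can be observed.
queried: List[str] = []


def fake_query(instance_type: str) -> List[str]:
    queried.append(instance_type)
    return ['my-reservation1']


no_config = list_reservations_sketch({}, 'n1-highmem-8', fake_query)
with_flag = list_reservations_sketch(
    {'gcp': {'prioritize_reservations': True}}, 'n1-highmem-8', fake_query)
```

With an empty config the fake query is never called; it runs only once either field is set.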
3 changes: 3 additions & 0 deletions sky/utils/schemas.py
@@ -579,6 +579,9 @@ def get_config_schema():
    'required': [],
    'additionalProperties': False,
    'properties': {
        'prioritize_reservations': {
            'type': 'boolean',
        },
        'specific_reservations': {
            'type': 'array',
            'items': {
2 changes: 2 additions & 0 deletions sky/utils/subprocess_utils.py
@@ -167,6 +167,8 @@ def run_with_retries(
    if retry_cnt < max_retry:
        if (retry_returncode is not None and
                returncode in retry_returncode):
            logger.debug(
                f'Retrying command due to returncode {returncode}: {cmd}')
            retry_cnt += 1
            time.sleep(random.uniform(0, 1) * 2)
            continue
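For context, the retry branch above can be placed in a complete loop roughly as follows. This is a simplified sketch, not the actual `run_with_retries`: the function name, its return value, and the injectable `sleep` parameter are assumptions made for illustration, and only the return-code condition shown in the hunk is modeled.

```python
import logging
import random
import subprocess
import time
from typing import Callable, List, Optional, Tuple

logger = logging.getLogger(__name__)


def run_with_retries_sketch(
    cmd: str,
    max_retry: int = 3,
    retry_returncode: Optional[List[int]] = None,
    sleep: Callable[[float], None] = time.sleep,  # Injectable for testing.
) -> Tuple[int, str, str]:
    """Run a shell command, retrying on the listed return codes."""
    retry_cnt = 0
    while True:
        proc = subprocess.run(cmd, shell=True, capture_output=True,
                              text=True, check=False)
        returncode = proc.returncode
        if returncode != 0 and retry_cnt < max_retry:
            if (retry_returncode is not None and
                    returncode in retry_returncode):
                logger.debug(
                    f'Retrying command due to returncode {returncode}: {cmd}')
                retry_cnt += 1
                # Random backoff of up to 2 seconds, as in the hunk above.
                sleep(random.uniform(0, 1) * 2)
                continue
        return returncode, proc.stdout, proc.stderr


# Record backoff calls instead of sleeping, so retries can be counted.
backoffs: List[float] = []
final_code, _, _ = run_with_retries_sketch(
    'exit 3', max_retry=2, retry_returncode=[3], sleep=backoffs.append)
```

The debug line added by the commit makes each retry visible in the logs; here the recorded backoffs play the same role.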