GCP compute instances are not shut down after idle timeout #661

First off, thanks for adding GCP support to cml-runner.

While playing around with it, I noticed that my Compute Engine instances weren't being shut down/terminated (i.e., VM powered off) or deleted. I experimented with `--idle-timeout` and `--single`, yet neither made a difference: the instance stays alive. This is unexpected given [the following sentence from the docs](https://cml.dev/doc/self-hosted-runners#:~:text=After%20the%20job%20runs%2C%20the%20instance%20automatically%20shuts%20down.) on self-hosted runners:

> After the job runs, the instance automatically shuts down.

And indeed, that's what I've observed with AWS instances. Those seem to terminate correctly.

Here's my GitHub workflow for extra context:

```
name: 'Train-in-the-cloud-GCP'
on: 
  workflow_dispatch:

jobs:
  deploy-runner:
    runs-on: [ubuntu-latest]
    steps:
      - uses: iterative/setup-cml@v1
      - uses: actions/checkout@v2
      - name: 'Deploy runner on GCP'
        shell: bash
        env:
          REPO_TOKEN: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
          # Notice use of `GOOGLE_APPLICATION_CREDENTIALS_DATA` instead of
          # `GOOGLE_APPLICATION_CREDENTIALS`. Contrary to what docs suggest, the
          # latter causes problems for terraform.
          GOOGLE_APPLICATION_CREDENTIALS_DATA: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS }}
        run: |
          cml-runner \
          --cloud gcp \
          --cloud-region europe-west1-b \
          --cloud-type=n1-standard-1 \
          --labels=cml-runner
          
  model-training:
    needs: deploy-runner
    runs-on: [self-hosted, cml-runner]
    container: docker://dvcorg/cml-py3:latest
    steps:
      - uses: actions/checkout@v2
      - name: 'Train my dummy model'
        env:
          REPO_TOKEN: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
        run: |
          echo "Training a super awesome model"
          sleep 5
          echo "Training complete"
```
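
For illustration, a variant of the `cml-runner` call with the two flags mentioned above might look roughly like the following. This is only a sketch: the 60-second timeout is an assumed example value, and the remaining options are copied from the workflow above.

```shell
# Hypothetical variant of the deploy step: same options as in the workflow,
# plus the two flags that were tried. The timeout value (60) is an assumption.
cml-runner \
  --cloud gcp \
  --cloud-region europe-west1-b \
  --cloud-type=n1-standard-1 \
  --labels=cml-runner \
  --idle-timeout=60 \
  --single
```

With either flag, the expectation from the docs is that the instance goes away once the job is done or the runner has been idle for the given time.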

Hope there's a way to ensure the same auto-shutdown behavior on GCP. As it is, the risk of getting smacked with an expensive bill for an idle GPU is just too real =p 

