An example of setting up a remote cloud environment for training a machine learning model with Terraform Provider Iterative (TPI).
See the blog post at https://dvc.org/blog/local-experiments-to-cloud-with-tpi for a full explanation.
The main.tf file contains two different tasks:
- basic scenario for running a script remotely, and
- training a model on a GPU device
To run this tutorial, make sure you have access to a supported provider (AWS, Azure, GCP, or a Kubernetes cluster) and that its authentication credentials are stored in environment variables.
- Install Terraform
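Before provisioning anything, it can help to confirm that the credentials are actually visible to the shell. The following is a minimal sketch, not part of this repository: `missing_vars` is a hypothetical helper, and the AWS variable names are an assumption based on the standard AWS SDK convention. Substitute the variables your provider expects.

```shell
# Hypothetical helper: report which of the named environment variables
# are unset or empty. Returns 0 if all are set, 1 otherwise.
missing_vars() {
  status=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      status=1
    fi
  done
  return "$status"
}

# Example for AWS (variable names are an assumption, see above).
if missing_vars AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; then
  echo "AWS credentials found"
fi
```

The helper only checks that the variables are non-empty; it does not verify that the credentials are valid.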
- terraform init: set up dependencies
- terraform apply: provision cloud infrastructure and upload the task
- terraform refresh && terraform show: check status
- terraform destroy: download results and terminate cloud infrastructure
