Use this Terraform configuration as a starting point to provision AWS infrastructure for miniwdl-aws -- including a VPC, EFS file system, Batch queues, and IAM roles.
Before diving into this, please review simpler ways to use miniwdl-aws described there.
- AWS account with administrator/poweruser access
- git, terraform
- terminal session with AWS CLI configured (credentials and default region)
- miniwdl-aws (
pip3 install miniwdl-aws
)
git clone https://github.com/miniwdl-ext/miniwdl-aws-terraform.git
cd miniwdl-aws-terraform
terraform init
terraform apply \
-var='environment_tag=miniwdl' \
-var='owner_tag=me@example.com' \
-var='s3upload_buckets=["MY-BUCKET"]'
where
environment_tag
fills the Environment tag of each resource, and prefixes some resource names (for identification & deconfliction)owner_tag
fills the Owner tag of each resource (typically your username/email)s3upload_buckets
is a list of S3 bucket names where you may ask miniwdl-aws to upload workflow outputs (optional)
If you get an error about service roles already existing, add -var=create_spot_service_roles=false
(these roles need to be created only once per account).
The deployment outputs the name of the Batch job queue for submitting workflows, miniwdl-workflow
(${environment_tag}-workflow
), which you can supply to the miniwdl-aws self-test:
miniwdl-aws-submit --self-test --follow --workflow-queue miniwdl-workflow
Or, set the environment variable MINIWDL__AWS__WORKFLOW_QUEUE=miniwdl-workflow
instead of the --workflow-queue
command-line option.
See miniwdl-aws for how to use miniwdl-aws-submit
further.
The following Terraform variables are also available:
task_max_vpcus=256
maximum vCPUs for the Batch compute environment used for WDL task executionworkflow_max_vcpus=16
maximum vCPUs for the Batch compute environment used for miniwdl engine processes (limits maximum # of workflows running concurrently)enable_task_fallback=true
enable fallback to EC2 On Demand compute environment after a task experiencesruntime.preemptible
spot interruptions (the on-demand environment & queue are always created, but not used by default)
Review the network configuration and IAM policies in main.tf. To keep the configuration succinct, we wrote in simple networking with public subnets, and existing IAM policies that are more powerful than strictly needed. Customize as needed for your security requirements.
You'll need a way to browse & manage the remote EFS contents. The companion lambdash-efs is one option; the Terrafom deployment outputs the infrastructure details needed to deploy it (pick any subnet). Or, set up an instance/container mounting the EFS, to access via SSH or web app (e.g. JupyterHub, Cloud Commander, VS Code server).
Using FSx for Lustre instead of EFS
See fsx/ for a variant of the infrastructure configuration deploying Lustre instead of EFS. The default EFS stack is recommended to begin with, but large-scale operations may need the higher throughput offered by Lustre.