8 changes: 6 additions & 2 deletions .github/workflows/dev-build.yml
@@ -27,7 +27,9 @@ env:
  REPOSITORY: "timdex-embeddings-dev"

jobs:
  choose-runners:
    # This line adds a check for the user who is requesting the PR. As long as it's not dependabot, we go ahead and run it.
    if: ${{ github.triggering_actor != 'dependabot[bot]' }}
Review comment (Collaborator): Thanks for the catch and add here!

    name: Determine Runner architecture from .aws-architecture file
    runs-on: ubuntu-latest
    outputs:
@@ -43,7 +45,9 @@ jobs:
          echo "gpu_arch=$GPU_ARCH" >> $GITHUB_OUTPUT
          echo "cpu_arch=$CPU_ARCH" >> $GITHUB_OUTPUT

  build:
    # This line adds a check for the user who is requesting the PR. As long as it's not dependabot, we go ahead and run it.
    if: ${{ github.triggering_actor != 'dependabot[bot]' }}
    name: Build ${{ matrix.variant }} (${{ matrix.arch }})
    needs: choose-runners
    strategy:
149 changes: 149 additions & 0 deletions .github/workflows/prod-deploy.yml
@@ -0,0 +1,149 @@
### A special prod-promote workflow for this special repository that needs
### parallel builds for both GPU-enabled and non-GPU-enabled containers.
name: Prod Container Promote
on:
  workflow_dispatch:
  release:
    types: [published]

permissions:
  id-token: write
  contents: read

env:
  AWS_REGION: "us-east-1"
  GHA_ROLE_STAGE: "timdex-embeddings-gha-stage"
  GHA_ROLE_PROD: "timdex-embeddings-gha-prod"
  ECR_STAGE: "timdex-embeddings-stage"
  ECR_PROD: "timdex-embeddings-prod"

# Set defaults
defaults:
  run:
    shell: bash

jobs:
  prep:
    if: github.ref == 'refs/heads/main'
    name: Set architectures from .aws-architecture file
    runs-on: ubuntu-latest
    outputs:
      gpu_arch: ${{ steps.out.outputs.gpu_arch }}
      cpu_arch: ${{ steps.out.outputs.cpu_arch }}
    steps:
      - uses: actions/checkout@v5

      - id: out
        run: |
          GPU_ARCH=$(jq -r '.gpu // "linux/amd64"' .aws-architecture)
          CPU_ARCH=$(jq -r '.cpu // "linux/amd64"' .aws-architecture)
          echo "gpu_arch=$GPU_ARCH" >> $GITHUB_OUTPUT
          echo "cpu_arch=$CPU_ARCH" >> $GITHUB_OUTPUT

  promote:
    name: Promote ${{ matrix.variant }} (for ${{ matrix.arch }})
    needs: prep
    strategy:
      fail-fast: false
      matrix:
        include:
          - variant: gpu
            arch: ${{ needs.prep.outputs.gpu_arch }}
          - variant: cpu
            arch: ${{ needs.prep.outputs.cpu_arch }}
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v5

      - name: Configure Stage AWS credentials
        id: login-aws-stage
        uses: aws-actions/configure-aws-credentials@v5
        with:
          aws-region: ${{ env.AWS_REGION }}
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCT_STAGE }}:role/${{ env.GHA_ROLE_STAGE }}

      - name: Check Stage ${{ matrix.variant }} SHA
        id: check_sha
        run: |
          if aws ecr describe-images \
            --repository-name ${{ env.ECR_STAGE }} \
            --image-ids imageTag=latest-${{ matrix.arch }}-${{ matrix.variant }} \
            --query 'imageDetails[].imageTags[]' \
            --output text | grep -q "${GITHUB_SHA::8}"
          then
            echo "sha_match=true" >> $GITHUB_OUTPUT
            echo "tag_sha=${GITHUB_SHA::8}-${{ matrix.arch }}-${{ matrix.variant }}" >> $GITHUB_OUTPUT
            echo "tag_release=${{ github.event.release.tag_name }}-${{ matrix.arch }}-${{ matrix.variant }}" >> $GITHUB_OUTPUT
            echo "### SHA Match Success for ${{ matrix.variant }}" >> $GITHUB_STEP_SUMMARY
            echo "Continue to promoting the ${{ matrix.variant }} Stage containers to Production." >> $GITHUB_STEP_SUMMARY
          else
            echo "sha_match=false" >> $GITHUB_OUTPUT
            echo "### SHA Match Failure for ${{ matrix.variant }}" >> $GITHUB_STEP_SUMMARY
            echo "FAILURE: Stage-Workloads ${{ matrix.variant }} SHA did not match the main branch." >> $GITHUB_STEP_SUMMARY
            exit 1
          fi

      - name: Login to Stage Amazon ECR
        id: login-ecr-stage
        uses: aws-actions/amazon-ecr-login@v2

      - name: Download latest-${{ matrix.arch }}-${{ matrix.variant }} from Stage
        id: stage-download
        if: ${{ steps.check_sha.outputs.sha_match == 'true' }}
        env:
          REGISTRY: ${{ steps.login-ecr-stage.outputs.registry }}
          REPOSITORY: ${{ env.ECR_STAGE }}
          TAG_SHA: ${{ steps.check_sha.outputs.tag_sha }}
        run: |
          docker pull ${{ env.REGISTRY }}/${{ env.REPOSITORY }}:${{ env.TAG_SHA }}

      - name: Configure Prod AWS credentials
        if: ${{ steps.check_sha.outputs.sha_match == 'true' }}
        id: login-aws-prod
        uses: aws-actions/configure-aws-credentials@v5
        with:
          aws-region: ${{ env.AWS_REGION }}
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCT_PROD }}:role/${{ env.GHA_ROLE_PROD }}

      - name: Login to Prod Amazon ECR
        if: ${{ steps.check_sha.outputs.sha_match == 'true' }}
        id: login-ecr-prod
        uses: aws-actions/amazon-ecr-login@v2

      - name: Re-tag and push to Prod
        id: prod-push
        if: ${{ steps.check_sha.outputs.sha_match == 'true' }}
        env:
          REGISTRY_STAGE: ${{ steps.login-ecr-stage.outputs.registry }}
          REGISTRY_PROD: ${{ steps.login-ecr-prod.outputs.registry }}
          REPOSITORY_STAGE: ${{ env.ECR_STAGE }}
          REPOSITORY_PROD: ${{ env.ECR_PROD }}
          TAG_SHA: ${{ steps.check_sha.outputs.tag_sha }}
          TAG_RELEASE: ${{ steps.check_sha.outputs.tag_release }}
          TAG_LATEST: latest-${{ matrix.arch }}-${{ matrix.variant }}
        run: |
          echo "### :whale: Promote ${{ matrix.variant }} Container to Production" >> $GITHUB_STEP_SUMMARY
          docker tag ${{ env.REGISTRY_STAGE }}/${{ env.REPOSITORY_STAGE }}:${{ env.TAG_SHA }} \
            ${{ env.REGISTRY_PROD }}/${{ env.REPOSITORY_PROD }}:${{ env.TAG_LATEST }}
          docker tag ${{ env.REGISTRY_STAGE }}/${{ env.REPOSITORY_STAGE }}:${{ env.TAG_SHA }} \
            ${{ env.REGISTRY_PROD }}/${{ env.REPOSITORY_PROD }}:${{ env.TAG_SHA }}
          docker push ${{ env.REGISTRY_PROD }}/${{ env.REPOSITORY_PROD }}:${{ env.TAG_LATEST }}
          docker push ${{ env.REGISTRY_PROD }}/${{ env.REPOSITORY_PROD }}:${{ env.TAG_SHA }}

          echo "✅ Promoted ${{ matrix.variant }} container to Prod ECR with the following tags:" >> $GITHUB_STEP_SUMMARY
          echo "- \`${{ env.TAG_LATEST }}\`" >> $GITHUB_STEP_SUMMARY
          echo "- \`${{ env.TAG_SHA }}\` (GitHub SHA)" >> $GITHUB_STEP_SUMMARY

          if [ "$GITHUB_EVENT_NAME" != "workflow_dispatch" ]; then
            docker tag ${{ env.REGISTRY_STAGE }}/${{ env.REPOSITORY_STAGE }}:${{ env.TAG_SHA }} \
              ${{ env.REGISTRY_PROD }}/${{ env.REPOSITORY_PROD }}:${{ env.TAG_RELEASE }}
            docker push ${{ env.REGISTRY_PROD }}/${{ env.REPOSITORY_PROD }}:${{ env.TAG_RELEASE }}
            echo "- \`${{ env.TAG_RELEASE }}\` (GitHub Release Version)" >> $GITHUB_STEP_SUMMARY
          else
            docker tag ${{ env.REGISTRY_STAGE }}/${{ env.REPOSITORY_STAGE }}:${{ env.TAG_SHA }} \
              ${{ env.REGISTRY_PROD }}/${{ env.REPOSITORY_PROD }}:workflow_dispatch_${{ matrix.variant }}
            docker push ${{ env.REGISTRY_PROD }}/${{ env.REPOSITORY_PROD }}:workflow_dispatch_${{ matrix.variant }}
            echo "- \`workflow_dispatch_${{ matrix.variant }}\`" >> $GITHUB_STEP_SUMMARY
          fi
11 changes: 5 additions & 6 deletions docs/continuous-delivery-parallel-builds.md
@@ -36,6 +36,8 @@ There are three GitHub Actions workflows for automated build+deploy to AWS. Thes

### Dev Workflow

**Note**: This workflow runs when a PR is opened or when an open PR is updated, _unless the run was triggered by `dependabot`_.

1. There is an initial job that runs and parses the `.aws-architecture` file and generates outputs that will drive the next phase.
1. The second phase of the workflow is a matrix strategy that kicks off two runners, one for each build. Each runner is picked to match the CPU architecture of the requested build. That is, if the `gpu` key in the `.aws-architecture` file specifies `linux/amd64`, then the runner for the `gpu` container will be an `amd64`-based runner. If the `cpu` key in the `.aws-architecture` file specifies `linux/arm64`, then the runner for the `cpu` container will be an `arm64`-based runner. This way, Docker runs on the same architecture as the container it is trying to build.
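
The runner selection described in the second step could be sketched as a small shell helper. This is only an illustration: the function name `runner_for_platform` is hypothetical, and the runner labels (`ubuntu-latest`, `ubuntu-24.04-arm`) are assumptions, since the labels the Dev workflow actually uses are not shown here.

```shell
#!/bin/sh
# Hypothetical helper: map a Docker platform string (as stored in the
# .aws-architecture file) to a runner label with the same CPU architecture.
# The labels below are illustrative assumptions, not taken from the workflow.
runner_for_platform() {
  case "$1" in
    linux/amd64) echo "ubuntu-latest" ;;
    linux/arm64) echo "ubuntu-24.04-arm" ;;
    *) echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}

# Example: pick the runner for a cpu container that targets arm64.
runner_for_platform "linux/arm64"
```

Failing on an unknown platform, rather than falling back to a default runner, keeps a typo in `.aws-architecture` from silently producing a cross-architecture build.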

@@ -45,11 +47,8 @@ The Stage workflow is the same as the Dev workflow, only the trigger is differen

### Prod Workflow

Similar to our shared workflows, the Prod workflow runs on a tagged release on the `main` branch. Unlike our shared workflows, it first runs a job to capture the CPU architecture information from the `.aws-architecture` file so that the subsequent job can find the correct tags on the containers in the Stage ECR Repository to push over to Prod.

**Job 1**: Process the `.aws-architecture` file to capture the CPU architectures for the GPU containers and the CPU containers.

**Job 2**: Use the output of **Job 1** to generate a matrix that runs parallel jobs to pull containers from Stage, re-tag them for Prod, and then push them to Prod. Just like our shared workflows, the very first step is to check whether the commit SHA on `main` (where the tagged release is being applied) matches the commit SHA that was tagged onto the container in the Stage-Workloads ECR Repository.
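
The SHA check that gates promotion can be sketched in plain shell. This is a simplified illustration, not the workflow's exact logic: the function name `stage_tags_match_sha` and the sample tags are hypothetical, and the sketch requires a tag to start with the 8-character short SHA, whereas the workflow's `grep -q` accepts the short SHA anywhere in the tag list.

```shell
#!/bin/sh
# Sketch of the SHA gate: succeed only if one of the image's tags starts
# with the first 8 characters of the commit SHA.
#   $1: whitespace-separated tag list, e.g. the text output of
#       `aws ecr describe-images --query 'imageDetails[].imageTags[]'`
#   $2: full commit SHA
stage_tags_match_sha() {
  short_sha=$(printf '%.8s' "$2")
  # An empty SHA must never count as a match.
  [ -n "$short_sha" ] || return 1
  for tag in $1; do
    case "$tag" in
      "$short_sha"*) return 0 ;;
    esac
  done
  return 1
}

# Hypothetical tags and SHA, for illustration only.
if stage_tags_match_sha "0a1b2c3d-gpu latest-gpu" "0a1b2c3d4e5f60718293a4b5c6d7e8f901234567"; then
  echo "sha_match=true"
else
  echo "sha_match=false"
fi
```

If the check fails, the image tagged `latest` in Stage was built from a different commit than the one being released, so nothing should be promoted.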