- This week, we learned about a variety of technologies to deepen our knowledge of ECS, ECR, and deployments. Some of the things we learned include:
- Deploy an ECS Cluster using ECS Service Connect
- Deploy serverless containers using Fargate for the Backend and Frontend Application
- Route traffic to the frontend and backend on different subdomains using Application Load Balancer
- Securing our Flask container
- Creating several bash utility scripts to easily work with serverless containers.
To test whether we grasped the concepts covered in the meeting and the supporting videos, we were given homework tasks. They are:
- Provisioning an ECS cluster, creating an ECR repo, pushing both frontend and backend images, and deploying both apps to Fargate.
- Provisioning and configuring Application Load Balancer along with target groups.
- Managed my domain using Route53, created an SSL certificate via ACM, set up a record set for the naked domain to point to frontend-react-js, set up a record set for the api subdomain to point to backend-flask, and configured CORS to only permit traffic from our domain.
- Secured Flask by not running in debug mode in production, and created a Dockerfile specifically for the production use case
- Implemented Refresh Token for Amazon Cognito
- Refactored bin directory to be top level
- Configured task definitions to contain X-Ray and turned on Container Insights
- Used Ruby to generate env dot files for Docker using ERB templates
- Changed Docker Compose to explicitly use a user-defined network
I will describe my work and the process in the order provided above.
## Provisioning an ECS cluster, creating an ECR repo, pushing both frontend and backend images, and deploying both apps to Fargate

### Provisioning an ECS Cluster
- To provision an ECS cluster, I followed the 3-hour video posted on the ExamPro channel. The process was straightforward, and I was able to provision the cluster with no issues using the CLI. The command I used was:

```sh
aws ecs create-cluster \
  --cluster-name cruddur \
  --service-connect-defaults namespace=cruddur
```

This command created a cluster named cruddur with no services yet. The image below shows a successful run of the command:

Fig 1.0: Creating a cluster
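To double-check the cluster from the CLI as well, a quick sketch using the standard `describe-clusters` call:

```sh
# Verify the cluster exists and is ACTIVE
aws ecs describe-clusters \
  --clusters cruddur \
  --query "clusters[0].status"
```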
### Creating an ECR repo
- To create an ECR repo, I used the CLI command from the week-6 video. The first thing I did was log in to ECR using the AWS CLI:

```sh
aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
```

A successful login will look like this:
Fig 1.1: Logging in

After logging in, we created repositories to store our images, one for each of three images: `cruddur-python`, `backend-flask`, and `frontend-react-js`. The command I used for `cruddur-python`:

```sh
aws ecr create-repository \
  --repository-name cruddur-python \
  --image-tag-mutability MUTABLE
```

for `backend-flask`:

```sh
aws ecr create-repository \
  --repository-name backend-flask \
  --image-tag-mutability MUTABLE
```

and for `frontend-react-js`:

```sh
aws ecr create-repository \
  --repository-name frontend-react-js \
  --image-tag-mutability MUTABLE
```
A successful run of these commands will look like this in the AWS ECR console:

Fig 1.2: AWS ECR Console

After setting up the repositories, I had to set the ECR URLs so I could push each image to its respective repository. To set the correct URL for each repository, I used the following commands.

for `cruddur-python`:

```sh
export ECR_PYTHON_URL="$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/cruddur-python"
```

for `backend-flask`:

```sh
export ECR_BACKEND_FLASK_URL="$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/backend-flask"
```

for `frontend-react-js`:

```sh
export ECR_FRONTEND_REACT_URL="$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/frontend-react-js"
```
### Pushing both frontend and backend images
- After setting the URLs as described above, I pulled, tagged, and pushed the images to ECR. For Python (and only Python), I first had to pull the image from Docker Hub:

```sh
docker pull python:3.10-slim-buster
```
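The post doesn't show the push for the base Python image; it was presumably tagged and pushed like the others (a sketch, reusing the `ECR_PYTHON_URL` variable from above):

```sh
# Tag the pulled base image with the ECR repository URL, then push it
docker tag python:3.10-slim-buster $ECR_PYTHON_URL:3.10-slim-buster
docker push $ECR_PYTHON_URL:3.10-slim-buster
```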
For the rest of the images, I had to build them using the `Dockerfile` located within each app's folder. To build the images, I used the following commands.

for `backend-flask`:

```sh
docker build -t backend-flask .
```

for `frontend-react-js`:

```sh
docker build \
  --build-arg REACT_APP_BACKEND_URL="https://4567-$GITPOD_WORKSPACE_ID.$GITPOD_WORKSPACE_CLUSTER_HOST" \
  --build-arg REACT_APP_AWS_PROJECT_REGION="$AWS_DEFAULT_REGION" \
  --build-arg REACT_APP_AWS_COGNITO_REGION="$AWS_DEFAULT_REGION" \
  --build-arg REACT_APP_AWS_USER_POOLS_ID="ca-central-1_CQ4wDfnwc" \
  --build-arg REACT_APP_CLIENT_ID="5b6ro31g97urk767adrbrdj1g5" \
  -t frontend-react-js \
  -f Dockerfile.prod \
  .
```
The Dockerfiles for the backend and frontend are located at backend-dockerfile and frontend-dockerfile respectively.
After building the images, I tagged and pushed them to ECR using the following commands.

for `backend-flask`:

```sh
docker tag backend-flask:latest $ECR_BACKEND_FLASK_URL:latest
docker push $ECR_BACKEND_FLASK_URL:latest
```

for `frontend-react-js`:

```sh
docker tag frontend-react-js:latest $ECR_FRONTEND_REACT_URL:latest
docker push $ECR_FRONTEND_REACT_URL:latest
```
After tagging and pushing the images to their respective ECR repositories, we can see the following images in the repositories:

Fig 1.3: Python Repository
Fig 1.4: Backend Repository
Fig 1.5: Frontend Repository
### Deploying both apps to Fargate
- Before deploying our apps to Fargate, we had to configure roles, policies, and task definitions, in that order. To create the service execution role, I used the command:

```sh
aws iam create-role \
  --role-name CruddurServiceExecutionRole \
  --assume-role-policy-document file://aws/policies/service-assume-role-execution-policy.json
```
Before executing the command above, I had to create a file called `service-assume-role-execution-policy.json` in the `aws/policies` folder. The content of the file is:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Action": ["sts:AssumeRole"],
    "Effect": "Allow",
    "Principal": {
      "Service": ["ecs-tasks.amazonaws.com"]
    }
  }]
}
```
The file can be found here
A successful execution of the command above will return the following:
Fig 1.6: Creating Role

After creating the role, I had to create a policy for it. To create the policy, I used the command:

```sh
aws iam put-role-policy \
  --policy-name CruddurServiceExecutionPolicy \
  --role-name CruddurServiceExecutionRole \
  --policy-document file://aws/policies/service-execution-policy.json
```
Before executing the command above, I had to create a file called `service-execution-policy.json` in the `aws/policies` folder. The content of the file is:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "ssm:GetParameters",
      "ssm:GetParameter"
    ],
    "Resource": "arn:aws:ssm:us-east-1:706157350338:parameter/cruddur/backend-flask/*"
  }]
}
```
The file can be found here
After creating the policy, I had to attach it to the role. To do that, I fetched the policy's ARN and then used the command:

```sh
aws iam attach-role-policy --policy-arn POLICY_ARN --role-name CruddurServiceExecutionRole
```
A successful execution of both commands above attaches the policy to the role. Taking a look at the console will show the following:

Fig 1.7: Role and Policy

After creating and attaching the policy, I had to create a task definition for each app. Before that, I needed to store my parameter details in AWS Systems Manager Parameter Store. To do that, we used the CLI commands provided in the week-6 documentation:
```sh
aws ssm put-parameter --type "SecureString" --name "/cruddur/backend-flask/AWS_ACCESS_KEY_ID" --value $AWS_ACCESS_KEY_ID
aws ssm put-parameter --type "SecureString" --name "/cruddur/backend-flask/AWS_SECRET_ACCESS_KEY" --value $AWS_SECRET_ACCESS_KEY
aws ssm put-parameter --type "SecureString" --name "/cruddur/backend-flask/CONNECTION_URL" --value $PROD_CONNECTION_URL
aws ssm put-parameter --type "SecureString" --name "/cruddur/backend-flask/ROLLBAR_ACCESS_TOKEN" --value $ROLLBAR_ACCESS_TOKEN
aws ssm put-parameter --type "SecureString" --name "/cruddur/backend-flask/OTEL_EXPORTER_OTLP_HEADERS" --value "x-honeycomb-team=$HONEYCOMB_API_KEY"
```
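The stored parameters can also be verified from the CLI before checking the console (a sketch using the parameter path above):

```sh
# List the stored parameter names under the backend-flask path
aws ssm get-parameters-by-path \
  --path "/cruddur/backend-flask/" \
  --query "Parameters[].Name"
```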
We can also check the console to confirm the parameters were stored correctly. If they were, we should see a list like the one in the following image:

Fig 1.8: SSM Parameters

After storing the parameters, we created a new role called `CruddurTaskRole`, which lets us use Session Manager to connect to the running containers. To create the role, we used the command:

```sh
aws iam create-role \
  --role-name CruddurTaskRole \
  --assume-role-policy-document "{
    \"Version\":\"2012-10-17\",
    \"Statement\":[{
      \"Action\":[\"sts:AssumeRole\"],
      \"Effect\":\"Allow\",
      \"Principal\":{
        \"Service\":[\"ecs-tasks.amazonaws.com\"]
      }
    }]
  }"
```
To create the policy, we used the command:

```sh
aws iam put-role-policy \
  --policy-name SSMAccessPolicy \
  --role-name CruddurTaskRole \
  --policy-document "{
    \"Version\":\"2012-10-17\",
    \"Statement\":[{
      \"Action\":[
        \"ssmmessages:CreateControlChannel\",
        \"ssmmessages:CreateDataChannel\",
        \"ssmmessages:OpenControlChannel\",
        \"ssmmessages:OpenDataChannel\"
      ],
      \"Effect\":\"Allow\",
      \"Resource\":\"*\"
    }]
  }"
```
Then we attached managed policies to the role to grant CloudWatch full access and X-Ray daemon write access. To do that, we used the commands:

```sh
aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/CloudWatchFullAccess --role-name CruddurTaskRole
aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/AWSXRayDaemonWriteAccess --role-name CruddurTaskRole
```
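To confirm the attachments from the CLI (a quick sketch):

```sh
# Show the managed policies attached to the task role
aws iam list-attached-role-policies --role-name CruddurTaskRole
```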
A successful execution of these commands attaches the policies and grants the additional permissions to the role. Taking a look at the console will show the following:

Fig 1.9: Role and Policy for Task Role

After creating the roles and policies, we created a task definition for each of our apps. To do that, we first created the task definition files. The files for both the frontend and backend can be found here. After creating the files, we used the following commands to register the task definitions:
For the `backend-flask` app:

```sh
aws ecs register-task-definition --cli-input-json file://aws/task-definitions/backend-flask.json
```

For the `frontend-react-js` app:

```sh
aws ecs register-task-definition --cli-input-json file://aws/task-definitions/frontend-react-js.json
```
We can then check the ECS Task Definitions console to confirm the task definitions were registered correctly. If they were, we should see a list as in the following images:

Fig 2.0: Task Definition - Backend
Fig 2.1: Task Definition - Frontend

After creating the task definitions, we created a service for both the frontend and backend via ClickOps. But before all of that, we first created a security group. We used the following command to get the default VPC:
```sh
export DEFAULT_VPC_ID=$(aws ec2 describe-vpcs \
  --filters "Name=isDefault,Values=true" \
  --query "Vpcs[0].VpcId" \
  --output text)
```
Then we created a security group for the frontend and backend services:

```sh
export CRUD_SERVICE_SG=$(aws ec2 create-security-group \
  --group-name "crud-srv-sg" \
  --description "Security group for Cruddur services on ECS" \
  --vpc-id $DEFAULT_VPC_ID \
  --query "GroupId" --output text)
```
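The new group also needs inbound rules before the services can receive traffic; a sketch of what that might look like (the ports are assumed from the container ports used elsewhere in this post):

```sh
# Allow inbound traffic to the backend (4567) and frontend (3000) container ports
aws ec2 authorize-security-group-ingress \
  --group-id $CRUD_SERVICE_SG \
  --protocol tcp \
  --port 4567 \
  --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress \
  --group-id $CRUD_SERVICE_SG \
  --protocol tcp \
  --port 3000 \
  --cidr 0.0.0.0/0
```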
After creating the security group, we attached it via ClickOps to the new services we were creating. Once the services were created, we can check the ECS cluster console to confirm they were created correctly. If they were, we should see the following:

Fig 2.2: Active Services
Fig 2.3: Healthy Services
## Provisioning and configuring Application Load Balancer along with target groups

- To provision and configure an Application Load Balancer along with target groups, we first created a new load balancer of the ALB type via ClickOps. We named the load balancer `cruddur-alb`, attached the existing subnets, and created a new security group named `cruddur-alb-sg`. We added inbound rules for `HTTP` and `HTTPS` with my own IP address as the source, so that only I could access the website. After that, we updated the existing service security group to accept traffic from the newly created security group. After a long configuration process, we finally set up the load balancer. The EC2 Load Balancer console looks like the following:
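The ALB can also be inspected from the CLI (a sketch):

```sh
# Check the ALB's state and DNS name
aws elbv2 describe-load-balancers \
  --names cruddur-alb \
  --query "LoadBalancers[0].[State.Code,DNSName]"
```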
## Managing my domain using Route53, creating an SSL certificate via ACM, setting up a record set for the naked domain to point to frontend-react-js, setting up a record set for the api subdomain to point to backend-flask, and configuring CORS to only permit traffic from our domain
- I bought my domain name from Namecheap, which hosts its DNS outside AWS Route53. To manage my domain from AWS, I had to create a hosted zone in Route53. To do that, I used the Route53 console to register the naked domain directly. After creating the hosted zone, I took the nameservers that AWS provided and added them to my Namecheap domain registrar by setting custom nameservers there.
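The nameservers can also be pulled from the CLI instead of the console (a sketch; the `ZONE_ID` placeholder is hypothetical):

```sh
# Fetch the delegation set (nameservers) for the hosted zone
aws route53 get-hosted-zone \
  --id ZONE_ID \
  --query "DelegationSet.NameServers"
```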
- After the propagation finished, we created additional records for the domain. To do that, we first created a certificate in ACM (AWS Certificate Manager). Using the console, we requested a certificate. After validation, the console showed that the certificate was issued:
- After the certificate was issued, we added new records to the domain. We added an `A` record for the naked domain and a record for the `api` subdomain. After creating those records, the console looks like the following:
- After adding those records, we updated the load balancer to use the certificate. To do that, we went to the load balancer console and updated the listeners that were on ports `4567` and `3000`. We added `HTTP` and `HTTPS` listeners and removed the existing ones. The load balancer console looks like the following:
- After updating the listeners, we checked whether the website works by visiting the domain name. I tested the health route on my api subdomain, which worked. The result is the following:
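A quick way to hit the health route from the terminal (a sketch; the exact path `/api/health-check` is an assumption based on the app's routes):

```sh
# Expect an HTTP 200 with a success body if the backend is healthy
curl -i https://api.realmanny.me/api/health-check
```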
- After checking the health route, we updated the CORS settings in our backend task definition, changing the values from `*` to our domain names. The relevant environment section looks like the following:
```json
...
"environment": [
  {"name": "OTEL_SERVICE_NAME", "value": "backend-flask"},
  {"name": "OTEL_EXPORTER_OTLP_ENDPOINT", "value": "https://api.honeycomb.io"},
  {"name": "AWS_COGNITO_USER_POOL_ID", "value": "us-east-1_37QFkM89e"},
  {"name": "AWS_COGNITO_USER_POOL_CLIENT_ID", "value": "6bg85ltcq257hve1mcktj0n2i2"},
  {"name": "FRONTEND_URL", "value": "https://realmanny.me"},
  {"name": "BACKEND_URL", "value": "https://api.realmanny.me"},
  {"name": "AWS_DEFAULT_REGION", "value": "us-east-1"}
],
...
```
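To confirm the CORS restriction took effect, the response headers can be checked against an allowed origin (a sketch; the endpoint path is assumed as above):

```sh
# A request from the allowed origin should echo it back in Access-Control-Allow-Origin
curl -s -D - -o /dev/null -H "Origin: https://realmanny.me" \
  https://api.realmanny.me/api/health-check | grep -i access-control
```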
- After updating the CORS values, we registered the new task definition and updated the service. After updating the service, we updated the frontend build command to use the new backend URL. After building, tagging, and pushing the new image, we tested the website; both the frontend and backend worked. The result is the following:
- To secure Flask, I updated the script that runs the Flask app through the `Dockerfile` so that it does not run in debug mode in production. To do that, we added a new `Dockerfile.prod` and new init scripts, separating development from production. The `init-backend-dev.sh` looks like the following:

```sh
#!/bin/bash
python3 -m flask run --host=0.0.0.0 --port=4567 --debug
```
The script can be found here
and the `init-backend-prod.sh` looks like the following:

```sh
#!/bin/bash
python3 -m flask run --host=0.0.0.0 --port=4567 --no-debug --no-debugger --no-reload
```
The script can be found here
- After creating the new scripts, we updated the `Dockerfile` and the `Dockerfile.prod` to use their respective scripts. The command within `Dockerfile` that starts the Flask app looks like the following:

```dockerfile
...
# Run the command to start Flask
CMD [ "bash", "./init-backend-dev.sh" ]
```
The full Dockerfile can be found here
and the command within `Dockerfile.prod` that starts the Flask app looks like the following:

```dockerfile
...
# Run the command to start Flask
CMD [ "bash", "./init-backend-prod.sh" ]
```
The full Dockerfile can be found here
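Building the production image was presumably along these lines (a sketch; the `backend-flask-prod` tag name is hypothetical):

```sh
# Build the backend image from the production Dockerfile
docker build -f Dockerfile.prod -t backend-flask-prod .
```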
- After updating these files, we built and tested the new image and concluded that it works. But we didn't tag and push the new image to ECR; I will try to do this on my own and debug the app thoroughly.
- To implement the refresh token, we updated our `CheckAuth.js` functions to check whether the token is expired. If it is, the refresh token is used to get a new access token. The updated file looks like the following:
```js
import { Auth } from 'aws-amplify';

export async function getAccessToken(){
  // Return the promise so callers can await the refreshed session
  return Auth.currentSession()
    .then((cognito_user_session) => {
      const access_token = cognito_user_session.accessToken.jwtToken
      localStorage.setItem("access_token", access_token)
    })
    .catch((err) => console.log(err));
}

export async function checkAuth(setUser){
  Auth.currentAuthenticatedUser({
    // Optional, by default is false.
    // If set to true, this call will send a
    // request to Cognito to get the latest user data
    bypassCache: false
  })
  .then((cognito_user) => {
    console.log('cognito_user', cognito_user);
    setUser({
      display_name: cognito_user.attributes.name,
      handle: cognito_user.attributes.preferred_username
    })
    // currentSession() transparently refreshes the tokens
    // with the refresh token when the access token has expired
    return Auth.currentSession()
  })
  .then((cognito_user_session) => {
    console.log('cognito_user_session', cognito_user_session);
    localStorage.setItem("access_token", cognito_user_session.accessToken.jwtToken)
  })
  .catch((err) => console.log(err));
};
```
The file can be found here
- After that, we updated the import statement in all the files that use the `CheckAuth.js` functions. The updated import statement looks like the following:

```js
import { checkAuth, getAccessToken } from '../lib/CheckAuth'
```
- After updating the imports, we updated the files that use them to call the `getAccessToken` function before reading the token. The updated code looks like the following:

```js
...
await getAccessToken()
const access_token = localStorage.getItem("access_token")
...
```
The list of updated files can be found in the commit history here
The console should show the following if the tokens are accepted (working):
- To refactor the bin directory, we moved the `bin` directory to the top level of the project. The moved directory can be found here in this commit history. I added additional scripts to the `bin` directory to make the common workflows easier to run. All the refactorings and code additions can be found in the above commit history.
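A rough sketch of the resulting layout, showing only the scripts mentioned in this post (the real directory contains more):

```sh
tree bin
# bin
# ├── backend
# │   ├── deploy
# │   └── generate-env
# └── frontend
#     ├── deploy
#     └── generate-env
```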
- To configure the task definitions to contain X-Ray and turn on Container Insights, we updated the task definition files for both apps to contain the following container segment:
```json
...
{
  "name": "xray",
  "image": "public.ecr.aws/xray/aws-xray-daemon",
  "essential": true,
  "user": "1337",
  "portMappings": [
    {
      "name": "xray",
      "containerPort": 2000,
      "protocol": "udp"
    }
  ]
},
...
```
The updated task definition files can be found here for the frontend and here for the backend.
After updating the task definition files, we deployed the new task definitions to the ECS cluster using the script called `deploy`, found in the `bin/backend` and `bin/frontend` directories. The `deploy` script for the backend can be found here, and the one for the frontend can be found here.
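The deploy scripts likely wrap commands along these lines (a sketch using standard ECS CLI options, not the scripts' exact contents; the `backend-flask` service name is assumed):

```sh
# Register the latest task definition and force the service to redeploy with it
aws ecs register-task-definition --cli-input-json file://aws/task-definitions/backend-flask.json
aws ecs update-service \
  --cluster cruddur \
  --service backend-flask \
  --task-definition backend-flask \
  --force-new-deployment
```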
- After deploying the new task definitions, we checked the ECS console to confirm they were deployed, which they were. The new deployments can be seen in the following screenshots:
- We then enabled Container Insights for the ECS cluster. After using the app for a while, metrics and logs started showing up in the CloudWatch console. They can be seen in the following screenshot:
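Container Insights can be enabled from the CLI as well as the console (a sketch):

```sh
# Turn on Container Insights for the cruddur cluster
aws ecs update-cluster-settings \
  --cluster cruddur \
  --settings name=containerInsights,value=enabled
```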
- To use Ruby to generate env files for Docker from ERB templates, we created a new directory called `erb` in the top-level directory, containing the files `backend-flask.env.erb` and `frontend-react-js.env.erb`. The files can be found in the directory here. These files provide the templates used to generate the `env` files for both the frontend and backend apps. The scripts that generate the `env` files are called `generate-env` and can be found here for the frontend, and here for the backend. After generating the env files, we updated the `docker-compose.yml` file to use them. A segment of the code looks like the following:
```yaml
...
  backend-flask:
    container_name: "backend_flask"
    env_file:
      - backend-flask.env
...
```
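For reference, rendering an ERB template into an env file can be approximated with a one-liner like this (a sketch of the idea, not the exact contents of the `generate-env` scripts):

```sh
# Render the ERB template (which reads ENV vars) into a plain env file
ruby -rerb -e 'puts ERB.new(File.read("erb/backend-flask.env.erb")).result' > backend-flask.env
```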
- To change Docker Compose to explicitly use a user-defined network, we updated the `docker-compose.yml` file to define the following network segment:

```yaml
...
networks:
  cruddur-net:
    driver: bridge
    name: cruddur-net
...
```
- After making this change, we updated each of the services to use the `cruddur-net` network. The updated `docker-compose.yml` file can be found here.
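Once the stack is up, the attachment can be verified from the CLI (a sketch):

```sh
# List the containers attached to the user-defined network
docker network inspect cruddur-net --format '{{range .Containers}}{{.Name}} {{end}}'
```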