Terraform stack to deploy Jenkins on ECS Fargate with Jenkins configuration stored in EFS and agents on Fargate. It can be used as a starting point to build a production-ready Jenkins on AWS.
More details can be found in my blog post (in French).
It uses a Docker image based on the official Jenkins image. See the docker/ folder.
The following main resources will be created:
- An application load balancer in front of Jenkins.
- A network load balancer for Agent -> Controller communication. For more information about how Controller <-> Agents communication works, see this page.
- An EFS to store Jenkins configuration.
- An S3 bucket used by the Jenkins Configuration as code plugin to get the configuration generated by Terraform.
- Two log groups for the Controller and agents logs.

Prerequisites:

- A VPC with public and private subnets configured properly (route tables, NAT gateways, etc.).
- An IAM user with the proper policies to run Terraform on the following services: EC2, ECS, IAM, S3, CloudWatch, EFS, Route53 and ACM.
- A recent version of Terraform (>= 1).
The only required Terraform variables are:
- vpc_id: the VPC ID
- public_subnets: public subnet IDs
- private_subnets: private subnet IDs
See variables.tf for all the possible variables to override.
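Optional inputs can be overridden with the same TF_VAR_ environment variable convention used for the required variables. The sketch below is illustrative only: the variable names come from the inputs table of this README, but the values are made up, and the CIDR notation for allowed_ip_addresses is an assumption about the expected format.

```shell
# Hypothetical overrides: adjust the values to your own environment.
export TF_VAR_aws_region="eu-west-3"
export TF_VAR_controller_log_retention_days=30

# Assumption: entries are CIDR blocks. 203.0.113.0/24 is a documentation-only range.
export TF_VAR_allowed_ip_addresses='["203.0.113.0/24"]'
```

Terraform reads these variables at the next terraform plan or terraform apply run in the same shell session.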
AWS authentication:
export AWS_PROFILE=...
# or
export AWS_SECRET_ACCESS_KEY=""
export AWS_ACCESS_KEY_ID=""
Deployment:
export TF_VAR_vpc_id="vpc-123456789"
export TF_VAR_private_subnets='["private-subnet-a", "private-subnet-b", "private-subnet-c"]'
export TF_VAR_public_subnets='["public-subnet-a", "public-subnet-b", "public-subnet-c"]'
terraform init
terraform apply
Get the default admin credentials and connect to the controller URL given by the jenkins_public_url output:
terraform output jenkins_credentials
The first time you access the controller, the Getting started guide will ask you to install the recommended plugins. Install them and restart the controller.
To speed up the startup of the controller and the agents, you can use the SOCI feature (Seekable OCI image config) by setting the input variable soci.enabled to true (see below for more details about the input variables).
This requires a recent version of the Fargate platform (>= 1.4.0). When soci.enabled is true, the index builder builds the SOCI indexes locally and pushes them to ECR. This can take a while (around 5 minutes) and requires Docker to be installed on your machine and able to run in privileged mode.
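As a minimal sketch, SOCI could be enabled for a shell session as shown below. Only the enabled attribute is taken from this README; any other attribute of the soci object is left at its default. The Docker check is an added convenience, not part of this stack, and it only verifies that a Docker daemon is reachable, not that privileged mode is allowed.

```shell
# Enable SOCI through the TF_VAR_ convention (HCL object syntax).
export TF_VAR_soci='{ enabled = true }'

# The SOCI indexes are built locally, so check for a reachable Docker daemon first.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  echo "Docker is reachable: the SOCI indexes can be built locally."
else
  echo "Docker is not reachable: building the SOCI indexes may fail." >&2
fi
```

Run terraform apply afterwards in the same shell session so that the variable is picked up.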
To compare the startup time of the tasks, a local module modules/ecs-events-capture is used to capture the relevant ECS Task events in a CloudWatch Log Group. After some runs, you can run the Python script (check the README of the module).
The tables below give the measured start times for the controller and the agent, using the following images built from here:
- Controller version 2.433: 1.12 GB
- Agent version 3192.v713e3b_039fb_e-4-alpine-jdk17: 0.315 GB

Notes:
- The times are in seconds and represent the difference between the creation time and the start time of the task.
- Number of runs: 16.
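
The columns of the tables below are the minimum, maximum, mean and median of the per-task start durations. As a rough sketch only (this is not the Python script shipped with the module, summarize is a hypothetical helper name, and the four durations are made up for illustration), the same statistics could be computed from a list of durations, one per line:

```shell
# Read one duration (in seconds) per line and print min, max, mean and median.
summarize() {
  sort -n | awk '
    { v[NR] = $1; sum += $1 }
    END {
      if (NR == 0) exit 1
      # Middle value for an odd count, average of the two middle values otherwise.
      median = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
      printf "min=%s max=%s mean=%.4f median=%.4f\n", v[1], v[NR], sum / NR, median
    }'
}

# Made-up sample durations, not measurements from this project.
printf '%s\n' 41.6 51.2 49.9 56.3 | summarize
# prints: min=41.6 max=56.3 mean=49.7500 median=50.5500
```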
The controller:
Controller | min_start_time | max_start_time | mean_start_time | median_start_time |
---|---|---|---|---|
Without SOCI | 41.633 | 56.347 | 49.5134 | 51.6065 |
With SOCI | 24.048 | 36.474 | 31.4392 | 32.9275 |
Diff | -42.23% | -35.27% | -36.5% | -36.2% |
The agent:
Agent | min_start_time | max_start_time | mean_start_time | median_start_time |
---|---|---|---|---|
Without SOCI | 16.835 | 22.204 | 18.7357 | 18.6275 |
With SOCI | 12.759 | 21.08 | 15.1008 | 14.3945 |
Diff | -24.21% | -5.06% | -19.4% | -22.72% |
Note that SOCI only works with private ECR repositories at the moment.
In a nutshell, on average, the start times of the controller and the agent are reduced by 36.5% and 19.4% respectively.
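The percentages in the Diff rows follow from the table values as (with SOCI - without SOCI) / without SOCI. A minimal sketch using awk to recompute the two mean values, where pct_change is a hypothetical helper name and not part of this repository:

```shell
# Relative change in percent from a baseline value ($1) to a new value ($2).
pct_change() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

pct_change 49.5134 31.4392   # controller mean_start_time, prints -36.5
pct_change 18.7357 15.1008   # agent mean_start_time, prints -19.4
```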
For more information about SOCI, see the following links:
- Under the hood: Lazy Loading Container Images with Seekable OCI and AWS Fargate
- https://docs.aws.amazon.com/AmazonECS/latest/userguide/container-considerations.html
- AWS Fargate Enables Faster Container Startup using Seekable OCI

Requirements:

Name | Version |
---|---|
terraform | >= 1 |
aws | ~> 5 |
random | >= 3 |

Providers:

Name | Version |
---|---|
aws | ~> 5 |
random | >= 3 |
terraform | n/a |

Modules:

Name | Source | Version |
---|---|---|
ecs_events | ./modules/ecs-events-capture | n/a |

Inputs:

Name | Description | Type | Default | Required |
---|---|---|---|---|
private_subnets | Private subnets to deploy the Jenkins controller | set(string) | n/a | yes |
public_subnets | Public subnets to deploy the load balancer | set(string) | n/a | yes |
vpc_id | The VPC ID | string | n/a | yes |
agent_docker_image | Docker image to use for the example agent. See: https://hub.docker.com/r/jenkins/inbound-agent/ | string | "elmhaidara/jenkins-alpine-agent-aws:3192.v713e3b_039fb_e-4-alpine-jdk17" | no |
agents_cpu_memory | CPU and memory for the agent example. Note that not all combinations are supported with Fargate. | object({ | { | no |
agents_log_retention_days | Retention days for Agents log group | number | 5 | no |
allowed_ip_addresses | List of allowed IP addresses to access the controller from the ALB | set(string) | [ | no |
aws_region | The AWS region in which to deploy the resources | string | "eu-west-1" | no |
capture_ecs_events | Whether to capture ECS events in CloudWatch Logs | bool | true | no |
controller_cpu_memory | CPU and memory for Jenkins controller. Note that not all combinations are supported with Fargate. | object({ | { | no |
controller_deployment_percentages | The Min and Max percentages of Controller instance to keep when updating the service. See https://docs.aws.amazon.com/AmazonECS/latest/developerguide/update-service.html. These default values cause ECS to stop the controller before starting a new one. This is to avoid having 2 controllers running at the same time. | object({ | { | no |
controller_docker_image | Jenkins Controller docker image to use | string | "elmhaidara/jenkins-aws-fargate:2.433" | no |
controller_docker_user_uid_gid | Jenkins User/Group ID inside the container. One should consider using access point. | number | 0 | no |
controller_java_opts | JENKINS_OPTS to pass to the controller | string | "" | no |
controller_jnlp_port | JNLP port used by Jenkins agent to communicate with the controller | number | 50000 | no |
controller_listening_port | Jenkins container listening port | number | 8080 | no |
controller_log_retention_days | Retention days for Controller log group | number | 14 | no |
controller_num_executors | Set this to a number > 0 to be able to build on controller (NOT RECOMMENDED) | number | 0 | no |
default_tags | Default tags to apply to the resources | map(string) | { | no |
efs_burst_credit_balance_threshold | Threshold below which the metric BurstCreditBalance associated alarm will be triggered. Expressed in bytes | number | 1154487209164 | no |
efs_performance_mode | EFS performance mode. Valid values: generalPurpose or maxIO | string | "generalPurpose" | no |
efs_provisioned_throughput_in_mibps | The throughput, measured in MiB/s, that you want to provision for the file system. Only applicable with throughput_mode set to provisioned. | number | null | no |
efs_throughput_mode | Throughput mode for the file system. Valid values: bursting, provisioned. When using provisioned, also set provisioned_throughput_in_mibps | string | "bursting" | no |
fargate_platform_version | Fargate platform version to use. Must be >= 1.4.0 to be able to use Fargate | string | "1.4.0" | no |
route53_subdomain | The subdomain to use for Jenkins Controller. Used when var.route53_zone_name is not empty | string | "jenkins" | no |
route53_zone_name | A Route53 zone name to use to create a DNS record for the Jenkins Controller. Required for HTTPS. | string | "" | no |
soci | Seekable OCI image config. See https://aws.amazon.com/fr/blogs/aws/aws-fargate-enables-faster-container-startup-using-seekable-oci/. If enabled, Terraform will create two ECR repositories (one for the controller and one for the agent), push the images to ECR (from the default images in Dockerhub), build the SOCI indexes and push them to ECR as well. As such, you need to have Docker installed on your machine and be able to run it in privileged mode. You can optionally build the images and their indexes yourself, push them to ECR and update the variables controller_docker_image and agent_docker_image (set enabled to false in this case). See https://github.com/aws-samples/aws-fargate-seekable-oci-toolbox/blob/main/containerized-index-builder/README.md. This variable is just a convenient way to do it from Terraform. Prefer using the lambda function to build the index: https://github.com/aws-ia/cfn-ecr-aws-soci-index-builder. | object({ | {} | no |
target_groups_deregistration_delay | Amount of time for ALB to wait before changing the state of a deregistering target from draining to unused. It has a direct impact on the time it takes to run the controller. | number | 10 | no |

Outputs:

Name | Description |
---|---|
agents_log_group | Jenkins agents log group |
controller_config_on_s3 | Jenkins controller configuration file on S3 |
controller_log_group | Jenkins controller log group |
ecr_images | ECR images when SOCI is enabled |
ecs_events_log_group_name | ECS events log group |
jenkins_credentials | Credentials to access Jenkins via the public URL |
jenkins_public_url | Public URL to access Jenkins |