Create a zip archive of several S3 files and upload it to an S3 bucket.
- Retrieve payloads from an SQS Queue.
- Create a Zip archive from a list of files.
- Optionally create a directory structure inside the zip.
- Upload the zip file to an S3 bucket.
- Notify a consumer once the archive is ready to be downloaded.
The expected payload looks like the following. Note: the `.files[*].destination` field is optional.
{
    "files": [
        {
            "source": "example-directory/uuid1.png",
            "destination": "album_1"
        },
        {
            "source": "s3://foobar/example-directory/uuid2.png",
            "destination": "album_1"
        },
        {
            "source": "example-directory/uuid3.png",
            "destination": "album_2"
        }
    ],
    "metadata": {
        "key": "value"
    }
}
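As an illustration, such a payload could map onto dataclasses along these lines (hypothetical names, not necessarily the project's actual classes):

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveFile:
    source: str                     # S3 key, or a full s3:// URL overriding the default bucket
    destination: str | None = None  # optional directory inside the zip


@dataclass
class ArchivePayload:
    files: list[ArchiveFile]
    metadata: dict[str, str] = field(default_factory=dict)

    @classmethod
    def from_message(cls, body: dict) -> "ArchivePayload":
        return cls(
            files=[ArchiveFile(**f) for f in body["files"]],
            metadata=body.get("metadata", {}),
        )
```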
Below is a table of the required and optional environment variables:
| Variable | Description | Default value |
|---|---|---|
| `ARCHIVE_BASE_URL` | Base path for the archive URL links. | None |
| `SQS_DESTINATION_QUEUE` | SQS queue used to send messages to the downstream consumer. | None |
| `SQS_SOURCE_QUEUE` | SQS queue to retrieve payloads from. | None |
| `S3_DESTINATION_BUCKET` | Target S3 bucket in which the archive files are stored. | None |
| `S3_SOURCE_BUCKET` | Default source S3 bucket containing the files used to create the archives. | None |
| `S3_FILE_PREFIX` | Prefix added to each created archive. | `download` |
| `APP_DEBUG` | Enable the debug log level. | `0` |
| `DEV_MODE` | Use console rendering for logs instead of JSON rendering. | `0` |
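For illustration, a minimal sketch of how these variables might be consumed at startup (the actual configuration code may differ):

```python
import os

# Required variables: fail fast at startup if one is missing.
ARCHIVE_BASE_URL = os.environ["ARCHIVE_BASE_URL"]
SQS_DESTINATION_QUEUE = os.environ["SQS_DESTINATION_QUEUE"]
SQS_SOURCE_QUEUE = os.environ["SQS_SOURCE_QUEUE"]
S3_DESTINATION_BUCKET = os.environ["S3_DESTINATION_BUCKET"]
S3_SOURCE_BUCKET = os.environ["S3_SOURCE_BUCKET"]

# Optional variables, with the defaults from the table above.
S3_FILE_PREFIX = os.getenv("S3_FILE_PREFIX", "download")
APP_DEBUG = os.getenv("APP_DEBUG", "0") == "1"
DEV_MODE = os.getenv("DEV_MODE", "0") == "1"
```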
This project is intended to be deployed as an ECS Service inside AWS. It requires the following permissions:
- Interact with an SQS Queue.
- Read from the source S3 bucket (S3 objects).
- Write to the destination S3 bucket (zip archives).
It should be able to scale based on the number of messages in the SQS queue.
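As an illustration, an IAM policy for the task role could resemble the following sketch (bucket and queue names are placeholders, not the actual ARNs, and the action list may need tightening):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "sqs:SendMessage",
        "sqs:GetQueueAttributes"
      ],
      "Resource": [
        "arn:aws:sqs:*:*:source-queue",
        "arn:aws:sqs:*:*:destination-queue"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::source-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::destination-bucket/*"
    }
  ]
}
```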
LocalStack and archiver services are available through the docker-compose manifest at the top-level directory of this project.
It is currently set up to:
- configure 2 S3 buckets (`source-images`, `zip-storage`);
- configure 2 SQS queues (`input-queue`, with `input-queue-dlq` set up as its DLQ for retries).
To ease configuration for local development, this project relies on direnv, which automatically loads `.envrc` files and injects the found variables into the current environment context.
To start the containers, run the following commands:
$ cp .envrc.dist .envrc
$ docker compose up -d
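The authoritative list of variables lives in `.envrc.dist`; as a sketch, the resulting `.envrc` could export values along these lines (all values are illustrative LocalStack fake credentials):

```shell
export AWS_ACCESS_KEY_ID=test      # fake credential accepted by LocalStack
export AWS_SECRET_ACCESS_KEY=test  # fake credential accepted by LocalStack
export AWS_DEFAULT_REGION=us-east-1
export S3_SOURCE_BUCKET=source-images
export S3_DESTINATION_BUCKET=zip-storage
export SQS_SOURCE_QUEUE=http://localhost:4566/000000000000/input-queue
```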
You can now rely on the http://localhost:4566/ endpoint with the fake credentials to send messages to the LocalStack queues.
Every time you run aws commands, ensure that `--endpoint-url` is set for any LocalStack call:
aws s3 cp $path_to_your_file s3://source-images \
--endpoint-url http://localhost:4566
aws s3 ls --recursive s3://source-images \
    --endpoint-url http://localhost:4566
aws sqs send-message --queue-url http://localhost:4566/000000000000/input-queue \
--endpoint-url http://localhost:4566 --message-body \
'{ "files": [ { "source": "source.jpg", "destination": "album_1" } ], "metadata": {"user_email": "theo.massard@meero.com"}}'
Below is a high-level flow chart of the archiver processing.
flowchart TD
S[Start program] --> I[Init AWS clients]
I --> SQS[Poll SQS Queue]
SQS --> R[Retrieve Payload]
subgraph "Process SQS payload"
R --> V[Validate Payload content]
V --> C[Create in memory archive file]
C -->|for each file in payload| F[Fetch S3 file]
F --> W[Write to in-memory archive file]
W -->|for each file in payload| F
W -->|once every file is processed| U[Upload archive to S3 bucket]
U --> G[Generate archive link for cloudfront distribution]
G --> N[Notify downstream component]
N --> A[Acknowledge SQS message]
end
The program has the following execution flow:
- Initialize the archiver worker:
- Instantiate the SQS and S3 boto3 clients.
- Start polling messages from the SQS Queue.
- Process each SQS Message (polling):
- Retrieve the message from SQS.
- Transform the payload into an easy-to-manipulate dataclass.
- Create an in-memory file-like object used to create the zip file.
- For each file in the payload, download the S3 object and insert it into the zip file.
- Upload the zip file to the destination S3 bucket.
- Notify downstream consumers by sending an SQS message.
- Acknowledge the SQS message processing.
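Condensed into code, that flow could look roughly like the sketch below (client setup, helper names, and the archive key are illustrative; the real implementation validates the payload and generates the CloudFront link before notifying):

```python
import io
import json
import zipfile

import boto3

sqs = boto3.client("sqs")
s3 = boto3.client("s3")


def poll(source_queue_url: str, source_bucket: str, destination_bucket: str) -> None:
    """Long-poll the source queue and process each payload."""
    while True:
        response = sqs.receive_message(
            QueueUrl=source_queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=20
        )
        for message in response.get("Messages", []):
            # Validated into a dataclass in the real flow.
            payload = json.loads(message["Body"])

            # Create an in-memory file-like object and write each S3 object into it.
            buffer = io.BytesIO()
            with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
                for entry in payload["files"]:
                    obj = s3.get_object(Bucket=source_bucket, Key=entry["source"])
                    arcname = "/".join(
                        p for p in (entry.get("destination"), entry["source"]) if p
                    )
                    archive.writestr(arcname, obj["Body"].read())

            # Upload the finished archive, then notify and acknowledge.
            buffer.seek(0)
            s3.upload_fileobj(buffer, destination_bucket, "download/archive.zip")
            # ... send the notification message to the destination queue here ...
            sqs.delete_message(
                QueueUrl=source_queue_url, ReceiptHandle=message["ReceiptHandle"]
            )
```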
Upon adding new commits to the main branch, the application will get packaged as a container image.
The image can also be built locally:
$ docker build -t archiver:local .