S3-backed serverless PyPI.
Requests to your PyPI server will be proxied through a Lambda function that pulls content from an S3 bucket and responds with the same HTML content that you might find in a conventional PyPI server.
Requests to the base path (eg, /simple/) will respond with the contents of an index.html file at the root of your S3 bucket.
Requests to a package index (eg, /simple/fizz/) will dynamically generate an HTML file based on the contents of keys under that namespace (eg, s3://your-bucket/fizz/). URLs for package downloads are presigned S3 URLs with a default lifespan of 15 minutes.
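These links are generated for you by the API Lambda; conceptually they are what boto3's generate_presigned_url returns. A minimal sketch, assuming a placeholder bucket and key:

import boto3

s3 = boto3.client("s3")

# Presign a GET for a single archive; ExpiresIn=900 matches the 15-minute default
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "your-bucket", "Key": "fizz/fizz-1.2.3.tar.gz"},
    ExpiresIn=900,
)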
Package uploads/removals on S3 will trigger a Lambda function that reindexes the bucket and generates a new index.html at the root. This keeps queries to the base path fast even when your bucket contains a large number of packages.
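Conceptually, the reindex step lists the top-level prefixes in the bucket and rewrites index.html. A rough sketch of the idea (not the module's actual source; the bucket name and HTML layout are placeholders):

import boto3

s3 = boto3.client("s3")
BUCKET = "your-bucket"

# Each top-level "directory" becomes one link on the root index page
# (pagination omitted for brevity)
prefixes = s3.list_objects_v2(Bucket=BUCKET, Delimiter="/").get("CommonPrefixes", [])
links = "\n".join(f'<a href="{p["Prefix"]}">{p["Prefix"].strip("/")}</a>' for p in prefixes)

# Cache the result at the bucket root so base-path requests stay fast
s3.put_object(
    Bucket=BUCKET,
    Key="index.html",
    Body=f"<!DOCTYPE html><html><body>{links}</body></html>".encode(),
    ContentType="text/html",
)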
As of v7, users are expected to bring their own API Gateway REST API (v1). This gives you greater flexibility in choosing how your API is set up.
The simplest setup is as follows:
#######################
# SERVERLESS PYPI #
#######################
module "serverless_pypi" {
source = "amancevice/serverless-pypi/aws"
version = "~> 7"
api_execution_arn = aws_api_gateway_rest_api.pypi.execution_arn
api_id = aws_api_gateway_rest_api.pypi.id
api_root_resource_id = aws_api_gateway_rest_api.pypi.root_resource_id
event_rule_name = "serverless-pypi-reindex"
iam_role_name = "serverless-pypi"
lambda_api_fallback_index_url = "https://pypi.org/simple/"
lambda_api_function_name = "serverless-pypi-api"
lambda_reindex_function_name = "serverless-pypi-reindex"
s3_bucket_name = "serverless-pypi-us-west-2"
# etc …
}
################
# REST API #
################
resource "aws_api_gateway_rest_api" "pypi" {
description = "Serverless PyPI example"
name = "serverless-pypi"
endpoint_configuration { types = ["REGIONAL"] }
}
resource "aws_api_gateway_deployment" "pypi" {
rest_api_id = aws_api_gateway_rest_api.pypi.id
triggers = { redeployment = module.serverless_pypi.api_deployment_trigger }
lifecycle { create_before_destroy = true }
}
resource "aws_api_gateway_stage" "simple" {
deployment_id = aws_api_gateway_deployment.pypi.id
rest_api_id = aws_api_gateway_rest_api.pypi.id
stage_name = "simple"
}
This tool is highly opinionated about how your S3 bucket is organized. Your root key space should only contain the auto-generated index.html and "directories" for your PyPI packages.
Packages should exist one level deep in the bucket, where the prefix is the name of the project.
Example:
s3://your-bucket/
├── index.html
├── my-cool-package/
│   ├── my-cool-package-0.1.2.tar.gz
│   ├── my-cool-package-1.2.3.tar.gz
│   └── my-cool-package-2.3.4.tar.gz
└── my-other-package/
    ├── my-other-package-0.1.2.tar.gz
    ├── my-other-package-1.2.3.tar.gz
    └── my-other-package-2.3.4.tar.gz
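Publishing a release is then just an S3 upload under the project's prefix, for example with boto3 (bucket and file names are placeholders):

import boto3

s3 = boto3.client("s3")

# The key must be <project>/<archive> so the reindex Lambda picks it up
s3.upload_file(
    "dist/my-cool-package-3.4.5.tar.gz",
    "your-bucket",
    "my-cool-package/my-cool-package-3.4.5.tar.gz",
)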
You can configure your PyPI index to fall back to a different PyPI in the event that a package is not found in your bucket.
Without configuring a fallback index URL, the following pip install command will surely fail (assuming you don't have boto3 and all its dependencies in your S3 bucket):
pip install boto3 --index-url https://my.private.pypi/simple/
Instead, if you configure a fallback index URL in the Terraform module, requests for a package that isn't found in the bucket will be re-routed to the fallback.
module "serverless_pypi" {
source = "amancevice/serverless-pypi/aws"
version = "~> 7"
lambda_api_fallback_index_url = "https://pypi.org/simple/"
# etc …
}
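One way to picture the fallback behavior (a sketch of the idea, not the module's implementation; names are placeholders):

import boto3

s3 = boto3.client("s3")
BUCKET = "your-bucket"
FALLBACK = "https://pypi.org/simple/"  # mirrors lambda_api_fallback_index_url

def package_index(name: str) -> dict:
    # Nothing under the project's prefix? Redirect pip to the fallback index
    if not s3.list_objects_v2(Bucket=BUCKET, Prefix=f"{name}/").get("Contents"):
        return {"statusCode": 302, "headers": {"Location": f"{FALLBACK}{name}/"}}
    # Otherwise the package index is rendered from keys under the prefix
    # (rendering omitted for brevity)
    return {"statusCode": 200, "body": ""}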
Please note that this tool provides NO authentication layer for your PyPI index out of the box. This is difficult to implement because pip is currently not very forgiving with any auth pattern outside Basic Auth.
Using a REST API configured for a private VPC is the easiest solution to this problem, but you could also write a custom authorizer for your API.
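For the custom-authorizer route, a bare-bones REQUEST-type Lambda authorizer that checks Basic Auth could look like the sketch below; PYPI_PASSWORD is a hypothetical environment variable, and a real implementation should use a secrets store and a constant-time comparison. pip would then authenticate with an index URL of the form https://user:password@my.private.pypi/simple/.

import base64
import os

def handler(event, context):
    user, allow = "anonymous", False
    auth = event.get("headers", {}).get("Authorization", "")
    if auth.startswith("Basic "):
        try:
            user, _, password = base64.b64decode(auth[6:]).decode().partition(":")
            allow = password == os.environ["PYPI_PASSWORD"]  # hypothetical secret
        except Exception:
            pass
    # API Gateway expects an IAM policy allowing or denying the invocation
    return {
        "principalId": user,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": "Allow" if allow else "Deny",
                "Resource": event["methodArn"],
            }],
        },
    }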