Scale-to-zero NAT instances for AWS. Stop paying for NAT when nothing is running.
nat-zero is a Terraform module that replaces always-on NAT with on-demand NAT instances. When a workload launches in a private subnet, a NAT instance starts automatically. When the last workload stops, the NAT shuts down and its Elastic IP is released. Idle cost: ~$0.80/month per AZ.
Built on fck-nat AMIs. Orchestrated by a single Go Lambda (~55 ms cold start, 29 MB memory). Integration-tested against real AWS infrastructure on every PR.

           AZ-A (active)                  AZ-B (idle)
    ┌──────────────────┐          ┌──────────────────┐
    │ Workloads        │          │ No workloads     │
    │   ↓ route table  │          │ No NAT instance  │
    │ Private ENI      │          │ No EIP           │
    │   ↓              │          │                  │
    │ NAT Instance     │          │ Cost: ~$0.80/mo  │
    │   ↓              │          │ (EBS only)       │
    │ Public ENI + EIP │          │                  │
    │   ↓              │          └──────────────────┘
    │ Internet Gateway │
    └──────────────────┘
             ▲
    EventBridge → Lambda (reconciler, concurrency=1)

| State | nat-zero | fck-nat | NAT Gateway |
|---|---|---|---|
| Idle (no workloads) | ~$0.80/mo | ~$7-8/mo | ~$36+/mo |
| Active (workloads running) | ~$7-8/mo | ~$7-8/mo | ~$36+/mo |
AWS NAT Gateway costs ~$36/month per AZ even when idle. fck-nat brings that to ~$7-8/month, but the instance and EIP run 24/7. nat-zero stops the instance and releases the Elastic IP when idle, so you avoid both the instance charge and the $3.60/month public IPv4 charge. Only the EBS volume (~$0.80/month) remains.
Best for dev/staging environments, CI/CD runners, batch jobs, and side projects where workloads run intermittently.
An EventBridge rule captures EC2 instance state changes. A Lambda function (concurrency=1, single writer) runs a reconciliation loop on each event:
- Observe — query workloads, NAT instances, and EIPs in the AZ
- Decide — compare actual state to desired state
- Act — take at most one mutating action, then return
The event is just a trigger — the reconciler always computes the correct action from current state. With reserved_concurrent_executions=1, events are processed sequentially, eliminating race conditions.
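To make the trigger wiring concrete, here is a minimal Terraform sketch of an EventBridge rule feeding a single-concurrency Lambda. It is not the module's actual source: the resource names, the lambda_role_arn variable, the lambda.zip path, and the provided.al2023 runtime are assumptions chosen for illustration.

```hcl
# Hypothetical sketch, not nat-zero's internals. All names are placeholders.
variable "lambda_role_arn" {
  type        = string
  description = "Execution role for the reconciler (assumed to exist)"
}

# Match every EC2 instance state change in the account.
resource "aws_cloudwatch_event_rule" "ec2_state" {
  name = "example-ec2-state-change"
  event_pattern = jsonencode({
    source        = ["aws.ec2"]
    "detail-type" = ["EC2 Instance State-change Notification"]
  })
}

resource "aws_lambda_function" "reconciler" {
  function_name = "example-nat-reconciler"
  role          = var.lambda_role_arn
  filename      = "lambda.zip" # placeholder artifact
  handler       = "bootstrap"
  runtime       = "provided.al2023" # assumed runtime for a Go binary
  architectures = ["arm64"]
  memory_size   = 128

  # Single writer: at most one reconciliation runs at a time.
  reserved_concurrent_executions = 1
}

resource "aws_cloudwatch_event_target" "reconciler" {
  rule = aws_cloudwatch_event_rule.ec2_state.name
  arn  = aws_lambda_function.reconciler.arn
}

# Allow EventBridge to invoke the function.
resource "aws_lambda_permission" "events" {
  statement_id  = "AllowEventBridgeInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.reconciler.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.ec2_state.arn
}
```

EventBridge invokes Lambda asynchronously, so when the single execution slot is busy, additional events are throttled and retried by Lambda rather than run in parallel.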
| Workloads? | NAT State | Action |
|---|---|---|
| Yes | None / terminated | Create NAT |
| Yes | Stopped | Start NAT |
| Yes | Stopping | Wait |
| Yes | Running, no EIP | Attach EIP |
| No | Running / pending | Stop NAT |
| No | Stopped, has EIP | Release EIP |
| — | Multiple NATs | Terminate duplicates |
Each NAT uses two persistent ENIs (public + private) created by Terraform. They survive stop/start cycles, keeping route tables intact.
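The persistent-ENI idea can be sketched in Terraform for a single AZ as follows. This is an illustration under assumptions, not the module's code: the variable and resource names are hypothetical, and pointing the default route at the private ENI is inferred from the diagram above.

```hcl
# Hypothetical single-AZ sketch. Variable names are illustrative only.
variable "public_subnet_id" { type = string }
variable "private_subnet_id" { type = string }
variable "private_route_table_id" { type = string }
variable "nat_security_group_id" { type = string }

resource "aws_network_interface" "nat_public" {
  subnet_id         = var.public_subnet_id
  security_groups   = [var.nat_security_group_id]
  source_dest_check = false # a NAT forwards traffic not addressed to itself
}

resource "aws_network_interface" "nat_private" {
  subnet_id         = var.private_subnet_id
  security_groups   = [var.nat_security_group_id]
  source_dest_check = false
}

# The default route targets the ENI rather than an instance ID, so it does
# not need to be rewritten when the NAT instance stops, starts, or is replaced.
resource "aws_route" "private_default" {
  route_table_id         = var.private_route_table_id
  destination_cidr_block = "0.0.0.0/0"
  network_interface_id   = aws_network_interface.nat_private.id
}
```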
See Architecture for the full reconciliation model and event flow diagrams.

    module "nat_zero" {
      source = "github.com/MachineDotDev/nat-zero"

      name                        = "my-nat"
      vpc_id                      = module.vpc.vpc_id
      availability_zones          = ["us-east-1a", "us-east-1b"]
      public_subnets              = module.vpc.public_subnets
      private_subnets             = module.vpc.private_subnets
      private_route_table_ids     = module.vpc.private_route_table_ids
      private_subnets_cidr_blocks = module.vpc.private_subnets_cidr_blocks
      tags                        = { Environment = "dev" }
    }

See Examples for spot instances, custom AMIs, and building from source.
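As a rough idea of what the spot variant might look like, the sketch below reuses the quick-start call and sets the documented market_type and instance_type inputs. The module label nat_zero_spot and the name value are made up for illustration. The repository's own examples remain the authoritative reference.

```hcl
# Sketch only: same inputs as the quick start, plus spot pricing.
module "nat_zero_spot" {
  source = "github.com/MachineDotDev/nat-zero"

  name                        = "my-nat-spot"
  vpc_id                      = module.vpc.vpc_id
  availability_zones          = ["us-east-1a", "us-east-1b"]
  public_subnets              = module.vpc.public_subnets
  private_subnets             = module.vpc.private_subnets
  private_route_table_ids     = module.vpc.private_route_table_ids
  private_subnets_cidr_blocks = module.vpc.private_subnets_cidr_blocks

  market_type   = "spot"     # default is "on-demand"
  instance_type = "t4g.nano" # the module default, shown for clarity

  tags = { Environment = "dev" }
}
```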
| Scenario | Time to connectivity |
|---|---|
| First workload (cold create) | ~10.7 s |
| Restart from stopped | ~8.5 s |
| NAT already running | Instant |
The Lambda is a compiled Go ARM64 binary. Cold start: 55 ms. Typical invocation: 400-600 ms. Peak memory: 29 MB. The startup delay is dominated by EC2 instance boot, not the Lambda.
See Performance for detailed timings and cost breakdowns.
- EventBridge scope: Captures all EC2 state changes in the account; Lambda filters by VPC ID.
- Startup delay: First workload in an idle AZ waits ~10 seconds for internet. Design scripts to retry outbound connections.
- Dual ENI: Persistent public + private ENIs survive stop/start cycles.
- DLQ: Failed Lambda invocations go to an SQS dead letter queue.
- Clean destroy: A cleanup action terminates NAT instances before terraform destroy removes ENIs.
- Config versioning: Changing AMI or instance type auto-replaces NAT instances on next workload event.
- EC2 events only: Currently nat-zero responds only to EC2 instance state changes. If you have a use case for other event sources (ECS tasks, Lambda, etc.), PRs are welcome.
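For the startup delay noted above, one way to make a workload tolerate the first seconds without internet is a retry loop in its user data. The sketch below is an assumption-heavy illustration: the worker resource, both variables, the instance type, and the probe URL are hypothetical, and it presumes curl is present on the workload AMI.

```hcl
# Hypothetical workload instance. Names and values are placeholders.
variable "worker_ami_id" { type = string } # must be an arm64 AMI for t4g
variable "private_subnet_id" { type = string }

resource "aws_instance" "worker" {
  ami           = var.worker_ami_id
  instance_type = "t4g.micro"
  subnet_id     = var.private_subnet_id

  user_data = <<-EOT
    #!/bin/bash
    # Probe outbound connectivity up to 30 times (roughly two minutes at
    # most) before starting real work, since the NAT may still be booting.
    for attempt in $(seq 1 30); do
      curl --silent --fail --max-time 2 https://checkip.amazonaws.com && break
      sleep 2
    done
    # ... the actual job would start here ...
  EOT
}
```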
| Name | Version |
|---|---|
| terraform | >= 1.3 |
| aws | >= 5.0 |
| null | >= 3.0 |
| time | >= 0.9 |
| Name | Version |
|---|---|
| aws | >= 5.0 |
| null | >= 3.0 |
| time | >= 0.9 |
No modules.
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| ami_id | Explicit AMI ID to use (overrides AMI lookup entirely) | string | null | no |
| availability_zones | List of availability zones to deploy NAT instances in | list(string) | n/a | yes |
| block_device_size | Size in GB of the root EBS volume | number | 10 | no |
| build_lambda_locally | Build the Lambda binary from Go source instead of downloading a pre-compiled release. Requires Go and zip installed locally. | bool | false | no |
| custom_ami_name_pattern | AMI name pattern when use_fck_nat_ami is false | string | null | no |
| custom_ami_owner | AMI owner account ID when use_fck_nat_ami is false | string | null | no |
| enable_logging | Create a CloudWatch log group for the Lambda function | bool | true | no |
| ignore_tag_key | Tag key used to mark instances the Lambda should ignore | string | "nat-zero:ignore" | no |
| ignore_tag_value | Tag value used to mark instances the Lambda should ignore | string | "true" | no |
| instance_type | Instance type for the NAT instance | string | "t4g.nano" | no |
| lambda_binary_url | URL to the pre-compiled Go Lambda zip. Updated automatically by CI. | string | "https://github.com/MachineDotDev/nat-zero/releases/download/nat-zero-lambda-latest/lambda.zip" | no |
| lambda_memory_size | Memory allocated to the Lambda function in MB (also scales CPU proportionally) | number | 128 | no |
| log_retention_days | CloudWatch log retention in days (only used when enable_logging is true) | number | 14 | no |
| market_type | Whether to use spot or on-demand instances | string | "on-demand" | no |
| name | Name prefix for all resources created by this module | string | n/a | yes |
| nat_tag_key | Tag key used to identify NAT instances | string | "nat-zero:managed" | no |
| nat_tag_value | Tag value used to identify NAT instances | string | "true" | no |
| private_route_table_ids | Route table IDs for the private subnets (one per AZ) | list(string) | n/a | yes |
| private_subnets | Private subnet IDs (one per AZ) for NAT instance private ENIs | list(string) | n/a | yes |
| private_subnets_cidr_blocks | CIDR blocks for the private subnets (one per AZ, used in security group rules) | list(string) | n/a | yes |
| public_subnets | Public subnet IDs (one per AZ) for NAT instance public ENIs | list(string) | n/a | yes |
| tags | Additional tags to apply to all resources | map(string) | {} | no |
| use_fck_nat_ami | Use the public fck-nat AMI. Set to false to use a custom AMI. | bool | true | no |
| vpc_id | The VPC ID where NAT instances will be deployed | string | n/a | yes |
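As an example of the ignore_tag_key and ignore_tag_value inputs, the sketch below tags an instance with the default key and value. The resource, its variables, and the instance type are hypothetical, and the reading that an ignored instance does not count as a workload is an interpretation of the input descriptions rather than something stated here.

```hcl
# Hypothetical instance that should not keep a NAT instance alive.
variable "db_ami_id" { type = string } # must be an arm64 AMI for t4g
variable "db_subnet_id" { type = string }

resource "aws_instance" "always_on_db" {
  ami           = var.db_ami_id
  instance_type = "t4g.small"
  subnet_id     = var.db_subnet_id

  tags = {
    Name              = "always-on-db"
    "nat-zero:ignore" = "true" # default ignore_tag_key / ignore_tag_value
  }
}
```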
| Name | Description |
|---|---|
| eventbridge_rule_arn | ARN of the EventBridge rule capturing EC2 state changes |
| lambda_function_arn | ARN of the nat-zero Lambda function |
| lambda_function_name | Name of the nat-zero Lambda function |
| launch_template_ids | Launch template IDs for NAT instances (one per AZ) |
| nat_private_eni_ids | Private ENI IDs for NAT instances (one per AZ) |
| nat_public_eni_ids | Public ENI IDs for NAT instances (one per AZ) |
| nat_security_group_ids | Security group IDs for NAT instances (one per AZ) |
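The outputs can be consumed like any other module outputs. The sketch below assumes the module was instantiated as module.nat_zero, as in the quick start, and wires lambda_function_name into a CloudWatch alarm. The alarm itself, including its name and thresholds, is an illustrative addition and not part of nat-zero.

```hcl
# Re-export the Lambda name for use by other stacks.
output "nat_zero_lambda_name" {
  value = module.nat_zero.lambda_function_name
}

# Illustrative alarm: fire when the reconciler reports any error
# within a five-minute window.
resource "aws_cloudwatch_metric_alarm" "nat_zero_errors" {
  alarm_name          = "nat-zero-lambda-errors"
  namespace           = "AWS/Lambda"
  metric_name         = "Errors"
  statistic           = "Sum"
  period              = 300
  evaluation_periods  = 1
  threshold           = 0
  comparison_operator = "GreaterThanThreshold"

  dimensions = {
    FunctionName = module.nat_zero.lambda_function_name
  }
}
```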
Contributions welcome. Please open an issue or submit a pull request.
MIT