Investigate effectiveness of keep_warm vs standard AWS Lambda cold start solutions

## Summary

Zappa's `keep_warm` feature uses a CloudWatch Events rule (default: `rate(4 minutes)`) to periodically invoke the Lambda function via `handler.keep_warm_callback`, keeping one execution environment warm. While this has been a core feature since early versions, AWS has since introduced native cold start mitigation options that may be more effective, cost-efficient, or complementary.

This issue proposes investigating the real-world effectiveness of `keep_warm` and comparing it against current AWS-native solutions.

## Current Implementation

- **`keep_warm`** (default: `true`): Schedules a CloudWatch event to invoke the Lambda on a timer
- **`keep_warm_expression`** (default: `rate(4 minutes)`): Controls invocation frequency
- **`keep_warm_callback`**: Calls `lambda_handler` with an empty event to trigger web app initialization
- Only keeps **one** execution environment warm — concurrent requests still hit cold starts
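
As a rough illustration of the settings above, the following sketch builds a minimal settings dictionary and estimates how many keep-warm invocations the default schedule generates in a 30-day month. The stage name `production`, the `app_function` value, and the helper `pings_per_month` are placeholders chosen for this example, and only simple `rate(...)` expressions are handled.

```python
import json
import re

# keep_warm and keep_warm_expression use the defaults described above.
# The stage name and app_function path are hypothetical placeholders.
settings = {
    "production": {
        "app_function": "my_app.app",
        "keep_warm": True,
        "keep_warm_expression": "rate(4 minutes)",
    }
}

_UNIT_MINUTES = {"minute": 1, "hour": 60, "day": 1440}


def pings_per_month(expression: str, days: int = 30) -> int:
    """Count scheduled invocations in `days` days for a rate(...) expression."""
    match = re.fullmatch(r"rate\((\d+) (minute|hour|day)s?\)", expression.strip())
    if match is None:
        raise ValueError(f"unsupported schedule expression: {expression!r}")
    value = int(match.group(1))
    if value <= 0:
        raise ValueError("rate value must be positive")
    interval_minutes = value * _UNIT_MINUTES[match.group(2)]
    return (days * 1440) // interval_minutes


print(json.dumps(settings, indent=4))
print(pings_per_month(settings["production"]["keep_warm_expression"]))  # 10800
```

Every one of those pings reaches a single execution environment, which is why the schedule frequency does not help with concurrent requests.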

## AWS-Native Cold Start Solutions to Compare

### 1. Provisioned Concurrency (GA since Dec 2019)
- Pre-initializes a specified number of execution environments
- Eliminates cold starts for up to N concurrent requests
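- Cost: ~$0.015 per GB-hour for the provisioned capacity, billed whether or not it serves traffic, partly offset by a reduced duration rate for invocations that run on it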
- Can be combined with Application Auto Scaling
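
A minimal sketch of what configuring this could look like, plus the cost arithmetic implied by the ~$0.015/GB-hour figure. It assumes a boto3 Lambda client is passed in, a 730-hour month, and an alias that already exists; the helper names are hypothetical and this is not how Zappa currently does it.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def provisioned_monthly_cost(memory_mb, environments=1, rate_per_gb_hour=0.015):
    """Approximate monthly charge for the provisioned capacity alone."""
    return (memory_mb / 1024) * environments * rate_per_gb_hour * HOURS_PER_MONTH


def provisioned_concurrency_kwargs(function_name, alias, environments):
    """Build arguments for boto3's put_provisioned_concurrency_config."""
    if environments < 1:
        raise ValueError("environments must be at least 1")
    return {
        "FunctionName": function_name,
        "Qualifier": alias,  # must be an alias or published version, not $LATEST
        "ProvisionedConcurrentExecutions": environments,
    }


def apply_provisioned_concurrency(lambda_client, function_name, alias, environments):
    """Send the request; lambda_client is expected to be boto3.client("lambda")."""
    kwargs = provisioned_concurrency_kwargs(function_name, alias, environments)
    return lambda_client.put_provisioned_concurrency_config(**kwargs)


# One 512 MB environment for a month, under the assumptions above.
print(f"${provisioned_monthly_cost(512):.3f} per month")
```

With these assumptions a single 512 MB environment comes to roughly $5.5 per month; the exact figure depends on the region and on how many hours are counted.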

### 2. SnapStart (GA for Java since Nov 2022, Python since Nov 2024)
- Takes a snapshot of the initialized execution environment
- Restores from snapshot instead of full cold boot
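- Expected to reduce Python cold starts from seconds to roughly 200-400 ms; the restore times measured in this investigation were 281-590 ms depending on framework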
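- No additional charge on Java runtimes; for Python, AWS pricing lists separate snapshot caching and restore charges, so the $0.00 figure in the cost comparison below should be read as an approximation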
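
Because a snapshot is taken when a version is published, the order of operations matters: SnapStart has to be enabled on the function configuration before the version is published. The sketch below shows that ordering with boto3-style calls. It assumes `lambda_client` behaves like `boto3.client("lambda")`; the function name `enable_snapstart_then_publish` is made up for this example and is not Zappa's implementation.

```python
def enable_snapstart_then_publish(lambda_client, function_name):
    """Enable SnapStart, wait for the update to settle, then publish a version."""
    # 1. Turn SnapStart on for versions published from now on.
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        SnapStart={"ApplyOn": "PublishedVersions"},
    )
    # 2. Block until the configuration update has finished applying.
    lambda_client.get_waiter("function_updated").wait(FunctionName=function_name)
    # 3. Only now publish, so the new version is snapshotted with SnapStart on.
    response = lambda_client.publish_version(FunctionName=function_name)
    return response["Version"]
```

Publishing before step 1 completes would produce a version without a snapshot, which is the kind of ordering problem described in #1448.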

### 3. Lambda Function URLs / Response Streaming
- Not cold start solutions in themselves, but architectural alternatives that may change how often requests reach a warm environment

### 4. ARM64 / Graviton2
- Lower price per GB-second and potentially faster init, which can indirectly reduce cold start impact; the init-time benefit still needs to be measured for Zappa apps

## Investigation Areas

- [x] **Measure actual cold start times** with and without `keep_warm` for typical Zappa apps (Flask, Django, FastAPI) — Cold init ~1,020-1,080ms across all frameworks (512MB, Python 3.12). No difference between keep_warm on/off for cold start duration.
- [x] **Quantify the gap** `keep_warm` leaves: concurrent cold starts, environment recycling despite warm pings. Result: `keep_warm` keeps only one environment warm. When a cold start does occur, the penalty is unchanged (~1,050ms init), so `keep_warm` reduces the frequency of cold starts, not their severity.
- [x] **Cost comparison**: CloudWatch event invocations vs Provisioned Concurrency vs SnapStart — keep_warm ~$0.05/mo, SnapStart $0.00, Provisioned Concurrency ~$5.53/mo per env. See [detailed comment](https://github.com/zappa/Zappa/issues/1445#issuecomment-4174215736).
- [x] **SnapStart compatibility**: Can Zappa leverage Python SnapStart? What changes would be needed? — Yes, Zappa already has `snap_start` setting but it has two bugs: #1447 (not passed on initial deploy) and #1448 (version published before config applied). SnapStart reduces cold start by 45-72% (Flask 371ms, Django 281ms, FastAPI 590ms restore vs ~1,050ms init).
- [x] **Provisioned Concurrency integration**: Should Zappa offer a `provisioned_concurrency` setting as an alternative or complement to `keep_warm`? Result: yes. Provisioned Concurrency eliminates cold starts for requests served by the provisioned environments (0ms init, 2-5ms warm duration); traffic beyond the provisioned count can still hit a cold start. Zappa should add a setting that creates aliases and configures provisioned concurrency.
- [x] **Hybrid approach**: Would `keep_warm` + SnapStart or `keep_warm` + Provisioned Concurrency yield better results than any single approach? — `keep_warm` + SnapStart is the best cost/benefit hybrid (~$0.05/mo): fewer cold starts + faster when they occur. Provisioned Concurrency is best for latency-sensitive production apps.
- [x] **Documentation**: Update guidance on when to use which approach based on traffic patterns — Recommendations: hobby/low-traffic → keep_warm + SnapStart; production → Provisioned Concurrency with auto-scaling; cost-sensitive production → keep_warm + SnapStart.
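
The SnapStart percentages above can be sanity-checked with a few lines of arithmetic. This sketch assumes a single 1,050ms baseline (the midpoint of the reported 1,020-1,080ms init range) for all three frameworks, so its results differ by about a point from the 45-72% range quoted above, which presumably used per-framework baselines.

```python
# Restore times reported in the SnapStart item above, in milliseconds.
RESTORE_MS = {"Flask": 371, "Django": 281, "FastAPI": 590}

# Assumption: one shared baseline, the midpoint of the ~1,020-1,080 ms range.
BASELINE_INIT_MS = 1050


def reduction_percent(baseline_ms, restored_ms):
    """Percentage by which the restore time undercuts the cold init time."""
    return 100 * (baseline_ms - restored_ms) / baseline_ms


for framework, restore_ms in RESTORE_MS.items():
    saved = reduction_percent(BASELINE_INIT_MS, restore_ms)
    print(f"{framework}: {saved:.1f}% shorter cold start")
# Flask: 64.7% shorter cold start
# Django: 73.2% shorter cold start
# FastAPI: 43.8% shorter cold start
```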

## Expected Outcome

A clear recommendation (with data) on whether:
1. `keep_warm` remains the best default for Zappa users
2. Zappa should add first-class support for Provisioned Concurrency and/or SnapStart
3. `keep_warm` should be deprecated in favor of native AWS solutions
4. A combination approach provides the best experience

**Result**: See [investigation comment](https://github.com/zappa/Zappa/issues/1445#issuecomment-4174215736). All four questions answered: (1) yes, keep_warm remains a good default; (2) yes, fix SnapStart bugs and add Provisioned Concurrency support; (3) no, don't deprecate keep_warm — it's complementary; (4) keep_warm + SnapStart is the recommended default combination.

## Related

- `keep_warm` setting in `zappa_settings.json`
- `handler.keep_warm_callback` in `zappa/handler.py`
- CloudWatch Events scheduling in `zappa/cli.py`
- #1447 — `snap_start` not passed on initial deploy
- #1448 — SnapStart version ordering bug during update
