Investigate effectiveness of keep_warm vs standard AWS Lambda cold start solutions #1445
Description
Summary
Zappa's keep_warm feature uses a CloudWatch Events rule (default: rate(4 minutes)) to periodically invoke the Lambda function via handler.keep_warm_callback, keeping one execution environment warm. While this has been a core feature since early versions, AWS has since introduced native cold start mitigation options that may be more effective, cost-efficient, or complementary.
This issue proposes investigating the real-world effectiveness of keep_warm and comparing it against current AWS-native solutions.
Current Implementation
- keep_warm (default: true): Schedules a CloudWatch event to invoke the Lambda on a timer
- keep_warm_expression (default: rate(4 minutes)): Controls invocation frequency
- keep_warm_callback: Calls lambda_handler with an empty event to trigger web app initialization
- Only keeps one execution environment warm; concurrent requests still hit cold starts
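As a rough illustration of the two settings listed above, the following sketch renders a fragment of zappa_settings.json from Python. The stage name "production" is hypothetical, and the other keys a real stage needs are deliberately left out; only keep_warm and keep_warm_expression come from this issue.

```python
import json

# Hypothetical stage name. A real zappa_settings.json stage needs further
# keys (app function, S3 bucket, and so on) that are omitted here.
settings = {
    "production": {
        "keep_warm": True,                          # default, per this issue
        "keep_warm_expression": "rate(4 minutes)",  # default, per this issue
    }
}

# Render the fragment as it would appear inside zappa_settings.json.
rendered = json.dumps(settings, indent=4)
print(rendered)
```

Setting keep_warm to false in the same place would presumably be how the "without keep_warm" measurements are taken.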
AWS-Native Cold Start Solutions to Compare
1. Provisioned Concurrency (GA since Dec 2019)
- Pre-initializes a specified number of execution environments
- Eliminates cold starts for up to N concurrent requests
- Costs: ~$0.015/GB-hour (provisioned) + reduced per-request cost
- Can be combined with Application Auto Scaling
2. SnapStart (GA for Java since Nov 2022, Python since Nov 2024)
- Takes a snapshot of the initialized execution environment
- Restores from snapshot instead of full cold boot
- Reduces cold start from seconds to ~200-400ms for Python
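- No additional cost for Java; for Python, AWS pricing lists separate snapshot caching and restore charges, so SnapStart cost figures should be checked against current Lambda pricing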
3. Lambda Function URLs / Response Streaming
- Not directly a cold start solution, but architectural alternatives that may affect warm behavior
4. ARM64 / Graviton2
- Faster init time and lower cost — can indirectly reduce cold start impact
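For the Provisioned Concurrency option (item 1 above), the sequence of Lambda API calls might look like the sketch below. It is not Zappa code: the function name and parameters are hypothetical, and `lambda_client` is assumed to behave like `boto3.client("lambda")`. Provisioned concurrency cannot be attached to $LATEST, so the sketch publishes a version and targets an alias.

```python
def provision_environments(lambda_client, function_name, alias_name, environments):
    """Publish a version, point an alias at it, and reserve warm environments.

    Hypothetical helper. `lambda_client` is expected to behave like
    boto3.client("lambda"). Assumes the alias does not exist yet;
    create_alias fails if it does.
    """
    # Provisioned concurrency needs a published version or alias as its target.
    version = lambda_client.publish_version(FunctionName=function_name)["Version"]

    lambda_client.create_alias(
        FunctionName=function_name,
        Name=alias_name,
        FunctionVersion=version,
    )

    # Ask Lambda to keep `environments` execution environments initialized.
    return lambda_client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias_name,
        ProvisionedConcurrentExecutions=environments,
    )
```

With boto3 installed and credentials configured, a call such as `provision_environments(boto3.client("lambda"), "my-app-production", "live", 2)` would be one way to try it; both names are placeholders.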
Investigation Areas
- Measure actual cold start times with and without keep_warm for typical Zappa apps (Flask, Django, FastAPI). Result: cold init ~1,020-1,080ms across all frameworks (512MB, Python 3.12). No difference between keep_warm on/off for cold start duration.
- Quantify the gap keep_warm leaves: concurrent cold starts, environment recycling despite warm pings. Result: keep_warm only keeps 1 env warm. Cold start penalty is identical (~1,050ms init) when it occurs. keep_warm reduces frequency, not severity.
- Cost comparison: CloudWatch event invocations vs Provisioned Concurrency vs SnapStart. Result: keep_warm ~$0.05/mo, SnapStart $0.00, Provisioned Concurrency ~$5.53/mo per env. See detailed comment.
- SnapStart compatibility: Can Zappa leverage Python SnapStart? What changes would be needed? Result: yes, Zappa already has a snap_start setting, but it has two bugs: #1447 (snap_start not passed to create_lambda_function on initial deploy) and #1448 (SnapStart requires a version publish after the config update, but zappa update publishes before enabling). SnapStart reduces cold start by 45-72% (Flask 371ms, Django 281ms, FastAPI 590ms restore vs ~1,050ms init).
- Provisioned Concurrency integration: Should Zappa offer a provisioned_concurrency setting as an alternative or complement to keep_warm? Result: yes. Provisioned Concurrency eliminates cold starts entirely (0ms init, 2-5ms warm duration). Zappa should add a setting that creates aliases and configures provisioned concurrency.
- Hybrid approach: Would keep_warm + SnapStart or keep_warm + Provisioned Concurrency yield better results than any single approach? Result: keep_warm + SnapStart is the best cost/benefit hybrid (~$0.05/mo): fewer cold starts, and faster ones when they occur. Provisioned Concurrency is best for latency-sensitive production apps.
- Documentation: Update guidance on when to use which approach based on traffic patterns. Recommendations: hobby/low-traffic → keep_warm + SnapStart; production → Provisioned Concurrency with auto-scaling; cost-sensitive production → keep_warm + SnapStart.
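Some of the figures in the list above can be sanity-checked with simple arithmetic. The sketch below is not the method used for the measurements. It assumes a 30-day month for ping counts, 730 hours per month for pricing, and a flat 1,050ms init baseline; the remaining inputs are the numbers quoted in this issue. Under those assumptions, Provisioned Concurrency at 512MB comes out near the quoted ~$5.53/mo (the exact value depends on the hours-per-month assumption), and the SnapStart reduction spans roughly 44-73%, close to the quoted 45-72%.

```python
# Inputs quoted in this issue.
PING_INTERVAL_MINUTES = 4              # keep_warm_expression: rate(4 minutes)
MEMORY_GB = 0.5                        # 512MB
PROVISIONED_RATE_PER_GB_HOUR = 0.015   # ~$0.015/GB-hour
COLD_INIT_MS = 1050                    # ~1,050ms init
SNAPSTART_RESTORE_MS = {"Flask": 371, "Django": 281, "FastAPI": 590}

# Assumptions that are NOT from the issue.
DAYS_PER_MONTH = 30
HOURS_PER_MONTH = 730


def keep_warm_pings_per_month(interval_minutes, days=DAYS_PER_MONTH):
    """Number of scheduled keep_warm invocations in a month."""
    return days * 24 * 60 // interval_minutes


def provisioned_cost_per_month(memory_gb, rate_per_gb_hour, hours=HOURS_PER_MONTH):
    """Monthly provisioned-concurrency charge for one environment."""
    return memory_gb * rate_per_gb_hour * hours


def reduction_percent(baseline_ms, restore_ms):
    """Percentage reduction of restore time relative to a cold init."""
    return 100 * (baseline_ms - restore_ms) / baseline_ms


print("keep_warm pings per month:", keep_warm_pings_per_month(PING_INTERVAL_MINUTES))
print("Provisioned Concurrency $/month per env:",
      round(provisioned_cost_per_month(MEMORY_GB, PROVISIONED_RATE_PER_GB_HOUR), 2))
for framework, restore_ms in SNAPSTART_RESTORE_MS.items():
    print(framework, "SnapStart reduction %:",
          round(reduction_percent(COLD_INIT_MS, restore_ms), 1))
```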
Expected Outcome
A clear recommendation (with data) on whether:
- keep_warm remains the best default for Zappa users
- Zappa should add first-class support for Provisioned Concurrency and/or SnapStart
- keep_warm should be deprecated in favor of native AWS solutions
- A combination approach provides the best experience
Result: See investigation comment. All four questions answered: (1) yes, keep_warm remains a good default; (2) yes, fix SnapStart bugs and add Provisioned Concurrency support; (3) no, don't deprecate keep_warm — it's complementary; (4) keep_warm + SnapStart is the recommended default combination.
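The ordering that #1448 calls for (enable SnapStart first, publish the version afterwards) might be sketched as follows. This is an illustration rather than the fix applied in Zappa: the helper name is hypothetical, and `lambda_client` is assumed to behave like `boto3.client("lambda")`.

```python
def enable_snapstart_then_publish(lambda_client, function_name):
    """Turn on SnapStart, wait for the update to settle, then publish.

    Hypothetical helper. `lambda_client` is expected to behave like
    boto3.client("lambda").
    """
    # 1. Enable SnapStart; it applies to versions published from now on.
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        SnapStart={"ApplyOn": "PublishedVersions"},
    )

    # 2. Wait until the configuration update has finished.
    waiter = lambda_client.get_waiter("function_updated_v2")
    waiter.wait(FunctionName=function_name)

    # 3. Only now publish, so the new version picks up the SnapStart setting.
    response = lambda_client.publish_version(FunctionName=function_name)
    return response["Version"]
```

Publishing before step 1 would yield a version without a snapshot, which is the behavior #1448 describes.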
Related
- keep_warm setting in zappa_settings.json
- handler.keep_warm_callback in zappa/handler.py
- CloudWatch Events scheduling in zappa/cli.py
- #1447: snap_start setting not passed to create_lambda_function on initial deploy
- #1448: SnapStart requires version publish after config update, but zappa update publishes before enabling