
Investigate effectiveness of keep_warm vs standard AWS Lambda cold start solutions #1445

@monkut

Summary

Zappa's keep_warm feature uses a CloudWatch Events rule (default: rate(4 minutes)) to periodically invoke the Lambda function via handler.keep_warm_callback, keeping one execution environment warm. While this has been a core feature since early versions, AWS has since introduced native cold start mitigation options that may be more effective, cost-efficient, or complementary.

This issue proposes investigating the real-world effectiveness of keep_warm and comparing it against current AWS-native solutions.

Current Implementation

  • keep_warm (default: true): Schedules a CloudWatch event to invoke the Lambda on a timer
  • keep_warm_expression (default: rate(4 minutes)): Controls invocation frequency
  • keep_warm_callback: Calls lambda_handler with an empty event to trigger web app initialization
  • Only keeps one execution environment warm — concurrent requests still hit cold starts
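
As a rough illustration of what the default schedule amounts to, the sketch below writes the two settings named above as a settings dictionary and counts how many warm pings a `rate(...)` expression produces in a month. The `dev` stage name and the helper `pings_per_month` are assumptions made for this example, not part of Zappa, and only simple `rate(N minutes|hours|days)` expressions are handled.

```python
import json
import re

# Settings as named in this issue; the "dev" stage name is hypothetical.
zappa_settings = {
    "dev": {
        "keep_warm": True,
        "keep_warm_expression": "rate(4 minutes)",
    }
}

_UNIT_MINUTES = {"minute": 1, "hour": 60, "day": 1440}


def pings_per_month(expression: str, days: int = 30) -> int:
    """Count scheduled invocations over `days` days for a rate(...) expression."""
    match = re.fullmatch(r"rate\((\d+) (minute|hour|day)s?\)", expression.strip())
    if match is None:
        raise ValueError(f"unsupported schedule expression: {expression!r}")
    value = int(match.group(1))
    if value <= 0:
        raise ValueError("rate value must be positive")
    interval_minutes = value * _UNIT_MINUTES[match.group(2)]
    return (days * 1440) // interval_minutes


if __name__ == "__main__":
    print(json.dumps(zappa_settings, indent=4))
    print(pings_per_month(zappa_settings["dev"]["keep_warm_expression"]))
```

Every one of those pings goes to the function as a whole, so they refresh a single environment rather than a pool.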

AWS-Native Cold Start Solutions to Compare

1. Provisioned Concurrency (GA since Dec 2019)

  • Pre-initializes a specified number of execution environments
  • Eliminates cold starts for up to N concurrent requests
  • Costs: ~$0.015/GB-hour (provisioned) + reduced per-request cost
  • Can be combined with Application Auto Scaling
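
A minimal sketch of what configuring Provisioned Concurrency involves at the API level, assuming a boto3 Lambda client is passed in. Provisioned Concurrency is attached to a published version or alias rather than `$LATEST`, so the sketch publishes a version and points an alias at it first. The function name `enable_provisioned_concurrency` and the alias name are hypothetical; this is not how Zappa does it today.

```python
def enable_provisioned_concurrency(client, function_name, alias_name, executions):
    """Publish a version, alias it, and provision `executions` environments.

    `client` is expected to behave like boto3.client("lambda").
    """
    # Provisioned Concurrency cannot target $LATEST, so publish a version first.
    version = client.publish_version(FunctionName=function_name)["Version"]

    # create_alias fails if the alias already exists; a real tool would
    # fall back to update_alias in that case.
    client.create_alias(
        FunctionName=function_name,
        Name=alias_name,
        FunctionVersion=version,
    )

    # Keep `executions` environments initialized for requests sent to the alias.
    client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias_name,
        ProvisionedConcurrentExecutions=executions,
    )
    return version
```

Only traffic routed to the alias benefits, so whatever fronts the function (for example API Gateway) would also need to invoke the alias rather than the unqualified function.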

2. SnapStart (GA for Java since Nov 2022, Python since Nov 2024)

  • Takes a snapshot of the initialized execution environment
  • Restores from snapshot instead of full cold boot
  • Reduces cold start from seconds to ~200-400ms for Python
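  • No additional cost for Java runtimes; for Python, AWS pricing lists separate snapshot caching and restore charges, so check current pricing before assuming it is free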
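
Because SnapStart applies to published versions, the order of operations matters: the setting has to be in place before the version is published. The sketch below shows that order, assuming a boto3 Lambda client is passed in. The helper name `enable_snapstart_and_publish` is hypothetical and this is not Zappa's code.

```python
def enable_snapstart_and_publish(client, function_name):
    """Turn SnapStart on, wait for the update to finish, then publish a version.

    `client` is expected to behave like boto3.client("lambda").
    """
    # 1. Enable SnapStart on the function configuration.
    client.update_function_configuration(
        FunctionName=function_name,
        SnapStart={"ApplyOn": "PublishedVersions"},
    )

    # 2. Wait until the configuration update has been applied.
    client.get_waiter("function_updated").wait(FunctionName=function_name)

    # 3. Publish only now, so the new version is snapshotted.
    return client.publish_version(FunctionName=function_name)["Version"]
```

Publishing before step 1 yields a version without a snapshot, which still goes through a full init on cold start.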

3. Lambda Function URLs / Response Streaming

  • Not cold start mitigations in themselves, but alternative invocation paths whose effect on warm behavior is worth checking

4. ARM64 / Graviton2

  • Lower per-GB-second price and potentially faster init time; may indirectly reduce cold start impact

Investigation Areas

  • Measure actual cold start times with and without keep_warm for typical Zappa apps (Flask, Django, FastAPI). Finding: cold init is ~1,020-1,080ms across all three frameworks (512MB, Python 3.12). keep_warm on or off makes no difference to the duration of a cold start.
  • Quantify the gap keep_warm leaves: concurrent cold starts, environment recycling despite warm pings. Finding: keep_warm keeps only 1 environment warm. When a cold start does occur, the penalty is identical (~1,050ms init). keep_warm reduces the frequency of cold starts, not their severity.
  • Cost comparison: CloudWatch event invocations vs Provisioned Concurrency vs SnapStart. Finding: keep_warm ~$0.05/mo, SnapStart $0.00, Provisioned Concurrency ~$5.53/mo per environment. See detailed comment.
  • SnapStart compatibility: Can Zappa leverage Python SnapStart? What changes would be needed? Finding: yes. Zappa already has a snap_start setting, but it has two bugs: #1447 ("snap_start setting not passed to create_lambda_function on initial deploy") and #1448 ("SnapStart requires version publish after config update, but zappa update publishes before enabling"). SnapStart reduces cold start by 45-72% (restore of 371ms for Flask, 281ms for Django, 590ms for FastAPI, vs ~1,050ms init).
  • Provisioned Concurrency integration: Should Zappa offer a provisioned_concurrency setting as an alternative or complement to keep_warm? Finding: yes. Provisioned Concurrency eliminates cold starts entirely (0ms init, 2-5ms warm duration). Zappa should add a setting that creates aliases and configures provisioned concurrency.
  • Hybrid approach: Would keep_warm + SnapStart or keep_warm + Provisioned Concurrency yield better results than any single approach? Finding: keep_warm + SnapStart is the best cost/benefit hybrid (~$0.05/mo), giving fewer cold starts and faster ones when they occur. Provisioned Concurrency is best for latency-sensitive production apps.
  • Documentation: Update guidance on when to use which approach based on traffic patterns. Recommendations: hobby/low-traffic → keep_warm + SnapStart; production → Provisioned Concurrency with auto-scaling; cost-sensitive production → keep_warm + SnapStart.
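
The cost figures above can be sanity-checked with simple arithmetic. The sketch below is a back-of-the-envelope model, not the calculation used in the investigation: the prices are assumed us-east-1 x86 list prices, the billed duration of a warm ping is a guess, and the free tier and any non-Lambda charges are ignored.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month

# Assumed list prices (us-east-1, x86); verify against current AWS pricing.
PRICE_PER_REQUEST = 0.20 / 1_000_000
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PROVISIONED_PER_GB_SECOND = 0.0000041667


def keep_warm_monthly_cost(memory_gb, ping_interval_s, billed_seconds_per_ping):
    """Request plus compute cost of the scheduled warm pings for one month."""
    pings = SECONDS_PER_MONTH // ping_interval_s
    compute = pings * billed_seconds_per_ping * memory_gb * PRICE_PER_GB_SECOND
    return pings * PRICE_PER_REQUEST + compute


def provisioned_monthly_cost(memory_gb, environments=1):
    """Cost of keeping `environments` provisioned for one month, before requests."""
    return environments * memory_gb * SECONDS_PER_MONTH * PRICE_PROVISIONED_PER_GB_SECOND


if __name__ == "__main__":
    # 512MB function, ping every 4 minutes, assumed 100ms billed per ping.
    print(f"keep_warm:   ${keep_warm_monthly_cost(0.5, 240, 0.1):.3f}/mo")
    print(f"provisioned: ${provisioned_monthly_cost(0.5):.2f}/mo per environment")
```

Under these assumptions one provisioned 512MB environment comes to about $5.40 for a 30-day month, the same order as the ~$5.53 reported above. The keep_warm figure depends heavily on the assumed ping duration.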

Expected Outcome

A clear recommendation (with data) on whether:

  1. keep_warm remains the best default for Zappa users
  2. Zappa should add first-class support for Provisioned Concurrency and/or SnapStart
  3. keep_warm should be deprecated in favor of native AWS solutions
  4. A combination approach provides the best experience

Result: See investigation comment. All four questions answered: (1) yes, keep_warm remains a good default; (2) yes, fix SnapStart bugs and add Provisioned Concurrency support; (3) no, don't deprecate keep_warm — it's complementary; (4) keep_warm + SnapStart is the recommended default combination.
