
Add proactive memory monitoring to prevent Lambda OOM deaths#2272

Merged
hiroshinishio merged 2 commits into main from wes on Feb 18, 2026
Conversation

@hiroshinishio
Collaborator

Summary

  • When Lambda OOMs during code generation, the process is killed instantly - no cleanup, no CI trigger, no user notification. The PR just sits there with no feedback.
  • Added is_lambda_oom_approaching() that checks peak RSS memory each agent loop iteration via resource.getrusage. When usage crosses 1792 MB (87.5% of 2048 MB limit), the agent bails out gracefully - same pattern as the existing timeout detection.
  • Integrated into should_bail() with priority: timeout > OOM > PR closed > branch deleted.
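The two checks above can be sketched roughly as follows. This is a minimal illustration based on the PR description, not the actual implementation: the constant names, the `should_bail()` signature, and the return values are assumptions. The only confirmed details are the use of `resource.getrusage` for peak RSS, the 1792 MB threshold (87.5% of 2048 MB), and the bail priority order.

```python
import resource
import time

# Hypothetical constants mirroring the PR description: 2048 MB Lambda
# memory limit, bail once peak RSS crosses 87.5% of it (1792 MB).
LAMBDA_MEMORY_LIMIT_MB = 2048
OOM_THRESHOLD_MB = 1792


def is_lambda_oom_approaching() -> bool:
    """Return True when peak RSS has crossed the OOM threshold."""
    # On Linux (the Lambda runtime), ru_maxrss is reported in kilobytes.
    peak_rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    peak_rss_mb = peak_rss_kb / 1024
    return peak_rss_mb >= OOM_THRESHOLD_MB


def should_bail(start_time: float, timeout_s: float,
                is_pr_closed: bool, is_branch_deleted: bool):
    """Check bail conditions in the PR's stated priority order:
    timeout > OOM > PR closed > branch deleted. Returns the reason
    string, or None if the agent loop should continue."""
    if time.time() - start_time > timeout_s:
        return "timeout"
    if is_lambda_oom_approaching():
        return "oom"
    if is_pr_closed:
        return "pr_closed"
    if is_branch_deleted:
        return "branch_deleted"
    return None
```

Note that `ru_maxrss` tracks the *peak* RSS over the process lifetime, which matches the gradual-growth caveat in the posts below: a check that runs once per loop iteration can only observe growth between iterations, not a spike that OOMs mid-iteration.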

Social Media Post (GitAuto)

When Lambda runs out of memory, AWS kills the process instantly. No cleanup runs, no CI gets triggered, no comment gets posted. The PR just goes silent. We added memory monitoring to the agent loop - same pattern as our existing timeout check. If RSS usage crosses 87% of the limit, the agent stops gracefully, pushes what it has, and lets CI run.

Social Media Post (Wes)

Lambda OOM is the worst failure mode. There's no signal handler, no graceful shutdown, no "almost out of memory" warning. AWS just kills your process. For us that meant the PR went silent - no CI, no comment, nothing. Borrowed our own timeout detection pattern: check resource.getrusage each loop iteration, bail at 87% of the limit. Simple but it only catches gradual growth, not sudden spikes. We'll see if that's enough.

@hiroshinishio hiroshinishio self-assigned this Feb 18, 2026
@hiroshinishio hiroshinishio merged commit 56b1257 into main Feb 18, 2026
1 check passed
@hiroshinishio hiroshinishio deleted the wes branch February 18, 2026 22:59