fix(serverless-hono): defer waitUntil cleanup to prevent tool crashes…#1191
Conversation
🦋 Changeset detectedLatest commit: 1c9a310 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughDefers clearing of the global Changes
Sequence Diagram(s)sequenceDiagram
participant Handler as Request Handler
participant Global as globalThis.___voltagent_wait_until
participant Platform as Platform.waitUntil
participant Tools as Background / Streaming Tools
Handler->>Global: install wrapper that forwards to Platform.waitUntil
Handler->>Tools: return Response (streaming/background tasks continue)
Handler->>Handler: finally -> deferCleanup(context, cleanup)
note right of Handler: deferCleanup checks context.waitUntil
alt context.waitUntil is callable
Handler->>Platform: schedule cleanup via waitUntil(promises)
Tools->>Global: register promises via wrapper
Platform->>Platform: wait for all promises (including late registrations)
Platform->>Handler: invoke cleanup after settle
else no waitUntil
Handler->>Handler: call cleanup() synchronously
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/serverless-hono/src/utils/defer-cleanup.spec.ts (1)
76-79: Avoidas anyin the test case.Line 78 weakens type safety in violation of the coding guideline. Replace with
as unknown as WaitUntilContextto safely test the scenario wherewaitUntilis not a function while preserving type information.♻️ Suggested change
- const context = { waitUntil: "not a function" } as any; + const context = { waitUntil: "not a function" } as unknown as WaitUntilContext;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/serverless-hono/src/utils/defer-cleanup.spec.ts` around lines 76 - 79, Replace the unsafe cast in the test "should handle context with non-function waitUntil" by changing the context declaration to use a double-cast through unknown into the proper WaitUntilContext type (i.e., replace "as any" with "as unknown as WaitUntilContext") so the test still supplies waitUntil: "not a function" while preserving TypeScript type-safety; update the variable named context in defer-cleanup.spec.ts accordingly and keep the rest of the test (cleanup = vi.fn()) intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/serverless-hono/src/serverless-provider.ts`:
- Line 76: The code uses unsafe "as any" casts when calling deferCleanup and
withWaitUntil; instead narrow the unknown context values to WaitUntilContext |
null | undefined before passing them. Locate the Cloudflare branch where
executionCtx is used with deferCleanup/withWaitUntil, the Vercel branch where
context is passed, and the Deno branch where info is passed, and replace the
casts by checking/typing those values (e.g., const waitCtx = executionCtx as
WaitUntilContext | null | undefined) or by a type guard, then call
withWaitUntil(waitCtx, ...) and deferCleanup(waitCtx, cleanup) so type safety is
preserved for the functions deferCleanup and withWaitUntil.
- Around line 19-23: The current use of Promise.resolve().then(cleanup) after
context.waitUntil(...) does not guarantee cleanup runs after all
subsequently-registered waitUntil promises; replace the microtask trick with an
explicit synchronization barrier: introduce a deferred "lifetime" Promise and
reference-counting or open/close semantics (e.g., increment a counter when
background work starts and decrement when it finishes) that you resolve when no
more background tasks remain, then call context.waitUntil(lifetimePromise) and
only run cleanup after that lifetimePromise settles; update any code paths that
previously pushed promises via context.waitUntil to instead register against
this shared lifetime barrier so cleanup (the cleanup function and the
___voltagent_wait_until global management) always runs last.
---
Nitpick comments:
In `@packages/serverless-hono/src/utils/defer-cleanup.spec.ts`:
- Around line 76-79: Replace the unsafe cast in the test "should handle context
with non-function waitUntil" by changing the context declaration to use a
double-cast through unknown into the proper WaitUntilContext type (i.e., replace
"as any" with "as unknown as WaitUntilContext") so the test still supplies
waitUntil: "not a function" while preserving TypeScript type-safety; update the
variable named context in defer-cleanup.spec.ts accordingly and keep the rest of
the test (cleanup = vi.fn()) intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6cd775cb-2228-442e-80cc-29f213f94881
📒 Files selected for processing (3)
.changeset/fix-serverless-waituntil-cleanup.mdpackages/serverless-hono/src/serverless-provider.tspackages/serverless-hono/src/utils/defer-cleanup.spec.ts
There was a problem hiding this comment.
1 issue found across 3 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/serverless-hono/src/serverless-provider.ts">
<violation number="1" location="packages/serverless-hono/src/serverless-provider.ts:21">
P2: Cleanup deferral uses an immediate microtask, which can run before all async tool/stream waitUntil usage completes, so global waitUntil may still be cleared too early.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
… in Cloudflare Workers The `finally` block in toCloudflareWorker/toVercelEdge/toDeno calls cleanup() as soon as the Response object is returned — before streaming and tool execution complete. This clears the global ___voltagent_wait_until while tools are still using it, causing crashes. Fix: schedule cleanup through the platform's own waitUntil() so it runs only after all pending promises (streaming, tools, observability exports) have settled. Falls back to synchronous cleanup when waitUntil is unavailable (non-serverless environments). Fixes VoltAgent#1186
33946ed to
1c9a310
Compare
|
Hey @ravyg , |
Summary
The
finallyblock intoCloudflareWorker(),toVercelEdge(), andtoDeno()callscleanup()as soon as the Response object is returned — before streamingand tool execution complete. This clears the global
___voltagent_wait_untilwhile tools are still using it, causing crashes.Fix: schedule cleanup through the platform's own
waitUntil()so it runs only after all pending promises (streaming, tools, observability exports) have settled.Falls back to synchronous cleanup when
waitUntilis unavailable (non-serverless environments).Fixes #1186
PR Checklist
What is the current behavior?
When an agent runs inside Cloudflare Workers (or Vercel Edge / Deno) and calls a tool that takes more than a few seconds, the agent crashes after the
tool-input-availableevent. This happens because:withWaitUntil(executionCtx)stores the platform'swaitUntilfunction in a global (___voltagent_wait_until)app.fetch()finallyblock fires immediately and callscleanup(), which clears the globalWhat is the new behavior?
Cleanup is deferred through the platform's own
waitUntil()mechanism. The newdeferCleanup()helper:waitUntil— schedules cleanup as a microtask through it (runs after streaming/tools/exports finish)waitUntilthrows (response already committed) or isn't available — falls back to synchronous cleanup (same behavior as before)One file changed, one helper function added, applied to all three platform methods (
toCloudflareWorker,toVercelEdge,toDeno). No API changes, fullybackward compatible.
Notes for reviewers
deferCleanup()covering: deferred cleanup, null/undefined/no-waitUntil fallbacks, waitUntil-throws fallback, and non-function waitUntilwait-until-wrapper.spec.tsreproduce on upstreammain— not related to this change@voltagent/resumable-streamstest failure is also pre-existing onmain(vitest config issue)Summary by cubic
Defers cleanup of the serverless
waitUntilcontext in@voltagent/serverless-honoto keep___voltagent_wait_untilalive until all background tasks settle, preventing crashes on Cloudflare Workers, Vercel Edge, and Deno. AddsdeferCleanup()and applies it totoCloudflareWorker,toVercelEdge, andtoDeno. Fixes #1186.___voltagent_wait_untilwith a tracking proxy and schedules cleanup via platformwaitUntil(); handles late-registered promises.waitUntilis missing or throws; no API changes.Written for commit 1c9a310. Summary will update on new commits.
Summary by CodeRabbit
Bug Fixes
Documentation
Tests