[Proposal] Add Policy Lifecycle Close Method for Resource Cleanup #734
renuka-fernando
started this conversation in
Ideas
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Summary
Add a
Close()method to the Policy interface to enable proper cleanup of resources (database connections, goroutines, HTTP clients) when policies are removed or replaced. This prevents resource leaks that occur when APIs are deleted or updated, as the current implementation simply replaces the policy map without any cleanup mechanism.Motivation
Problem Statement
When APIs are deleted or updated via xDS, the
ApplyWholeRoutesmethod replaces the entire routes map without calling any cleanup on existing policy instances:Policies like
SemanticCachePolicycreate external resources during initialization:VectorDBProviderClose()methods that are never calledOver time, this causes:
Who Benefits
Why Now
SemanticCachePolicyand rate limiting policies already create connections that leakDetailed Design
Overview
Add
Close() erroras a required method on thePolicyinterface. The Kernel will callClose()on all policies in a chain when that route is removed or replaced. Policies without resources simply returnnil. This is a breaking change that forces all policy developers to explicitly consider resource cleanup, preventing leaks by design.Changes Required
sdk/gateway/policy/v1alpha/interface.goClose() errorto Policy interfaceinternal/kernel/mapper.goApplyWholeRoutesto callClose()on removed policiesinternal/kernel/mapper.goUnregisterRouteto callClose()policies/*/Close()- returnnilif no resources to cleanuppolicies/semantic-cache/Close()to release vector store and embedding providerpolicies/advanced-ratelimit/Close()for Redis-backed limitersAPI Changes
Breaking Change: All existing policies must implement
Close() error. Policies without resources simply returnnil.Configuration Changes
Examples
Before:
After:
Kernel cleanup (async with timeout):
Drawbacks
Close() error { return nil }Alternatives Considered
Alternative 1: Separate Optional Closer Interface
Closerinterface; Kernel checks if policy implements it via type assertionAlternative 2: Finalizers / Runtime Finalizers
runtime.SetFinalizerto cleanup when policies are garbage collectedAlternative 3: Global Resource Pool with Reference Counting
Alternative 4: Context Cancellation
redisClient.Close()must still be called explicitly; context cancellation doesn't trigger itctx.Done()Close()to release the connection. SinceClose()is required anyway, it's simpler to handle goroutine shutdown there too:Compatibility
Close() errorMigration Steps
Close()method in Policy interfaceClose() errorClose()on policy replacementDesign Decisions
The following questions have been resolved:
policy.closeTimeoutKernel.Shutdown(ctx)waits for pending Close() callsMetrics to expose:
policy_close_totalstatus=[success|error|timeout]policy_close_duration_secondspolicy_close_queue_sizeWorker pool design:
policy.closeWorkers)Unresolved Questions
None - proposal is fully specified.
Definition of Done
SDK Changes:
Close() errormethod added toPolicyinterfaceKernel Changes:
ApplyWholeRoutescallsClose()asynchronously on removed policiesUnregisterRoutecallsClose()on removed policy chainpolicy.closeTimeout)policy.closeWorkers)Kernel.Shutdown(ctx)method for graceful shutdownpolicy_close_total,policy_close_duration_seconds,policy_close_queue_sizePolicy Updates:
Close() errorSemanticCachePolicycloses vector store and embedding providerTesting:
Documentation:
References
Beta Was this translation helpful? Give feedback.
All reactions