feat: implement rate limiting retry mechanism for stat forwarder #16330

forrestIsRunning · 2026-01-10T02:27:57Z

Replace fixed delay retry with Kubernetes workqueue rate limiter

Add TypedItemFastSlowRateLimiter for adaptive retry delays
Implement fast/slow retry strategy (100ms for first 5, 500ms after)
Add automatic state tracking and cleanup
Update maybeRetry() to use rate limiter instead of fixed sleep
Clean up rate limiter state on successful processing

Resolves TODO: Use RateLimitingInterface and NewItemFastSlowRateLimiter

Fixes #

Proposed Changes

Replace fixed 500ms delay with workqueue.TypedItemFastSlowRateLimiter for adaptive retries
Add rateLimiter field to Forwarder to track retry state per stat
Update maybeRetry() to use rateLimiter.When() and rateLimiter.NumRequeues()
Clean up rate limiter state on success and max retries exceeded
Add configurable constants: fastRetryDelay (100ms), slowRetryDelay (500ms), maxFastRetryAttempts (5)

Why these changes?
Fixed delay doesn't adapt to failure scenarios. Fast retries help transient errors recover quickly, while slow retries reduce load for persistent issues. Uses Kubernetes standard workqueue rate limiting.

Release Note
elease-note
feat: implement rate limiting retry mechanism for stat forwarder

Replace fixed delay retry with Kubernetes workqueue rate limiter. Uses fast retries (100ms) for first 5 attempts and slower retries (500ms) after, improving error recovery while reducing system load.

Replace fixed delay retry with Kubernetes workqueue rate limiter - Add TypedItemFastSlowRateLimiter for adaptive retry delays - Implement fast/slow retry strategy (100ms for first 5, 500ms after) - Add automatic state tracking and cleanup - Update maybeRetry() to use rate limiter instead of fixed sleep - Clean up rate limiter state on successful processing Resolves TODO: Use RateLimitingInterface and NewItemFastSlowRateLimiter

linux-foundation-easycla · 2026-01-10T02:28:02Z

❌ The email address for the commit (8e630e1) is not linked to the GitHub account, preventing the EasyCLA check. Consult this Help Article and GitHub Help to resolve. (To view the commit's email address, add .patch at the end of this PR page's URL.) For further assistance with EasyCLA, please submit a support request ticket.

knative-prow · 2026-01-10T02:28:05Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: forrestIsRunning
Once this PR has been reviewed and has the lgtm label, please assign dprotaso for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

knative-prow · 2026-01-10T02:28:06Z

Welcome @forrestIsRunning! It looks like this is your first PR to knative/serving 🎉

knative-prow · 2026-01-10T02:28:07Z

Hi @forrestIsRunning. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

dprotaso · 2026-01-21T18:47:25Z

/ok-to-test

codecov · 2026-01-21T18:53:59Z

Codecov Report

❌ Patch coverage is 77.77778% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.24%. Comparing base (09e59ec) to head (8e630e1).
⚠️ Report is 27 commits behind head on main.

Files with missing lines	Patch %	Lines
pkg/autoscaler/statforwarder/forwarder.go	77.77%	3 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #16330      +/-   ##
==========================================
+ Coverage   80.11%   80.24%   +0.12%     
==========================================
  Files         215      216       +1     
  Lines       13391    13443      +52     
==========================================
+ Hits        10728    10787      +59     
+ Misses       2304     2293      -11     
- Partials      359      363       +4

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

dprotaso

Add rateLimiter field to Forwarder to track retry state per stat

The original retry mechanism has an individual delay per each stat.

Using the workqueue with the key being the revision means that all retried stats of the same revision will effect each other - specifically 5 delayed stats from the same revision will mean the 6th retied stat will be have to wait 500ms.

I think we can drop the workqueue as part of the implementation.

I wonder if it's just simpler to use the s.retry count and adjust the sleep based on that value (without the queue).

Let me know what you think and whether I interpreted your change incorrectly

dprotaso · 2026-01-21T18:59:05Z

Also note to contribute to the project you will need to sign the CLA (EasyCLA) which is a LinuxFoundation thing

dprotaso · 2026-01-21T18:59:12Z

/hold

knative-prow bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 10, 2026

knative-prow bot requested review from dprotaso and skonto January 10, 2026 02:28

knative-prow bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 21, 2026

dprotaso reviewed Jan 21, 2026

View reviewed changes

knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 21, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: implement rate limiting retry mechanism for stat forwarder #16330

feat: implement rate limiting retry mechanism for stat forwarder #16330

Uh oh!

forrestIsRunning commented Jan 10, 2026

Uh oh!

linux-foundation-easycla bot commented Jan 10, 2026 •

edited

Loading

Uh oh!

knative-prow bot commented Jan 10, 2026

Uh oh!

knative-prow bot commented Jan 10, 2026

Uh oh!

knative-prow bot commented Jan 10, 2026

Uh oh!

dprotaso commented Jan 21, 2026

Uh oh!

codecov bot commented Jan 21, 2026

Uh oh!

dprotaso left a comment

Uh oh!

dprotaso commented Jan 21, 2026

Uh oh!

dprotaso commented Jan 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

feat: implement rate limiting retry mechanism for stat forwarder #16330

Are you sure you want to change the base?

feat: implement rate limiting retry mechanism for stat forwarder #16330

Uh oh!

Conversation

forrestIsRunning commented Jan 10, 2026

Proposed Changes

Uh oh!

linux-foundation-easycla bot commented Jan 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

knative-prow bot commented Jan 10, 2026

Uh oh!

knative-prow bot commented Jan 10, 2026

Uh oh!

knative-prow bot commented Jan 10, 2026

Uh oh!

dprotaso commented Jan 21, 2026

Uh oh!

codecov bot commented Jan 21, 2026

Codecov Report

Uh oh!

dprotaso left a comment

Choose a reason for hiding this comment

Uh oh!

dprotaso commented Jan 21, 2026

Uh oh!

dprotaso commented Jan 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

linux-foundation-easycla bot commented Jan 10, 2026 •

edited

Loading