Add HTTP retry logic with exponential backoff for network resiliency against Zscaler and proxy interference #449
Co-authored-by: deacon-mp <61169193+deacon-mp@users.noreply.github.com>
Pull Request Overview
This PR implements comprehensive HTTP retry logic with exponential backoff to resolve network resiliency issues with the sandcat agent when running in environments with Zscaler or similar network interference tools.
Key changes include:
- Addition of retry configuration with a maximum of 3 retries (4 attempts in total) and exponential backoff (2s → 4s → 8s with jitter)
- Smart error classification to distinguish between retryable and non-retryable errors
- Implementation of retry logic across all HTTP operations (beacon, payload downloads, file uploads, execution results)
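The error classification mentioned above could look roughly like the sketch below. It is not the code from gocat/contact/api.go: the function names `isRetryableStatus` and `isRetryableError`, and the exact set of status codes treated as retryable, are assumptions made for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
)

// isRetryableStatus reports whether an HTTP status code suggests a transient
// failure. The chosen set (408, 429, 500, 502, 503, 504) is an assumption.
func isRetryableStatus(code int) bool {
	switch code {
	case http.StatusRequestTimeout,
		http.StatusTooManyRequests,
		http.StatusInternalServerError,
		http.StatusBadGateway,
		http.StatusServiceUnavailable,
		http.StatusGatewayTimeout:
		return true
	}
	return false
}

// isRetryableError reports whether a transport-level error is worth retrying.
func isRetryableError(err error) bool {
	if err == nil {
		return false
	}
	// A cancellation requested by the caller should not be retried.
	if errors.Is(err, context.Canceled) {
		return false
	}
	// In this sketch, any error implementing net.Error (for example
	// *net.OpError for resets and refused connections) counts as transient.
	var netErr net.Error
	return errors.As(err, &netErr)
}

func main() {
	fmt.Println(isRetryableStatus(503)) // true
	fmt.Println(isRetryableStatus(404)) // false

	reset := &net.OpError{Op: "read", Net: "tcp", Err: errors.New("connection reset")}
	fmt.Println(isRetryableError(reset)) // true
}
```

A real classifier would likely be stricter about which transport errors it retries; treating every `net.Error` as transient keeps the example short.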
Reviewed Changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| gocat/contact/api.go | Core implementation of HTTP retry logic with exponential backoff, error classification functions, and retry loops for all HTTP operations |
| gocat/contact/contact_test.go | Test cases for retry functionality including status code classification, error handling, and delay calculation |
    requestBody := bytes.Buffer{}
    contentType, err := createUploadForm(&requestBody, data, uploadName)
    if err != nil {
        return err
Copilot AI, Oct 6, 2025
This line was changed from `return nil` to `return err`. The function signature returns an error, so propagating `err` is consistent with it. However, the original `return nil` suggests the function may have been intended to return a nil error here rather than the error itself.
        delay = maxRetryDelay
    }
    // Add jitter to prevent thundering herd
    jitter := time.Duration(rand.Intn(1000)) * time.Millisecond
Copilot AI, Oct 6, 2025
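Before Go 1.20, the top-level math/rand functions use a fixed default seed unless rand.Seed is called, so every agent instance produces the same jitter sequence. On those versions, consider seeding math/rand with rand.Seed(time.Now().UnixNano()), or using crypto/rand, so that jitter patterns differ across agent instances. From Go 1.20 onward the global generator is seeded randomly at startup and rand.Seed is deprecated.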
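One way to address this comment regardless of Go version is a dedicated generator seeded once per process. The sketch below is illustrative only: `maxRetryDelay` and the 0 to 999 ms jitter come from the diff excerpt, while `baseRetryDelay`, the 30-second cap, `jitterSource`, `backoffDelay`, and `withJitter` are assumed names and values.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const (
	baseRetryDelay = 2 * time.Second  // first delay in the 2s -> 4s -> 8s schedule
	maxRetryDelay  = 30 * time.Second // cap value is an assumption
)

// jitterSource is seeded from the clock once per process, so separate agent
// instances draw different jitter values. A *rand.Rand is not safe for
// concurrent use; guard it with a mutex if several goroutines share it.
var jitterSource = rand.New(rand.NewSource(time.Now().UnixNano()))

// backoffDelay returns the delay before the given 1-based retry:
// 2s, 4s, 8s, ... capped at maxRetryDelay.
func backoffDelay(retry int) time.Duration {
	delay := baseRetryDelay
	for i := 1; i < retry; i++ {
		delay *= 2
		if delay >= maxRetryDelay {
			return maxRetryDelay
		}
	}
	return delay
}

// withJitter adds between 0 and 999 ms of random jitter to delay.
func withJitter(delay time.Duration, rng *rand.Rand) time.Duration {
	return delay + time.Duration(rng.Intn(1000))*time.Millisecond
}

func main() {
	for retry := 1; retry <= 3; retry++ {
		base := backoffDelay(retry)
		fmt.Printf("retry %d: base %v, with jitter %v\n", retry, base, withJitter(base, jitterSource))
	}
}
```

Doubling in a loop rather than shifting avoids integer overflow for large retry counts, at the cost of a few extra iterations.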


Problem
When the sandcat agent runs on devices with Zscaler ZIA or similar network interference tools, occasional network errors cause operations to freeze indefinitely without retry or failure reporting. This manifests as:
- Failed to decode HTTP response: illegal base64 data at input byte 0
- Failed to perform HTTP request: Post "https://xxxxx/beacon": read tcp 10.XX.XXX.XXX:58794->XXX.XXX.XX.X:443: wsarecv: A connection attempt failed...
Solution
This PR implements comprehensive HTTP retry logic with exponential backoff to handle temporary network failures gracefully.
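The general shape of such a retry loop can be sketched as follows. This is a simplified illustration rather than the PR's implementation: the names `retry`, `maxRetries`, and `errTransient` are hypothetical, jitter is omitted, and the demo uses millisecond delays so it finishes quickly. The log format mirrors the messages quoted in this description.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// maxRetries is the number of retries after the initial attempt,
// giving 4 attempts in total.
const maxRetries = 3

// errTransient stands in for a temporary network failure.
var errTransient = errors.New("connection refused")

// retry calls op until it succeeds or all attempts are used, doubling the
// delay after each failure. It returns the number of attempts made.
func retry(op func() error, baseDelay time.Duration) (int, error) {
	var err error
	delay := baseDelay
	for attempt := 1; attempt <= maxRetries+1; attempt++ {
		if err = op(); err == nil {
			if attempt > 1 {
				fmt.Printf("[+] HTTP request succeeded on attempt %d\n", attempt)
			}
			return attempt, nil
		}
		if attempt > maxRetries {
			break // no attempts left, so do not sleep again
		}
		fmt.Printf("[!] HTTP request failed (attempt %d/%d): %v. Retrying in %v\n",
			attempt, maxRetries+1, err, delay)
		time.Sleep(delay)
		delay *= 2
	}
	return maxRetries + 1, fmt.Errorf("giving up after %d attempts: %w", maxRetries+1, err)
}

func main() {
	calls := 0
	flaky := func() error {
		calls++
		if calls < 3 {
			return errTransient // fail the first two calls
		}
		return nil
	}
	attempts, err := retry(flaky, 10*time.Millisecond)
	fmt.Println(attempts, err) // 3 <nil>
}
```

Returning a wrapped error after the final attempt is what keeps the operation from hanging silently: the caller always gets either a success or a failure it can report.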
Key Changes
1. Retry Configuration
2. Smart Error Classification
3. Comprehensive Coverage
All HTTP operations now include retry logic:
- Beacon (GetBeaconBytes)
- Payload downloads (GetPayloadBytes)
- File uploads (UploadFileBytes)
- Execution results (SendExecutionResults)
4. Enhanced Logging
[!] HTTP request failed (attempt 1/4): connection refused. Retrying in 2.322s
[+] HTTP request succeeded on attempt 3
Testing
The implementation was tested against a mock server that simulates network failures.
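A mock server of this kind can be built with the standard net/http/httptest package. The sketch below is an assumption about how such a test might be set up, not the PR's actual test code: `newFlakyServer` and `statusCodes` are hypothetical helpers, and the failure is simulated with a 503 response rather than a dropped connection.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// newFlakyServer returns a test server that answers the first `failures`
// requests with 503 Service Unavailable and all later requests with 200 OK.
func newFlakyServer(failures int32) *httptest.Server {
	var seen int32
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&seen, 1) <= failures {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

// statusCodes issues n sequential GET requests and records each status code.
func statusCodes(url string, n int) ([]int, error) {
	codes := make([]int, 0, n)
	for i := 0; i < n; i++ {
		resp, err := http.Get(url)
		if err != nil {
			return codes, err
		}
		resp.Body.Close()
		codes = append(codes, resp.StatusCode)
	}
	return codes, nil
}

func main() {
	srv := newFlakyServer(2)
	defer srv.Close()

	codes, err := statusCodes(srv.URL, 3)
	fmt.Println(codes, err) // [503 503 200] <nil>
}
```

Pointing the agent's retry logic at `srv.URL` would then let a test assert that a request succeeds on the third attempt.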
Impact
This resolves the core issue where network instability would cause sandcat operations to freeze indefinitely, improving overall agent reliability in enterprise environments with network security tools.
Warning
Firewall rules blocked me from connecting to one or more addresses.
I tried to connect to the following address, but was blocked by firewall rules:
- 224.0.0.251