-
Notifications
You must be signed in to change notification settings - Fork 0
fix: preserve authentication headers on same-domain redirects #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Implements manual HTTP redirect handling to fix issue #3 where the Genesis API returns home page content instead of data due to auth header loss during automatic redirects. Changes: - Disable httplib's automatic redirect following - Implement manual redirect loop in SendRequest() that: - Preserves auth headers for same-domain redirects - Strips auth headers for cross-domain redirects (security) - Handles 303 See Other by changing method to GET - Respects max redirects limit (default: 10) - Fix relative URL resolution for paths starting with / - Add IsSameDomain() method to HttpUrl class - Add IsRedirectStatus() helper function Closes #3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the final PR Bugbot will review for you during this billing cycle
Your free Bugbot reviews will reset on January 15
Details
You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.
To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
|
|
||
| bool HttpUrl::IsSameDomain(const HttpUrl& other) const { | ||
| return ToLower(host) == ToLower(other.host); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Auth headers leak on redirects to different ports
The IsSameDomain function only compares hostnames, ignoring the port. When a redirect goes from one port to a different port on the same host (e.g., from api.example.com:443 to api.example.com:8080), the Authorization header is preserved. Different ports can host entirely different services with separate access controls, so credentials intended for one service could be leaked to an unrelated service on a different port. The existing HttpUrl::Equals method correctly considers port for URL equality, suggesting this was an oversight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements manual redirect handling to fix authentication issues with the GENESIS Online API. The change addresses cases where automatic redirects were stripping Authorization headers, causing authenticated requests to fail after redirects.
Key changes:
- Added manual redirect handling that preserves auth headers for same-domain redirects while stripping them for cross-domain redirects
- Disabled httplib's automatic redirect following to enable custom redirect logic
- Fixed relative URL resolution to treat paths starting with
/as absolute from server root
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/include/http_client.hpp | Added IsSameDomain() method, max_redirects parameter, and IsRedirectStatus() helper for redirect handling |
| src/http_client.cpp | Implemented manual redirect loop with auth preservation logic, disabled auto-redirects, and changed URL resolution for absolute paths |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/http_client.cpp
Outdated
| request.headers.erase("Authorization"); | ||
| ERPL_TRACE_DEBUG("HTTP_CLIENT", "Cross-domain redirect - removing Authorization header"); |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When stripping authentication on cross-domain redirects, only the Authorization header is removed. However, there are other security-sensitive headers that should also be stripped, such as:
Cookie/Set-Cookie(may contain session tokens)Proxy-Authorization(proxy credentials)WWW-Authenticate(authentication challenges)- Custom auth headers (e.g.,
X-API-Key,X-Auth-Token)
Consider reviewing and expanding the list of headers to strip on cross-domain redirects for comprehensive security.
| request.headers.erase("Authorization"); | |
| ERPL_TRACE_DEBUG("HTTP_CLIENT", "Cross-domain redirect - removing Authorization header"); | |
| // Remove standard authentication-related headers (case-insensitive) | |
| static const std::vector<std::string> sensitive_headers = { | |
| "authorization", | |
| "proxy-authorization", | |
| "cookie", | |
| "set-cookie", | |
| "www-authenticate" | |
| }; | |
| std::vector<std::string> headers_to_erase; | |
| headers_to_erase.reserve(request.headers.size()); | |
| for (const auto &header_kv : request.headers) { | |
| std::string header_name_lower = header_kv.first; | |
| std::transform(header_name_lower.begin(), header_name_lower.end(), | |
| header_name_lower.begin(), | |
| [](unsigned char c) { return static_cast<char>(std::tolower(c)); }); | |
| bool is_sensitive = std::find(sensitive_headers.begin(), sensitive_headers.end(), | |
| header_name_lower) != sensitive_headers.end(); | |
| // Also treat common custom auth headers as sensitive | |
| if (!is_sensitive) { | |
| if (header_name_lower.rfind("x-api-key", 0) == 0 || | |
| header_name_lower.rfind("x-auth-token", 0) == 0 || | |
| header_name_lower.find("authorization") != std::string::npos || | |
| header_name_lower.find("authenticate") != std::string::npos || | |
| header_name_lower.find("token") != std::string::npos) { | |
| is_sensitive = true; | |
| } | |
| } | |
| if (is_sensitive) { | |
| headers_to_erase.push_back(header_kv.first); | |
| } | |
| } | |
| for (const auto &name : headers_to_erase) { | |
| request.headers.erase(name); | |
| } | |
| ERPL_TRACE_DEBUG("HTTP_CLIENT", | |
| "Cross-domain redirect - removed authentication-related headers"); |
src/http_client.cpp
Outdated
|
|
||
| bool HttpUrl::IsSameDomain(const HttpUrl& other) const { |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IsSameDomain() implementation only compares the host, but should also compare the scheme and port for proper security. Consider this scenario: a redirect from https://example.com:443 to http://example.com:80 would be considered "same domain" and preserve auth headers, but this is a cross-protocol redirect that could expose credentials over an unencrypted connection.
Similarly, redirects between different ports on the same host (e.g., https://example.com:443 to https://example.com:8080) might represent different services and should not automatically preserve auth headers.
Update the implementation to compare scheme and port as well.
| bool HttpUrl::IsSameDomain(const HttpUrl& other) const { | |
| // Normalize the port, mapping empty ports to scheme defaults where appropriate. | |
| // This allows "http://example.com" and "http://example.com:80" to be treated as the same, | |
| // while still distinguishing different explicit ports and different schemes. | |
| static std::string NormalizePort(const std::string &scheme, const std::string &port) { | |
| if (!port.empty()) { | |
| return port; | |
| } | |
| auto lower_scheme = ToLower(scheme); | |
| if (lower_scheme == "http") { | |
| return "80"; | |
| } | |
| if (lower_scheme == "https") { | |
| return "443"; | |
| } | |
| return port; | |
| } | |
| bool HttpUrl::IsSameDomain(const HttpUrl& other) const { | |
| auto this_scheme = ToLower(scheme); | |
| auto other_scheme = ToLower(other.scheme); | |
| if (this_scheme != other_scheme) { | |
| return false; | |
| } | |
| auto this_port = NormalizePort(this_scheme, port); | |
| auto other_port = NormalizePort(other_scheme, other.port); | |
| if (this_port != other_port) { | |
| return false; | |
| } |
| if (rel_path[0] == '/') { | ||
| // Path starting with / should be treated as relative to base URL's root | ||
| // Remove the leading / and merge with base path | ||
| std::string rel_path_without_slash = rel_path.substr(1); | ||
| auto merged_path = MergePaths(base_url.Path(), rel_path_without_slash); | ||
| merged_url.Path(merged_path.string()); | ||
| // Path starting with / is an absolute path from the root | ||
| merged_url.Path(rel_path); |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change in URL resolution behavior for paths starting with / appears to be incorrect and breaks existing test expectations.
Looking at the test at line 98-101 in test/cpp/test_http_client.cpp, when merging base URL https://services.odata.org/v4/northwind/northwind.svc/Customers with relative URL /Customers?$skiptoken='ERNSH', the expected result is https://services.odata.org/v4/northwind/northwind.svc/Customers?$skiptoken='ERNSH'.
However, with this new implementation, a path starting with / is treated as an absolute path from the server root, which would produce https://services.odata.org/Customers?$skiptoken='ERNSH' instead. This is actually the correct behavior according to RFC 3986, but it contradicts the existing test expectations and may break existing functionality.
The PR description mentions "Fixes relative URL resolution for paths starting with /", but the test suite expects the old behavior. Either the test needs to be updated to reflect the correct RFC 3986 behavior, or this code change needs to be reverted.
src/http_client.cpp
Outdated
| // Handle 303 See Other: change method to GET (per HTTP spec) | ||
| if (status == 303 && request.method != HttpMethod::GET && request.method != HttpMethod::HEAD) { | ||
| request.method = HttpMethod::GET; | ||
| request.content.clear(); | ||
| request.content_type.clear(); | ||
| ERPL_TRACE_DEBUG("HTTP_CLIENT", "303 redirect - changing method to GET"); | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The HTTP 301 and 302 redirect handling is incomplete. According to the HTTP specification:
- 301 (Moved Permanently) and 302 (Found): Historically, browsers change POST/PUT/PATCH to GET, but RFC 7231 says the method should be preserved. Most clients follow the historical behavior.
- 307 (Temporary Redirect) and 308 (Permanent Redirect): The method and body MUST be preserved.
Currently, the code:
- Handles 303 correctly (changes to GET)
- Doesn't explicitly handle 301/302 method changes
- Doesn't preserve request body for 307/308
For 301/302 redirects with POST/PUT/PATCH methods, the code should either:
- Change the method to GET and clear the body (matching browser behavior), or
- Document that it preserves the method (deviating from common practice)
For 307/308 redirects, the request body must be preserved even when not changing the method.
| redirect_count++; | ||
| n_tries = 0; // Reset retry counter for the new redirect request | ||
| continue; // Follow the redirect | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When redirect_count reaches max_redirects, the code falls through to the retry logic or returns the redirect response. This means the final redirect response (e.g., a 301 or 302 status) is returned to the caller instead of an error indicating that max redirects was exceeded.
Consider throwing a more informative error when max redirects is reached, or at minimum, add a debug log message before falling through to indicate that the redirect limit was hit.
| } | |
| } | |
| } else if (IsRedirectStatus(status) && redirect_count >= http_params.max_redirects) { | |
| // Redirect response received, but we have reached the maximum number of redirects | |
| ERPL_TRACE_DEBUG( | |
| "HTTP_CLIENT", | |
| "Max redirects reached (" + std::to_string(http_params.max_redirects) + | |
| "), not following further redirects"); |
| // Check for redirect status codes and handle manually to preserve auth headers | ||
| if (IsRedirectStatus(status) && redirect_count < http_params.max_redirects) { | ||
| auto location_it = response.headers.find("Location"); | ||
| if (location_it != response.headers.end()) { | ||
| std::string redirect_url = location_it->second; | ||
|
|
||
| // Resolve relative URLs against the current request URL | ||
| HttpUrl new_url = HttpUrl::MergeWithBaseUrlIfRelative(request.url, redirect_url); | ||
|
|
||
| // Check if same domain - preserve auth headers only for same domain | ||
| bool same_domain = request.url.IsSameDomain(new_url); | ||
|
|
||
| ERPL_TRACE_DEBUG("HTTP_CLIENT", "Following redirect " + std::to_string(redirect_count + 1) + | ||
| " to: " + new_url.ToString() + (same_domain ? " (same domain)" : " (cross domain)")); | ||
|
|
||
| // Update request URL | ||
| request.url = new_url; | ||
|
|
||
| // If cross-domain, clear auth headers for security | ||
| if (!same_domain) { | ||
| request.headers.erase("Authorization"); | ||
| ERPL_TRACE_DEBUG("HTTP_CLIENT", "Cross-domain redirect - removing Authorization header"); | ||
| } | ||
|
|
||
| // Handle 303 See Other: change method to GET (per HTTP spec) | ||
| if (status == 303 && request.method != HttpMethod::GET && request.method != HttpMethod::HEAD) { | ||
| request.method = HttpMethod::GET; | ||
| request.content.clear(); | ||
| request.content_type.clear(); | ||
| ERPL_TRACE_DEBUG("HTTP_CLIENT", "303 redirect - changing method to GET"); | ||
| } | ||
|
|
||
| redirect_count++; | ||
| n_tries = 0; // Reset retry counter for the new redirect request | ||
| continue; // Follow the redirect | ||
| } | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The redirect handling logic introduced in this PR lacks test coverage. While the PR description mentions manual testing with httpbin.org endpoints, there are no automated tests for:
- Redirect status codes (301, 302, 303, 307, 308)
- Same-domain vs cross-domain redirect behavior
- Authorization header preservation/stripping
- HTTP method changes on 303 redirects
- Max redirect limit enforcement
Consider adding comprehensive test cases to prevent regressions, especially for the security-critical auth header handling.
| } | ||
|
|
||
| redirect_count++; | ||
| n_tries = 0; // Reset retry counter for the new redirect request |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When a redirect is followed, the retry counter n_tries is reset to 0. However, this means that if a server keeps responding with retryable errors (408, 429, 503, 504) after redirects, the total number of retries could exceed http_params.retries * (redirect_count + 1).
For example, with 10 max redirects and 3 retries, a pathological server could cause up to 33 retry attempts (3 retries × 11 requests).
Consider whether this is the intended behavior, or if there should be a global retry limit that persists across redirects.
| n_tries = 0; // Reset retry counter for the new redirect request |
- Rename IsSameDomain() to IsSameOrigin() for clarity - Check scheme, host, AND port for same-origin detection (prevents HTTPS→HTTP downgrade and port-based credential leakage) - Strip additional sensitive headers on cross-origin redirects: Authorization, Cookie, Proxy-Authorization, X-API-Key, X-Auth-Token - Handle 301/302 redirects same as 303 (change method to GET per HTTP spec) - Add warning log when max redirects limit is reached Addresses reviewer feedback on PR #4.
| // Remove the leading / and merge with base path | ||
| std::string rel_path_without_slash = rel_path.substr(1); | ||
| auto merged_path = MergePaths(base_url.Path(), rel_path_without_slash); | ||
| merged_url.Path(merged_path.string()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Query strings incorrectly preserved on redirects with absolute paths
The redirect handling uses MergeWithBaseUrlIfRelative which incorrectly inherits query parameters from the original URL when the redirect target has a path but no query. Per RFC 3986, when a redirect has a path (e.g., /v2/data), the query should come from the redirect URL (empty in this case), not inherited from the base. This causes requests like https://api.example.com/v1?token=secret redirecting to /v2 to incorrectly become https://api.example.com/v2?token=secret, potentially leaking sensitive query parameters to unintended endpoints. Before this PR, httplib handled redirects correctly; now the manual handling has this regression.
Additional Locations (1)
| auto merged_path = MergePaths(base_url.Path(), rel_path_without_slash); | ||
| merged_url.Path(merged_path.string()); | ||
| // Path starting with / is an absolute path from the root | ||
| merged_url.Path(rel_path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Protocol-relative redirect URLs resolve to wrong host
Protocol-relative URLs (like //other.com/path) in redirect Location headers are incorrectly handled. The check for absolute URLs looks for ://, but protocol-relative URLs lack this pattern. When a redirect returns Location: //other.com/path, the code treats it as a path starting with /, setting the path to //other.com/path while keeping the original host. This produces https://original.com//other.com/path instead of https://other.com/path, causing the request to go to the wrong server entirely. Before this PR, httplib's automatic redirect handling would have resolved these correctly.
Additional Locations (1)
Root cause of GENESIS API 400 errors: Content-Type was being sent twice - once from the headers map and once from the content_type parameter to httplib. Changes: - Modified HeadersFromMapArg to extract Content-Type for use as parameter but not add it to headers map (prevents duplication) - Removed redundant Content-Type header addition in web_functions.cpp - Added wire-level debugging for HTTP requests (enabled at DEBUG trace level) Fixes #3
httpbin.org sometimes returns 'text/html' and sometimes 'text/html; charset=utf-8'. Use split_part to extract just the base MIME type for stable assertions.
Summary
Fixes #3 - GENESIS Online data loading issue where GET requests return home page HTML instead of expected API data.
Root cause: httplib's automatic redirect following (
set_follow_location(true)) stripsAuthorizationheaders on redirects for security, causing authenticated requests to lose credentials after redirects.Solution: Implement manual redirect handling that:
/Changes
src/include/http_client.hpp: Addmax_redirectstoHttpParams, addIsRedirectStatus()helper, addIsSameDomain()toHttpUrlsrc/http_client.cpp: Disable auto-redirects, implement manual redirect loop inSendRequest(), fix relative URL handlingTest Plan
GEN=ninja make debugmake test_debug(221 assertions, 0 failures)SELECT status, url FROM http_get('https://httpbin.org/redirect/5');→ returns 200 with final URLSELECT content FROM http_get('https://httpbin.org/redirect-to?url=%2Fheaders', auth := 'token', auth_type := 'BEARER');→ showsAuthorization: Bearer tokenin headersBehavior Summary
Note
Implements secure, manual redirect handling and fixes URL merging.
HttpClient::SendRequestinstead of httplib auto-follow;set_follow_location(false)HttpUrl::IsSameOrigin; strip on cross-origin; resolveLocationrelative URLs (including leading/paths) correctlyHttpParams::max_redirects(default 10) andIsRedirectStatushelperWritten by Cursor Bugbot for commit 0a85c75. This will update automatically on new commits. Configure here.