Fix httpx proxy usage in sync document downloads#1979
Fix httpx proxy usage in sync document downloads#1979saleweaver wants to merge 2 commits intomasterfrom
Conversation
Reviewer's guide (collapsed on small PRs)Reviewer's GuideUpdates synchronous document download code paths to use the correct Sequence diagram for synchronous document download using httpx.Client with proxysequenceDiagram
actor Caller
participant DataKioskApi
participant HttpxClient
participant DocumentServer
Caller->>DataKioskApi: get_document(download, file, decrypt, kwargs)
DataKioskApi->>DataKioskApi: Evaluate download/file/decrypt flags
alt download or file or decrypt
DataKioskApi->>HttpxClient: Instantiate Client(proxy=self.proxies, verify=self.verify)
HttpxClient->>DocumentServer: GET documentUrl
DocumentServer-->>HttpxClient: document_response
HttpxClient-->>DataKioskApi: document_response
DataKioskApi-->>Caller: Return downloaded/decrypted document
else no download and no decrypt
DataKioskApi-->>Caller: Return metadata only
end
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
@codex review |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- Confirm that
self.proxiesalways holds a single proxy value rather than a mapping, sincehttpx.Client(proxy=...)expects a single proxy configuration whileproxies=is used for per-scheme mappings. - Consider centralizing
httpx.Clientconstruction (proxy/verify/timeout) in a shared helper to avoid similar parameter mismatches across different sync download paths in the future.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Confirm that `self.proxies` always holds a single proxy value rather than a mapping, since `httpx.Client(proxy=...)` expects a single proxy configuration while `proxies=` is used for per-scheme mappings.
- Consider centralizing `httpx.Client` construction (proxy/verify/timeout) in a shared helper to avoid similar parameter mismatches across different sync download paths in the future.
## Individual Comments
### Comment 1
<location> `sp_api/api/data_kiosk/data_kiosk.py:200` </location>
<code_context>
if download or file or ("decrypt" in kwargs and kwargs["decrypt"]):
with httpx.Client(
- proxies=self.proxies,
+ proxy=self.proxies,
verify=self.verify,
) as client:
</code_context>
<issue_to_address>
**issue (bug_risk):** Using `proxy` instead of `proxies` on `httpx.Client` is likely incorrect and will silently ignore the proxy config.
`httpx.Client` only supports the `proxies` keyword; `proxy` will be ignored. This can break expected proxy routing and related security assumptions. Unless you’re instantiating a different client that explicitly accepts `proxy`, this should stay as `proxies=self.proxies`.
</issue_to_address>
### Comment 2
<location> `sp_api/api/reports/reports.py:370` </location>
<code_context>
if download or file or ("decrypt" in kwargs and kwargs["decrypt"]):
with httpx.Client(
- proxies=self.proxies,
+ proxy=self.proxies,
verify=self.verify,
) as client:
</code_context>
<issue_to_address>
**issue (bug_risk):** Same `proxy` vs `proxies` issue on `httpx.Client`, likely disabling proxy usage for report downloads.
`httpx.Client` expects `proxies`, not `proxy`, so this change will prevent the configured proxies from being used for report downloads. Unless there’s a custom wrapper that requires `proxy`, this should remain `proxies=self.proxies` to preserve existing behavior.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| if download or file or ("decrypt" in kwargs and kwargs["decrypt"]): | ||
| with httpx.Client( | ||
| proxies=self.proxies, | ||
| proxy=self.proxies, |
There was a problem hiding this comment.
issue (bug_risk): Using proxy instead of proxies on httpx.Client is likely incorrect and will silently ignore the proxy config.
httpx.Client only supports the proxies keyword; proxy will be ignored. This can break expected proxy routing and related security assumptions. Unless you’re instantiating a different client that explicitly accepts proxy, this should stay as proxies=self.proxies.
| compression_algorithm = res.payload.get("compressionAlgorithm") | ||
| with httpx.Client( | ||
| proxies=self.proxies, | ||
| proxy=self.proxies, |
There was a problem hiding this comment.
issue (bug_risk): Same proxy vs proxies issue on httpx.Client, likely disabling proxy usage for report downloads.
httpx.Client expects proxies, not proxy, so this change will prevent the configured proxies from being used for report downloads. Unless there’s a custom wrapper that requires proxy, this should remain proxies=self.proxies to preserve existing behavior.
|
Codex Review: Didn't find any major issues. Keep it up! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Motivation
httpx.Clientwith the unsupportedproxieskeyword in synchronous document download paths after the migration tohttpx.Description
proxies=self.proxieswithproxy=self.proxiesinsp_api/api/reports/reports.pyandsp_api/api/data_kiosk/data_kiosk.pyso synchronous downloads usehttpx's correct parameter name.Testing
Codex Task
Summary by Sourcery
Bug Fixes: